                          How to Use CNVTXTWP Version 1.1






CNVTXTWP 1.1 (C) Copyright 1993, Jack Guyant.

This program converts a DOS ASCII text file into a WordPerfect 5.1 for DOS document. The
use of program options (described herein) determines the nature of the conversion.

The program is freeware, which means that it may be used for any purpose (private or
commercial) without cost, except for resale. This does not preclude its use on a consulting or
service basis for which you are paid.

The program strictly follows the definition of a minimal WordPerfect document, and it has been
extensively tested on a variety of files. However:

NO LIABILITY OF ANY NATURE IS ASSUMED FOR ANY CLAIM WHICH MIGHT
ARISE FROM THE USE OF THIS PRODUCT.

Please inform this writer as to problems, bugs, and suggestions:

Jack Guyant
CompuServe User ID: 70372,3176

November 21, 1993



WordPerfect is a registered trademark of WordPerfect Corporation.





                                 Table of Contents

OVERVIEW
      What this Program Does
      What this Program Does Not Do
      A Reasonable Request

NEW IN VERSION 1.1

FIRST LOOK
      Execution
      Taking the Default Options
      Indenting the First Line of a Paragraph

CASE STUDIES
      Handling Hyphenation and Other Characters
      Full Paragraph Indentation
      Compress Embedded Spaces to One Space
      Compress Embedded Spaces to One [TAB]
      ASCII Box Characters into Line Draw
      Do You Really Want a Hard Page?

TRAILING SPACES

LAST LOOK

ABOUT ASCII TEXT FILES

FUTURES


OVERVIEW

What this Program Does

CNVTXTWP is a DOS program which converts an ASCII text file into a WordPerfect 5.1
document according to options set by the user. It is suitable for processing text files which have
extensive paragraph text, such as documentation, engineering specifications, and lengthy
correspondence.

Its features are (hidden codes shown in brackets):

         Remove CR/LF (end of line) in anticipation that WordPerfect will insert [SRt] (soft
          return) codes as it sees fit. Certain options (to be described) override this action, and
          hard returns [HRt] are always inserted to separate paragraphs (two or more CR/LF
          pairs in succession).

         Remove all leading spaces, or, if desired, remove leading spaces such that a first
          sentence space-indented paragraph becomes [TAB] indented, or a paragraph
          completely set off by spaces receives a first-line [Indent] (and other leading spaces
          are removed).

         If desired, replace multiple embedded spaces either with one space or a [TAB] (the
          former being an advantage for a proportionally spaced font and the latter being
          suitable for columns).

         If desired, insert a hard return [HRt] after certain characters, such as = or -. This
          maintains the intent of using certain text characters as dividing lines and is
          particularly helpful in tracking down end-of-line hyphenation in the original text file.

This program retains any tab codes (each text file tab becomes [TAB]).
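
For readers who think in code, the CR/LF handling in the first feature listed above can be
sketched in a few lines of Python. This is an illustration only, not the program's source. The
name join_lines is hypothetical, hidden codes are written as bracketed text, and two things are
assumed: a lone CR/LF is dropped with nothing put in its place (so word separation relies on a
space already present at the end of the line), and each CR/LF in a run of two or more becomes
one [HRt].

```python
import re

def join_lines(text):
    """Drop lone CR/LF pairs; turn each CR/LF in a run of two or more into [HRt]."""
    def replace_run(match):
        pairs = len(match.group(0)) // 2      # number of CR/LF pairs in this run
        # Assumption: one [HRt] per CR/LF once the run marks a paragraph break.
        return "[HRt]" * pairs if pairs >= 2 else ""
    return re.sub(r"(?:\r\n)+", replace_run, text)
```

For instance, join_lines("one \r\ntwo\r\n\r\nthree") gives "one two[HRt][HRt]three".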

WordPerfect 5.1 for DOS does give you the choice of converting CR/LF to soft returns within
the hyphenation zone, and WordPerfect 5.2 for Windows provides for converting nearly all
CR/LF pairs to soft returns. There are commercial and shareware programs which offer extensive
conversion facilities.

This program attempts to fill whatever niche may yet exist in this field by offering the user a
variety of options, in order to meet particular needs. There is usually more to be done in the edit
screen before a document is completely converted, and this program will help you with that task.

What this Program Does Not Do

If the original text is largely intended to be single lines, then this program serves no useful
purpose. Indeed, you most likely want WordPerfect to convert each CR/LF to a hard return for
such files or at least that part of a file. (This applies most often to a Table of Contents or a List.)

This program is usually not suitable for processing the output of specialized text editors and word
processors (there are commercial and shareware programs for that purpose). It ignores formatting
characters inserted by a text editor (other than tabs, which it always converts, and page breaks,
if the user wants them included).

As described below (under Execution), the program immediately recognizes an input
WordPerfect Corporation file (and terminates), and it stops upon encountering a NULL (zero)
character (invariably contained within a binary file).

A Reasonable Request

The goal of this program is to make your job measurably easier. It most often cannot produce a
"camera ready" document, i.e., one that requires no further processing. However, it can reduce
by a major factor the number of keystrokes needed to complete the conversion.

Consequently, this program can be a real time-saver if you: 1) analyze the text file as to its
characteristics and what you want, and 2) learn how to apply this program's options and
combinations thereof.

Please use WordPerfect's Reveal Codes feature when you are editing the converted document.
There really is no way to effectively manipulate hidden codes without an "internal" view of the
document, and this is what the Reveal Codes window gives you.

There is also an easy way to track hard returns (which you may already know about). In
Setup,Display,Edit-Screen Options, there is Hard Return Display Character. Try using '<'
(without the quotes). Absolutely invaluable.


NEW IN VERSION 1.1

         There is now an option to remove trailing spaces. (A trailing space is located between
          the last text character -- other than a space -- on a line and the CR/LF terminating
          that line.) Please see section TRAILING SPACES for details.

         The program displays a Post Conversion Analysis if there are matters to be brought
          to your attention. These are described in section ABOUT ASCII TEXT FILES.

         The wording of the program's options has been changed slightly, e.g., "embedded"
          has replaced "nonleading" (with respect to spaces), and there is clarification of the
          distinction between a first-sentence space indented paragraph and a paragraph which
          is fully indented.FIRST LOOK

Execution

The accompanying README.TXT contains instructions and suggestions for installing
CNVTXTWP.EXE. Please be sure that you have read that file and installed the program.

Until noted otherwise, each of the following command line entries results in an error message,
and the program immediately terminates.

At the DOS command line (the prompt directory name is omitted):

          > REM Enter the program name only:
          > cnvtxtwp

          usage: cnvtxtwp <text_file> <WP_document>

          You will be shown default options, which
          you may change before proceeding.

The input text file is the first argument, and the output WordPerfect document file is the second
argument. (In the remaining command line examples, the usage message is omitted, whereas it
may otherwise appear on the screen.)

          > REM Enter one file name only:
          > cnvtxtwp example.txt
          Invalid number of arguments.

There must be two arguments, and the same error message appears if there are three or more
arguments.

          > REM Enter name for input file which does not exist:
          > cnvtxtwp maybe.txt maybe.doc
          Cannot open file "maybe.txt".

No such file. (Note that the same error message occurs if the input name is that of a directory
which does exist and should occur if you attempt to access a network directory which is
restricted.)

          > REM Enter identical file names:
          > cnvtxtwp same.thg SAME.THG
          Files may not have the same name.

Note that the program, like DOS itself, is not case-sensitive with respect to file names.
(File names such as same.thg and c:\tmp\same.thg are in fact different names, because
the paths are different.)
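
The identical-name test can be pictured as follows. This is a sketch, not the program's code: it
assumes that the two names are compared exactly as typed, without regard to case and without
resolving either one to a full path. The name same_file_name is hypothetical.

```python
def same_file_name(first, second):
    """Compare two file names as typed, ignoring case (assumed behavior)."""
    return first.upper() == second.upper()
```

Under that assumption, same.thg and SAME.THG match, while same.thg and c:\tmp\same.thg do not.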

          > REM Enter as the output file the name of a directory
          > REM which does exist:
          > cnvtxtwp example.txt c:\tmp
          Cannot open "c:\tmp" for output.

No writing to a directory name. (The same thing should happen if you attempt to access a
network directory which is restricted.)

          > REM The input file name is that of a WordPerfect
          > REM document:
          > cnvtxtwp wpcorp.doc example.doc
          File "wpcorp.doc" is a WordPerfect Corporation File!

This error message applies to any file from WordPerfect Corporation -- not just a document file. 

          > REM The input file is either empty or has only a few
          > REM characters:
          > cnvtxtwp short.txt short.doc
          File "short.txt" is either empty or to small to process.

It would serve no purpose to process an empty file. Also, the program first reads in a set number
of characters, in order to test for the existence of a block of data unique to a WordPerfect
Corporation file (see previous example). The same message can therefore occur when there are
only a few characters of data. You can, of course, Look at the file from within WordPerfect,
in order to see what's really there.
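
The order of these two checks can be sketched in Python. This manual does not say which block of
data the program tests for or how many characters it reads first, so both are assumptions here:
the sketch looks for the four-byte identifier that opens WordPerfect Corporation files (byte 255
followed by the letters WPC), and MIN_BYTES is a hypothetical figure. The name classify_input is
also hypothetical.

```python
WPC_SIGNATURE = b"\xffWPC"   # assumed test: identifier at the start of a WordPerfect file
MIN_BYTES = 16               # hypothetical; the real number is not given in this manual

def classify_input(data):
    """Classify the opening bytes of an input file."""
    if len(data) < MIN_BYTES:
        return "too small"                        # empty, or too few characters to test
    if data.startswith(WPC_SIGNATURE):
        return "WordPerfect Corporation file"
    return "text"
```

Because the size check comes first, a very short text file is reported as too small even though
it contains nothing suspicious.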

The following example is not a specification error, but it does require your attention.

          > REM The output file (intended WordPerfect document)
          > REM exists:
          > cnvtxtwp example.txt example.doc
          File "example.doc" already exists.
          Do you want to overwrite it? (Y/N): _

You type either Y or N (upper- or lower-case). If you answer Y, then processing continues. If
you answer N, then the program terminates and the intended output file remains intact.

      NOTE:     There are many opportunities (or requirements) to enter Y,
                N, or Q. The entries can be either upper- or lower-case.
                Also, typing a letter activates response. Do not press the
                Enter key except where so indicated (although there is no
                harm in doing so).

Taking the Default Options

In the following correct command line, example.txt is the input text file to be converted, and
example.doc is the intended WordPerfect document file, which does not already exist:

          > REM Begin a conversion:
          > cnvtxtwp example.txt example.doc

The following screen appears:

Input text file......................: example.txt
Output WordPerfect(R) document file..: example.doc

Current option settings for converting text file to WordPerfect 5.1
document:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: N
  Remove all leading spaces (not [Indent] paragraph) ^.......: Y
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : N
  Include hard page breaks ..................................: N

^, ^^ = These options are mutually exclusive within their groups.

Are options correct? (Y = Yes; N = No, reset them; Q = Quit program): 

We will cover all these options and some possible combinations later on.

Note that the option to remove trailing spaces is both the first option and has a default value of
N. See section TRAILING SPACES, in order to judge whether this option may be of help. (It
is not necessary for the following examples.)

Note that the third option refers simply to removing all leading spaces without any indentation.
(A leading space is between the left margin and the first character on that line other than a
space.) This is the only option which has a Y default setting.

      In the following illustrations of text passages to be converted, the use of '|' denotes the
      left margin.

Suppose the original text file has leading spaces (CR/LFs are shown, in order to demonstrate
their role):

|   This is a paragraph such that each line has one orCR/LF
|   more spaces between the left margin and the firstCR/LF
|   text character.CR/LF

When converting to the WordPerfect document, the original CR/LF line terminators are dropped
(except to delimit paragraphs). WordPerfect will insert soft returns according to its rules. If the
third option were not set to Y (other options being ignored), then every line in this example
would have embedded spaces after WordPerfect reformats the document:

   This is a paragraph such that each line has one or   more spaces
between the left margin and the first   text character.

Type Y to accept these options. A message will appear which indicates that conversion is taking
place. (A rotating odometer advances each time a CR/LF is encountered.) A message will then
appear to indicate that conversion is complete.

The resulting WordPerfect file contains the minimum information required to tell WordPerfect
that this is a bona-fide document. WordPerfect fills in such information as default printer, font,
margins, etc.

(If you retrieve a text file into WordPerfect, then things can actually get a little more
complicated. Assuming that most CR/LF pairs become hard returns, then as you begin to
rearrange text and especially as you insert [TAB]s, WordPerfect inserts more than its usual quota
of deletable soft returns [DSRt]. An interesting misnomer, because they're not deletable by you
and me. It can get rather messy. This writer knows. That's why he wrote this program.)

Your job's not over, but you just saved yourself a bundle of time (in a real-world situation):

|This is a paragraph such that each line has one or more
|spaces between the left margin and the first text character.

Indenting the First Line of a Paragraph

Suppose the text file has the following:

|     This is a paragraph such that the first sentence is
|space indented and the remaining sentences begin at the
|left margin.
|
|     This is the next paragraph of this example; as before,
|this sentence is indented with spaces.

Refer again to the options screen:

Input text file......................: example.txt
Output WordPerfect(R) document file..: example.doc

Current option settings for converting text file to WordPerfect 5.1
document:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: N
  Remove all leading spaces (not [Indent] paragraph) ^.......: Y
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : N
  Include hard page breaks ..................................: N

^, ^^ = These options are mutually exclusive within their groups.

Are options correct? (Y = Yes; N = No, reset them; Q = Quit program):

Note that the second through fourth options and the fifth and sixth options are mutually exclusive
within their groups. This means that setting an option of such a group to Y automatically sets
the remaining options or option (if any) of that group to N without making them visible to the
user.

In this example, we want to set the second option to Y, in order to replace the leading spaces
of a first-line paragraph indentation with one or more [TAB] codes. The rule is: One through
four spaces are replaced by one [TAB]. If, for example, there are six indenting spaces, then
there will be two [TAB]s.
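
The rule lends itself to a one-line calculation. The Python sketch below is an interpretation,
not the program's code: the text states only that one through four spaces give one [TAB] and
that six give two, and the sketch assumes that every further group of up to four spaces adds
one more. The names tabs_for_indent and first_line_to_tabs are hypothetical.

```python
def tabs_for_indent(spaces):
    """Number of [TAB] codes for a given count of indenting spaces."""
    return (spaces + 3) // 4          # round up to the next group of four

def first_line_to_tabs(line):
    """Replace the leading spaces of a paragraph's first line with [TAB] codes."""
    text = line.lstrip(" ")
    return "[TAB]" * tabs_for_indent(len(line) - len(text)) + text
```

With six leading spaces, first_line_to_tabs returns the text preceded by [TAB][TAB].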

Type N to indicate that you want to reset the options.

The following screen appears:

For each option, type any of Y, N, Q, or press Enter.

  Y     = option takes effect.
  N     = option does not take effect.
  Enter = keep current option setting.
  Q     = quit setting options, and display all options.
          (This does not terminate the program.)

If you answer Y to an option which belongs to a mutually
exclusive group (^ or ^^), then the following (if any)
options of that group are set to N and do not appear here.

When all options are displayed, you can (as before) type N,
in order to reset the option settings and return here.

Option settings follow:

The following option is currently set to: N
  Remove trailing spaces ....................................: _

The first of the options appears at the bottom of the screen. As you set each option, the contents
of the screen scroll upwards. Note in particular that typing the letter Q at this stage does not
terminate the program. It returns to the display of the current option settings (at which stage
typing Q does terminate the program).

In this example, we want to replace the leading spaces which indent the paragraph to [TAB] or
[TAB]s. However, the appropriate option is not yet on the screen.

Let's assume we know for a fact that the input file does not have trailing spaces. We see that the
first option is set by default to N, and we want to keep it that way.  Either type N or, more
easily, press the Enter key, in order to keep the current option setting. (If you're a little heavy
on the Enter key and therefore bypass an option which you wanted to reset, then type Q to return
to the main display. Then type N to return to setting the options, and start over.)

The next option appears (this time, we show only the lower part of the screen):

Option settings follow:

The following option is currently set to: N
  Remove trailing spaces ....................................: N

The following option is currently set to: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: _

(The program confirms that the first option is still set to N, whether you typed the letter or
pressed the Enter key.)

The second option is the one we want. Type Y.

The lower part of the screen now appears as follows:

Option settings follow:

The following option is currently set to: N
  Remove trailing spaces ....................................: N

The following option is currently set to: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: Y

The following option is currently set to: N
  Compress multiple embedded spaces to one space ^^..........: _

(Note that the remaining two options of the first group of mutually exclusive options did not
appear.)

For this example, we're not concerned with the remaining options. Type Q, and we go back to
the first screen, which displays the (now) current options:

Input text file......................: example.txt
Output WordPerfect(R) document file..: example.doc

Current option settings for converting text file to WordPerfect 5.1
document:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: Y
  Remove all leading spaces (not [Indent] paragraph) ^.......: N
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : N
  Include hard page breaks ..................................: N

^, ^^ = These options are mutually exclusive within their groups.

Are options correct? (Y = Yes; N = No, reset them; Q = Quit program):

As stated in the previous screen, you could at this stage type N, in order to return to resetting
options in case you change your mind (of course we don't make mistakes, but if...). However,
this is what we want. Type Y to launch the conversion.
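
The option-setting dialog just walked through can be modeled in Python. This is a sketch of the
behavior described in this section, not the program's code. The option names are hypothetical
labels for the eight options in screen order, the Enter key is written as an empty string, and it
is assumed that answering Y clears every other option of the same group.

```python
OPTIONS = ["trailing", "tab_first_line", "remove_leading", "indent_paragraph",
           "compress_to_space", "compress_to_tab", "hard_return", "page_breaks"]
GROUP_OF = {"tab_first_line": "^", "remove_leading": "^", "indent_paragraph": "^",
            "compress_to_space": "^^", "compress_to_tab": "^^"}
DEFAULTS = {name: name == "remove_leading" for name in OPTIONS}

def apply_keys(settings, keys):
    """Walk the options in screen order, using one key for each option shown."""
    result = dict(settings)
    pending = iter(keys)
    hidden = set()                    # options that will not be shown
    for name in OPTIONS:
        if name in hidden:
            continue
        key = next(pending, "Q").upper()
        if key == "Q":                # quit setting options (not the program)
            break
        if key == "Y":
            result[name] = True
            group = GROUP_OF.get(name)
            for other in OPTIONS:
                if group and other != name and GROUP_OF.get(other) == group:
                    result[other] = False
                    hidden.add(other)
        elif key == "N":
            result[name] = False
        # Any other key (Enter, written as "") keeps the current setting.
    return result
```

The keystrokes used in this walk-through are Enter, Y, Q, so apply_keys(DEFAULTS, ["", "Y", "Q"])
returns the settings shown on the screen above: the second option is Y and all the others are N.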

To return to our example, this is what we get within WordPerfect (or something like it) with
Reveal Codes on:

[TAB]This is a paragraph such that the first sentence is space[SRt]
indented and the remaining sentences begin at the left margin.[HRt]
[HRt]
[TAB]This is the next paragraph of this example; as before,[SRt]
this sentence is indented with spaces.[HRt]
[HRt]

The program inserted the [HRt] codes, because two CR/LF pairs (in the text file) indicate a
paragraph in the human sense.

Is all well? Not necessarily. The beginning of a business letter might be:

|November 21, 1993
|
|The XYZ Company
|1234 Street, Suite 101
|Anytown, ST  99999
|
|Dear Sirs:
|
|     We would like to call to your attention ...

The program cannot know that the addressee block consists of what should be three WordPerfect
paragraphs. Consequently, we will get:

The XYZ Company1234 Street, Suite 101Anytown, ST  99999

Dear Sirs:

     We would like to call to your attention ...

In this example and if nothing else is awry, you need only enter two [HRt] codes to straighten
out the addressee block.

This concludes the first look at CNVTXTWP. In section CASE STUDIES, we examine the use
of the program's options for more complex formatting requirements.


CASE STUDIES

The order of presentation of the following examples purports to meet real-world needs rather
than the order in which this program's options appear on the screen. We will not cover every
theoretically possible combination, but these examples should address most situations.

Handling Hyphenation and Other Characters

Text files can have end-of-line hyphenation. Consider the following:

|This text probably emerged from a word processor such that hy-
|phenation was turned on. The following will probably happen:
|
|Assuming that CR/LF's go away (except to separate paragraphs),
|the word hy-phenation will appear thusly within the converted
|text.
|
|As the reader knows, even if the printer happens to be bi-
|directional, the same thing will occur.

Bit of a sticky wicket. We don't want "hy-phenation." However, we probably want to retain
"bi-directional."

Taking the default options, the preceding example was massaged by the program and came out
as follows:

This text probably emerged from a word processor such that hy-
phenation was turned on. The following will probably happen:

Assuming that CR/LF's go away (except to separate paragraphs), the
word hy-phenation will appear thusly within the converted text.

As the reader knows, even if the printer happens to be bi-directional,
the same thing will occur.

(Be advised that this program converts an ASCII '-' into a hyphen character, which appears in
the Reveal Codes window as: [-].)

Let's turn to the option:

  Hard return after end of line /, \, |, #, *, _, =, +, or - : _

If the truth be known, this option was implemented for a reason unrelated to the current issue
of embedded vs. nonembedded hyphens. As it turns out, the side-effect of this option may be the
more important, because it can help us solve the current problem, which is to rid ourselves of
unwanted hyphenation. (We'll get to the other reason in a moment.)

Make one change to the program's default options by typing N to change the options and typing
Y for this option. (The fact that leading spaces are being removed does no good for this
example, but it does no harm either.) Run the current text file example through the program
again. The result within WordPerfect is:

This text probably emerged from a word processor such that hy-
phenation was turned on. The following will probably happen:

Assuming that CR/LF's go away (except to separate paragraphs), the
word hy-phenation will appear thusly within the converted text.

As the reader knows, even if the printer happens to be bi-
directional, the same thing will occur.

View the first and fifth lines (not counting blank lines) through the Reveal Codes window:

This text probably emerged from a word processor such that hy[-][HRt]

and

As the reader knows, even if the printer happens to be bi[-][HRt]

(There is still "hy-phenation" within the text, but that's how we planned it at the outset.)
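
The effect of this option on a single line can be sketched in Python. The sketch is an
illustration, not the program's code. It assumes that the test looks at the very last character
before the CR/LF, and it writes hidden codes as bracketed text. The name convert_line is
hypothetical.

```python
BREAK_CHARS = "/\\|#*_=+-"            # the characters listed in the option text

def convert_line(line, hard_return=True):
    """Convert one line (without its CR/LF) as this option would."""
    out = line.replace("-", "[-]")    # an ASCII '-' becomes a hyphen character
    if hard_return and line and line[-1] in BREAK_CHARS:
        out += "[HRt]"                # keep the line break as a hard return
    return out
```

Applied to a line ending in "hy-", the function returns the line ending in "hy[-][HRt]", which
is the pattern searched for in the Replace step that follows.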

The next step is up to the user. At the top of the document, press Alt+F2 (Replace). Type Y
to confirm replacement. Type '-' (without the quotes) to get [-], and press Enter to get [HRt].
Press F2 for the replacement string, and immediately press F2 to indicate nothing (questionable
English, good WordPerfect). Answer Y to replace end-of-line hyphenation (only where you
really want to remove the hyphen).

Now make a second sweep. This time leave the [-], but remove the [HRt].

The next example was the rationale for this option in the first place:

Engineering specification documents may have banner-like uses of '*', '=', etc. For example:

|****************************************************************
|                  SPECIFICATIONS FOR PROJECT XXX
|****************************************************************

Of course, each line is terminated by CR/LF in the text file. And, of course, without
corresponding hard returns, these three lines appear on one line (or until a [SRt] takes hold). This
is not as big a deal as the pesky hyphens just described; there usually aren't too many of these.

Nonetheless, the option just described will handle this circumstance up to a point (which, as
alluded to above, is why it was implemented). It still won't come out exactly right (trying it out
is left as an exercise to the user), but you won't lose your bearings entirely.

ASCII (as opposed to graphic) line draw most often uses the characters indicated in this option.
Let's take another look at the screen display of the option:

  Hard return after end of line /, \, |, #, *, _, =, +, or - : _

Something like

|         +----------------+
|         |                |
|         |                |
|         +----------------+

comes through intact (unless you compress multiple embedded spaces).

Business letters in ASCII text often use the underscore for the signature block:

|Sincerely,
|
|
|___________________________________
|A. B. Smith, Vice-President

Full Paragraph Indentation

Consider the following text file passage:

|1.2.3     Paragraph Blocks
|
|          This section deals with the need to consider paragraphs
|          which are fully indented from the left margin. In the
|          WordPerfect sense, this means that the first line of
|          this paragraph starts with the [Indent] code, and
|          the rest of the paragraph lines up accordingly.

If we take the default options, which leads to removing all leading spaces, then we get:

1.2.3     Paragraph Blocks

This section deals with the need to consider paragraphs which are
fully indented from the left margin. In the WordPerfect sense, this
means that the first line of this paragraph starts with the [Indent]
code, and the rest of the paragraph lines up accordingly.

This has done something for us, because we can pop in an [Indent] with one tap of the Indent
key (F4 in DOS; F7 in Windows CUA) just before the word This.

      This section deals with the need to consider paragraphs which are
      fully indented from the left margin. In the WordPerfect sense,
      this means that the first line of this paragraph starts with the
      [Indent] code, and the rest of the paragraph lines up
      accordingly.

We completed this conversion by blocking and bolding [Indent].

Let's look at the third and fourth options (with their default settings):

  Remove all leading spaces (not [Indent] paragraph) ^.......: Y
  Remove all leading spaces and [Indent] paragraph ^.........: N

This time around, type N to the first of these options and then Y to the second of these options.
Type Q to return to the display of options and then Y to start the conversion. We get:

1.2.3     Paragraph Blocks

      This section deals with the need to consider paragraphs which are
      fully indented from the left margin. In the WordPerfect sense,
      this means that the first line of this paragraph starts with the
      [Indent] code, and the rest of the paragraph lines up
      accordingly.

If you examine this through the Reveal Codes window, then you will see that there is an [Indent]
code at the beginning of the paragraph block, and the remaining lines are left-justified.
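
The fourth option can be pictured with a short Python sketch. This manual does not say how the
program decides that a paragraph is fully indented, so the sketch assumes the simplest test:
every line of the paragraph begins with at least one space. The name mark_indent is
hypothetical, and the [Indent] code is written as bracketed text.

```python
def mark_indent(lines):
    """Strip leading spaces; put [Indent] first if every line was indented."""
    stripped = [line.lstrip(" ") for line in lines]
    if lines and all(line.startswith(" ") for line in lines):
        stripped[0] = "[Indent]" + stripped[0]
    return stripped
```

A paragraph whose first line alone is indented comes back with its spaces removed and no
[Indent], which corresponds to the third option.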

This program does not currently address handling a section heading, such as the legal paragraph
numbering used in this example (see section FUTURES for a possibility), and there are still
spaces between the paragraph number and the heading.

Compress Embedded Spaces to One Space

This is an example of multiple embedded spaces (with some embellishments thrown in):

|     This   is  an     example   of  text   with  em-
|bedded      spaces  within      the  body  of  a text              
|file.   Note  that  there  is  also  an   end-of-line
|hyphen.

We direct our primary attention to the multiple embedded spaces, which can result from full
justification of a monospaced font (usually Courier). We also note that there is hyphenation.
Finally, the first sentence of the paragraph is indented (which we assume to be spaces).

Let's look again at the available options and determine which ones can help us:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: N
  Remove all leading spaces (not [Indent] paragraph) ^.......: Y
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : N
  Include hard page breaks ..................................: N

The second option (as described earlier) applies to replacing the leading spaces of the first
sentence of our example with at least one [TAB]. (Saying Y to this option eliminates the
remaining two options of that group, which is O.K., because they don't apply here.)

The fifth option is what we're after, because our goal is to compress the embedded spaces.

The seventh (next to the last) option can help us, because there is an end-of-line hyphenation.

Typing Y to the appropriate options gives us:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: Y
  Remove all leading spaces (not [Indent] paragraph) ^.......: N
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: Y
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : Y
  Include hard page breaks ..................................: N

^, ^^ = These options are mutually exclusive within their groups.

Are options correct? (Y = Yes; N = No, reset them; Q = Quit program):

Type Y to accept these options, and retrieve the document into WordPerfect. We get:

      This is an example of text with em-
bedded spaces within the body of a text file. Note that there is also
an end-of-line hyphen.

Place the cursor on the end-of-line hyphen, and press Delete twice (to delete both the hyphen
and the [HRt]). This gives us:

      This is an example of text with embedded spaces within the body
of a text file. Note that there is also an end-of-line hyphen.

There's a potential downside to this. If you intend to use a fixed (monospaced) font such as
Courier, then you may insist on having two spaces after each period (terminating a sentence).
Of course, it's practically a flick of the wrist to globally change such occurrences to a period and
two spaces. Then (what else?) there will be instances of needing a period and one space.

This program could have been designed not to overly compress period space combinations, but
that would have introduced the mirror-image problem for proportionally spaced fonts.

As stated at the outset of this document, you have to be the judge.
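
For readers who think in code, the compression in this example can be sketched roughly as
follows. This is a minimal Python illustration, not the program's actual source; the function
names compress_embedded_spaces and restore_sentence_spacing are made up for this sketch, and
leading spaces are assumed to be handled by the other option group.

```python
import re

def compress_embedded_spaces(line: str) -> str:
    """Collapse each run of two or more embedded spaces to one space.

    Leading spaces are passed through untouched, because a separate
    group of options deals with them.
    """
    body = line.lstrip(" ")
    leading = line[: len(line) - len(body)]
    return leading + re.sub(r" {2,}", " ", body)

def restore_sentence_spacing(text: str) -> str:
    """Widen 'period + one space' to 'period + two spaces'.

    This is the global change mentioned above. It also widens
    abbreviations such as 'Mr. Smith', which is the catch.
    """
    return re.sub(r"\. (?! )", ".  ", text)
```

The second function shows why the global change is not free: it cannot tell the end of a
sentence from an abbreviation, so a second pass by eye is still needed.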

Compress Embedded Spaces to One [TAB]

Statistical documents have tables. This, of course, means "tables" in the human sense of columns
-- not necessarily the WordPerfect Tables feature. Suppose we have the following text file table:

|Table 1.2 summarizes the sales figures for the first quarter of the
|year (in dollars):
|
|                   Jan          Feb         Mar
|                   ---          ---         ---
|      Item A     1,234          400         950
|      Item B       456        1,200         780
|                 =====        =====       =====
|      Total      1,690        1,680       1,730

All separators are spaces. There are no multiple spaces within the explanatory text. Hyphens (or
dashes) and the equal sign are used for caption and totalling purposes. There are leading spaces.

This program makes an assumption about compression and CR/LF. If the option to compress
multiple embedded spaces to one space is set to Y, then the text file's CR/LF pair is treated as
being located within normal text and is omitted, except as described throughout the previous
examples.

If, however, a CR/LF pair terminates a line for which there has been compression to a [TAB],
then the program treats this line as part of a table and inserts a [HRt].
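
The rule just stated can be sketched in Python. This is an illustration under assumptions, not
the program's source: the function name convert_line_tab_mode is hypothetical, the WordPerfect
codes are stood in for by the plain strings "[TAB]" and "[HRt]", and blank lines, the leading
space options, and the end-of-line character option are all ignored.

```python
import re

TAB = "[TAB]"  # placeholder string for the WordPerfect tab code
HRT = "[HRt]"  # placeholder string for the WordPerfect hard return code

def convert_line_tab_mode(line: str) -> str:
    """Convert one text-file line (CR/LF already removed) with the
    'compress multiple embedded spaces to one tab' option set to Y."""
    body = line.lstrip(" ")
    leading = line[: len(line) - len(body)]
    compressed, count = re.subn(r" {2,}", TAB, body)
    if count:
        # At least one run was compressed: treat the line as a table row.
        return leading + compressed + HRT
    # No compression on this line: treat it as normal text, drop the
    # CR/LF, and make sure a space separates it from the next line.
    if compressed.endswith(" "):
        return leading + compressed
    return leading + compressed + " "
```

A table row such as the "Item A" line comes out with single tabs and a hard return, while a
line of explanatory text simply flows into the next one.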

Let's again look at the options with their default settings:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: N
  Remove all leading spaces (not [Indent] paragraph) ^.......: Y
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : N
  Include hard page breaks ..................................: N

We don't need the second option, because the explanatory text is not first-line space indented.
We gain something by using either the third or fourth option (we can't have both). The third
option would at least remove the tedium of going through the converted document and manually
removing leading spaces in anticipation of inserting [TAB]'s or [Indents]. The fourth option
would do some work for us. We want the sixth option, because that's what this example is about.
We also want the seventh option, because that will put a [HRt] after the end-of-line '-' or '='.

Let's go for broke with the following option settings, and see what happens:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: N
  Remove all leading spaces (not [Indent] paragraph) ^.......: N
  Remove all leading spaces and [Indent] paragraph ^.........: Y
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: Y
  Hard return after end of line /, \, |, #, *, _, =, +, or - : Y
  Include hard page breaks ..................................: N

^, ^^ = These options are mutually exclusive within their groups.

Are options correct? (Y = Yes; N = No, reset them; Q = Quit program):

Anchors aweigh, and bring it into WordPerfect:

Table 1.2 summarizes the sales figures for the first quarter of the
year (in dollars):

      Jan    Feb   Mar
      ---    ---   ---
      Item A       1,234        400   950
      Item B       456   1,200        780
      =====        =====        =====
      Total        1,690        1,680       1,730

There is a default tab setting just before the converted example. There are leading [Indent]s and
embedded single [TAB]s (which you can confirm by viewing this document within
WordPerfect).

If setting tabs is second nature to your working with WordPerfect, then skip over the following
advice. Otherwise, lend an ear.

In the next to the last row (containing '====='s), insert a [TAB] between the [Indent] and the
first '='. Position the cursor to just after the right-most '-', and head for Tab Set
(Format,Line,Tab Set). Home,Home,Left-arrow. Ctrl+End. Now we have a clean slate. Place
the cursor to where you want the line captions ("Item A" ... ) to begin and press L. Move the
cursor to the right, and press D for Decimal. Adjust this setting with Ctrl+Left-arrow or
Ctrl+Right-arrow (in WordPerfect for Windows, you can return to the edit screen and use the
mouse). Do the same for the next two columns. Exit (F7) your way back to the edit screen.  

Now move the cursor to just before the [Indent] preceding "Jan". As before, get to the tab ruler,
clear the decks, and set L, R, or C for each column title, adjusting the settings to your
satisfaction.

One possible result follows:

Table 1.2 summarizes the sales figures for the first quarter of the
year (in dollars):

                      Jan         Feb          Mar
                      ---         ---          ---
      Item A        1,234         400          950
      Item B          456       1,200          780
                    =====       =====        =====
      Total         1,690       1,680        1,730

If you really don't want tabbed tables -- you're just trying to get something printed -- then say
N to all options (other needs being equal or ignored). Then go through the table inserting hard
returns, i.e., work the Enter key. This is by no means a kludge. It may be just what you want.

There is, alas, a major downside to all this. Most text files which this writer has encountered
follow the typewriter practice of '.  ' -- meaning a period and two spaces. Using the approach
just described, the resulting document elsewhere is riddled with [TAB] codes. And, of course,
if the text was originally full justified such that we now have varying numbers of embedded
spaces, then the situation has gone from difficult to hopeless. Now what?

As you know, you can separate the original text file by bringing it into WordPerfect, using the
Doc 1 and Doc 2 screens to cut the file into sections, and then saving the sections as DOS Text
files (Ctrl+F5,T,S). Hint: When you're doing this kind of work in a particular subdirectory
which may not be (and maybe shouldn't be) your usual working directory, pressing F5,= lets
you change your default directory. This way, you can create new file names without having to
type in a full path name or wonder later on where the files went. (Many users know about this,
but apparently few take advantage of it.) Of course, you have to remember to reset the default
directory when the need arises.

Now run the sections through the program with the appropriate options. Then bring each section
into WordPerfect (easy to find with F5, because you set the default directory) -- assuring
WordPerfect each time that, yes, you really do want to retrieve the file into the current
document.


ASCII Box Characters into Line Draw

The following appears in a text file:

|Here is one way to express the relationship:
|
|   ╔═════════════╗       ╔═════════════╗
|   ║             ║   R   ║             ║
|   ║   Entity    ║ ────> ║   Entity    ║
|   ║             ║       ║             ║
|   ╚═════════════╝       ╚═════════════╝

Set this option to Y:

  Remove all leading spaces and [Indent] paragraph ^.........: Y

(Remember to set the preceding options of this group to N.)

This gives us within the WordPerfect document:

Here is one way to express the relationship:

      ╔═════════════╗       ╔═════════════╗
      ║             ║   R   ║             ║
      ║   Entity    ║ ────> ║   Entity    ║
      ║             ║       ║             ║
      ╚═════════════╝       ╚═════════════╝

Note that you did not set any option to obtain Line Draw. (We set the option to provide a
leading [Indent], in order to get an offset.)

For the technically inclined, the program recognizes the range of ASCII characters from decimal
176 through 223 (or, in programming terms, B0h through DFh). The program always inserts a
[HRt] in place of the CR/LF combination when there is at least one box character on that line.

Furthermore, if either of the compression options is in effect, then that option does its thing only
for those lines not containing at least one box character (this is not demonstrated here, but it has
been tested). On this score, at least, you can have your cake and eat it too.
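
The box character test lends itself to a short sketch. The following Python is an illustration
only (the names has_box_character and should_compress are hypothetical); it assumes the line
is available as raw bytes in the PC character set, before any translation.

```python
BOX_FIRST = 0xB0  # decimal 176
BOX_LAST = 0xDF   # decimal 223

def has_box_character(line: bytes) -> bool:
    """True if the line holds at least one byte in the Line Draw range."""
    return any(BOX_FIRST <= byte <= BOX_LAST for byte in line)

def should_compress(line: bytes, compression_on: bool) -> bool:
    """Compression applies only to lines without box characters."""
    return compression_on and not has_box_character(line)
```

A line containing even one box character therefore keeps its spacing and, as described above,
gets a [HRt] in place of its CR/LF.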

If the original file uses tabs rather than spaces (not likely, but possible) within the Line Draw
range, then you may have some work to do at the edit screen, because the program always
converts an ASCII tab to [TAB].

Do You Really Want a Hard Page?

On the one hand, we really don't want to sprinkle hard page breaks [HPg] throughout a
document. Obviously, you need them for Table of Contents, Lists, Endnotes, etc., as well as for
chapter breaks (as was done in this document), but that's about it.

On the other hand, they might be just the ticket to help you isolate page breaks as they appeared
in the original text file, which is to say: track them down with Search (F2). Text files with page
breaks customarily have page lengths of about 66 lines with page numbers, and you don't want
these in their original form in your final WordPerfect document.

Here is an extract from a documentation text file (slightly compressed):

|XYZ User's Guide                                         Page 3
|^LGetting Started 
|_______________________________________________________________
|
|Introduction 
|

The ^L (Ctrl+L) is what you would see within a text editor. (WordPerfect on its own converts
^L to [HPg], and we don't want that for this description.)

The appropriate (and last) option is (with its default setting):

  Include hard page breaks ..................................: N

The options for processing the text file were set as follows:

  Remove trailing spaces ....................................: N
  Replace 1st line paragraph indenting spaces with tab(s) ^..: N
  Remove all leading spaces (not [Indent] paragraph) ^.......: Y
  Remove all leading spaces and [Indent] paragraph ^.........: N
  Compress multiple embedded spaces to one space ^^..........: N
  Compress multiple embedded spaces to one tab ^^............: N
  Hard return after end of line /, \, |, #, *, _, =, +, or - : Y
  Include hard page breaks ..................................: Y

The result in WordPerfect is:

XYZ User's Guide                                         Page 3
===============================================================
Getting Started
_______________________________________________________________

Introduction 

We're still simulating this example to some extent, because there was actually a [HPg] after the
digit 3, which, as you know, produces the double-dashed line (here, '='s) in a true setting.

Simulation notwithstanding, the outcome is the real McCoy, and we now want to remove the
[HPg]s. (Actually, we also want to remove the text file footer and possibly replace it with a
WordPerfect footer).

This concludes the case studies. The next section deals with the issue of trailing spaces.


TRAILING SPACES

A trailing space in a text file is one that is located between the last text character (other than a
space) on a line and the CR/LF pair which terminates that line. For example:

This brief paragraph demonstrates the effect <
of spaces separating words at the CR/LF pair <
terminating a line in a text file.<

(The '<' characters were inserted in the edit screen by WordPerfect to indicate hard returns.
They wouldn't of course appear in a document.)

Note that each of the first two lines has a trailing space.

The program converts this text with the default options, and the result within WordPerfect is:

This brief paragraph demonstrates the effect of spaces separating
words at the CR/LF pair terminating a line in a text file.<

Here is the same text file passage with the trailing spaces removed:

This brief paragraph demonstrates the effect<
of spaces separating words at the CR/LF pair<
terminating a line in a text file.<

Run this text through the program with the default options and bring it into WordPerfect:

This brief paragraph demonstrates the effect of spaces separating
words at the CR/LF pair terminating a line in a text file.<

There is no difference between the two paragraphs within WordPerfect, and this leads to the first
point: When the program elects to drop a CR/LF pair, it determines whether there is a space
directly before the CR/LF. If that is the case, then processing continues (having dropped the
CR/LF). However, if there is no space just before the CR/LF, then the program writes a space
before continuing; otherwise, the last word of a line and the first word of the next line would
most likely run together. (There's no way to predict how WordPerfect will insert [SRt]s.)
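
The first point can be expressed in a few lines of Python. This is a sketch of the rule as
described, not the program's code; the function name drop_crlf is made up for the illustration.

```python
def drop_crlf(current: str, next_line: str) -> str:
    """Join two text-file lines when the CR/LF between them is dropped.

    If the first line already ends with a space, nothing is added;
    otherwise one space is written so the words do not run together.
    """
    if current.endswith(" "):
        return current + next_line
    return current + " " + next_line
```

Either form of the example passage (with or without one trailing space) yields the same joined
text, which is why the two paragraphs look identical within WordPerfect.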

If we can describe a text file as being "in good form," then it is fair to say it should not have
trailing spaces. In a text file, the CR/LF by definition serves to separate words. (By analogy, go
through a multi-line paragraph within WordPerfect in the Reveal Codes window, and count how
many times there's a space before the [SRt]).

Now to the second point.

Here is the same text file passage with multiple trailing spaces:

This brief paragraph demonstrates the effect   <
of spaces separating words at the CR/LF   <
terminating a line in a text file.<

Convert this text with the default options, and retrieve it into WordPerfect:

This brief paragraph demonstrates the effect   of spaces separating
words at the CR/LF   terminating a line in a text file.<

Clearly, we've got spaces where we don't want them.

Recall that the program's first option is:

  Remove trailing spaces ....................................: N

With this option set to N, a text file with one or more trailing spaces on a line causes the
program to display this message at the completion of conversion:

  CNVTXTWP 1.1 Post Conversion Advisory:

  The program detected at least one occurrence of a space
  followed by a CR/LF pair.

  There is nothing inherently wrong about this, but you may
  want to consider running the program on this file with the
  "Remove trailing spaces" option set to Y, if it appears
  that the WordPerfect document has unexpected multiple spaces
  between words.

This is not an error condition. If the document looks all right within WordPerfect, then nothing's
amiss. However, if there appear to be extra spaces, then this may be the reason.

Set the first option to Y (with the other options at their default values), and run the program on
the text example just given (with multiple trailing spaces):

This brief paragraph demonstrates the effect of spaces separating
words at the CR/LF terminating a line in a text file.<

All three acceptable results are identical within WordPerfect, and that is what we want.
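
As a rough Python sketch of the second point (an illustration with hypothetical names, not the
program's source), the advisory test and the "Remove trailing spaces" option might look like
this, with each line given without its CR/LF:

```python
def needs_advisory(lines: list) -> bool:
    """True if any line has a space directly before its CR/LF."""
    return any(line.endswith(" ") for line in lines)

def remove_trailing_spaces(lines: list) -> list:
    """Strip every trailing space, as the first option does when set to Y."""
    return [line.rstrip(" ") for line in lines]
```

Once the trailing spaces are gone, the line-joining step supplies exactly one space between
lines, so the multiple trailing spaces never reach the document.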

(Another reason for having multiple embedded spaces -- and the more likely one -- is that the
text file has leading spaces for paragraph offset, and you did not specify either removing the
leading spaces entirely or offsetting such paragraphs with the [Indent] code.)

A number of files used for developing and testing this program had one trailing space on most
lines. None had multiple trailing spaces. However, the output of a text editor (unlike that of a
word processor) ordinarily does not remove trailing spaces; so it can happen.

This concludes our topic on trailing spaces. The next section offers some suggestions on
converting a text file into a WordPerfect document.


LAST LOOK

This writer would like to take the liberty of making the following recommendations, which you
may find helpful (depending on how much file conversion you've done or plan to do):

         Recall that this program does not alter the input text file. This is important, because
          you need this file, in order to check your conversion against the original format (and
          altering a user's original file is against the law).

          You should have a printout of the file. You could of course import the file directly
          into WordPerfect and then print it, but, based on this writer's experience, you may
          find it hard to read. (For one thing, the default margin settings almost guarantee in
          many instances that there will be text wraparound, and you can't set the L/R margins
          to 0 and 0.)

          Therefore, unless you already have a hard copy, print the file from the DOS
          command line with the PRINT command. Presuming that you do have a bona-fide
          text file, this will give you an exact image of the file with which to work.

          If you're familiar with using the PRINT command, this concludes this
          recommendation. If not, read on.

          The PRINT command is a TSR (Terminate and Stay Resident), which is a backwards
          description, because the program first becomes resident and then terminates. In any event,
          this means that a portion of the program remains in memory. Do not execute this
          command -- at least the first time -- from within WordPerfect (or from within any
          other program which can "shell out" to DOS). This will confuse the active program
          no end with respect to what it thought it had for available memory.

          Once executed, the resident part of the PRINT command occupies only about 6K
          bytes, which in itself shouldn't reduce available memory all that much. However, if
          you process very large documents, be aware that WordPerfect 5.1 for DOS works in
          the 640KB of conventional memory (although it can make use of expanded memory).
          You may want to reboot the system, in order to flush out the TSR (unless your
          system is a critical part of a network). Use the Reset button, if one's available.
          Otherwise, Ctrl+Alt+Del. (And, of course, you may have other TSRs at work,
          especially starting with DOS 5.0.)

         Being a WordPerfect user, you're a veteran of the Replace (Alt+F2) feature. Still,
          it's easy to forget that it can do a lot of work for you whether or not you're
          converting files. (We've already mentioned its use earlier). Two requests: 1) Save
          your work with Save (F10) before making any global changes, and 2) Answer Yes
          to the w/Confirm? question, unless you're absolutely sure that you don't need to.

          This comes in handy for text files which have paragraphs offset with an asterisk, dash
          (hyphen), or possibly the letter o (in lieu of a bullet). Specify that you want the
          program to replace leading spaces with the [Indent] if the text "bullet" itself is offset
          with leading spaces. Within WordPerfect, change the "bullet" and following spaces
          to an appropriate substitute followed by [Indent]. The result may not be precisely the
          same as the original, but it will be more suitable for the WordPerfect environment.

This concludes some basic recommendations. The next section is for the user who wants (or
needs) to know some more about the properties of a text file.


ABOUT ASCII TEXT FILES

Formally stated (more or less), a DOS text file contains characters in displayable format. ASCII
is an acronym for American Standard Code for Information Interchange, the standard which
describes the character encoding scheme in technical terms. Each line (except possibly the last)
is terminated with a CR/LF (Carriage Return/Line
Feed) pair. The file itself is terminated with a ^Z (Ctrl+Z), which signifies an end-of-file
condition (this is a DOS, rather than ASCII, characteristic).

Displayable characters such as faces happy or otherwise do not necessarily disrupt this definition
-- although they stretch the point. Such characters on the PC are special instances of what are
called control characters. Among the control characters, this program processes only CR, LF,
and FF (Form Feed), the latter of which corresponds to the WordPerfect hard page. In addition
to text and selected control characters, the program processes only ASCII box characters, which
correspond to WordPerfect Line Draw (as demonstrated earlier). The program detects and then
drops all other characters.

The program stops (with a message) if it encounters either a zero length line or a NULL (zero)
character. It is certain that either the input file has been corrupted (not likely), or (far more
likely) you inadvertently specified a binary file (executable program, spreadsheet, etc.).
A binary encoded character is distinctly not a text character, because it has "meaning" only to
a computer program -- not the user.

This program only translates the CR (Carriage Return) and LF (Line Feed) characters if they
appear in the text file as CR/LF pairs, because this is what mimics the behavior of a manual
typewriter (there used to be such things). 

The program tells you in a Post Conversion Advisory if it detected a "standalone" CR or LF.
(A standalone character in this context is one that should occur as one of a pair.) A standalone
character is ignored -- rather than being passed on to the WordPerfect document. The presence
of a standalone CR or LF is not an error condition, but it does indicate that the input file is
probably not a text file as strictly defined.

If the program did not detect ^Z (Ctrl+Z) in the input file, then the input file is not a DOS
ASCII text file. The program so informs you in the Post Conversion Advisory. As with the
standalone characters, this is not an error condition, but it should be brought to your attention,
in case things in the WordPerfect document cause you to question the nature of the original file.
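
The two advisory checks just described (standalone CR or LF, and a missing ^Z) can be sketched
in Python as follows. This is an illustration under assumptions, not the program's source: the
function name advisory_checks and the dictionary keys are hypothetical, and the whole file is
assumed to be read into memory as bytes.

```python
CR, LF, CTRL_Z = 0x0D, 0x0A, 0x1A

def advisory_checks(data: bytes) -> dict:
    """Count standalone CR and LF bytes and note whether ^Z is absent."""
    standalone_cr = 0
    standalone_lf = 0
    i = 0
    while i < len(data):
        # A CR directly followed by LF is a proper pair; skip both.
        if data[i] == CR and i + 1 < len(data) and data[i + 1] == LF:
            i += 2
            continue
        if data[i] == CR:
            standalone_cr += 1
        elif data[i] == LF:
            standalone_lf += 1
        i += 1
    return {
        "standalone_cr": standalone_cr,
        "standalone_lf": standalone_lf,
        "missing_ctrl_z": CTRL_Z not in data,
    }
```

Neither finding stops the conversion; each one only feeds the Post Conversion Advisory.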

On balance, the input file must be either a full-fledged text file or a very close approximation.

This concludes our summary of text files. The next section describes some possible enhancements
to the program.


FUTURES

The following come to mind:

         The program could recognize numeric paragraph (or section) numbers 1, 1.1,
          1.1.1, etc. and convert them into Automatic Paragraph Numbers (the user would
          have to provide the definition). This would include replacing following spaces (or
          tabs) with one [Indent] if the spaces are followed by a text character other than
          another space or CR/LF.

         It would be convenient to have a CNVTXTWP.SET file indicated by a command line
          option, such as: /S-C:\<dirname>\CNVTXTWP.SET, or a default to the current
          directory. It would then not be necessary to reset the default options, if you have a
          preferred standard setting. (There is not a panoply of command line switches, because
          this program is directed more to an office environment than to a programming shop
          -- the latter being amenable to them; and the former, averse.)

         The convenience of a dialog box is obvious. All the better if it were done in
          Windows (although it remains to be seen if DOS text mode boxes are preferable).

It all depends on the feedback (and this writer's time). If this program meets a need, then it's
worth the effort. If not, then it will have one user who saves a lot of time.




