Document Segmentation

One of the greatest appeals of the World-Wide Web is its high connectivity through hyper-links. As we have seen, the LATEX author can provide these links either manually or symbolically. Manual links are more tedious, because the author must provide a URL for every link and update it every time the target documents change. Symbolic links are more convenient, because the translator keeps track of the URLs. Earlier releases of LATEX2HTML required the entire document to be processed together if it was to be linked symbolically. However, large documents could easily overwhelm the memory of moderate-sized computers. Furthermore, processing time could become prohibitive if even a small change required the entire document to be reprocessed.


For these reasons, program segmentation was developed. This feature enables authors to subdivide a document into multiple segments. Each segment can be processed independently by LATEX2HTML. Hypertext links between segments can be made symbolically, with references shared through auxiliary files. If a single segment changes, only that segment needs to be reprocessed (unless a label is changed that another segment requires). Furthermore, the entire document can be processed without modification by LATEX to obtain the printed version.


The top level segment that LATEX reads is called the parent segment.
The others are called child segments.


Document segmentation does require a little more work on the part of the author, who will now have to undertake some of the book-keeping formerly performed by LATEX2HTML. The following LATEX extensions carry out segmentation:

* \segment{<file>}{<sec-type>}{<heading>}
This command indicates the start of a new program segment. The segment resides in <file>.tex, represents the start of a new LATEX sectional unit of type <sec-type> (e.g., \section, \chapter, etc.) and has a heading of <heading>. (A variation of this command, \segment*, is provided for segments that are not to appear in the table of contents.)
These commands perform the following operations in LATEX:
  1. The specified sectioning command is executed.
  2. LATEX will write its section and equation counters into an auxiliary file, named <file>.ptr. It will also write an \htmlhead command to this file. This information will tell LATEX2HTML how to initialise itself for the new document segment.
  3. LATEX will then proceed to input and process the file <file>.tex.
The \segment and \segment* commands are ignored by LATEX2HTML.
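
As a rough sketch of how a parent segment might use these commands: the file names (report.tex, intro.tex, methods.tex, thanks.tex) and headings below are hypothetical, and the example assumes that the html package supplied with LATEX2HTML is loaded and that <sec-type> is written as a bare sectioning name (section rather than \section).

```latex
% report.tex: hypothetical parent segment
\documentclass{article}
\usepackage{html}      % assumed to define \segment, \internal, \startdocument

\begin{document}
\tableofcontents

% Each call starts a new section, writes <file>.ptr,
% and then inputs <file>.tex.
\segment{intro}{section}{Introduction}
\segment{methods}{section}{Methods}

% Starred form: this segment does not appear in the table of contents.
\segment*{thanks}{section}{Acknowledgements}

\end{document}
```

Running LATEX on the parent alone should then produce the complete printed version, since each \segment inputs its child file.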

* \internal[<type>]{<prefix>}
This command directs LATEX2HTML to load inter-segment information of type <type> from the file <prefix><type>.pl. Each program segment must be associated with a unique filename-prefix, specified either through a command-line option, or through the installation variable $AUTO_PREFIX. The information <type> must be one of the following:
  * internals
  This is the default type, which need not be given. It specifies that the internal labels from the designated segment are to be input and made available to the current segment.
  * contents
  The table of contents information from the designated segment is to be made available to the current segment.
  * sections
  Sectioning information is to be read in. Note that the segment containing the table of contents requires both contents and sections information from all other program segments.
  * figure
  Lists of figures from other segments are to be read.
  * table
  Lists of tables from other segments are to be read.
  * index
  Index information from other segments is to be read.
  * images
  Allows images generated in other segments to be reused in the current segment.

Note: If extensive indexing is to be used, then it is advisable to keep each <prefix> quite short. This is because the hyper-links in the index have text strings constructed from this <prefix>, when using the makeidx package. Having long names with multiply-indexed items results in an extremely inelegant, cumbersome index. See the section on indexing for more details.
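
To make the <type> options concrete, here is a minimal sketch of \internal lines that might appear in the preamble of the segment holding the table of contents and the index. The prefixes in and me are hypothetical; they are assumed to be the short filename-prefixes assigned to two child segments when those segments were processed.

```latex
% Preamble fragment of the segment that contains the table of contents.
% "in" and "me" are assumed prefixes of two other segments.

% The table of contents needs both contents and sections information
% from every other segment.
\internal[contents]{in}
\internal[sections]{in}
\internal[contents]{me}
\internal[sections]{me}

% Index entries gathered from the other segments.
\internal[index]{in}
\internal[index]{me}

% Default type (internals): labels defined in segment "me".
\internal{me}
```

Short prefixes such as these also keep the generated index entries compact, in line with the note above.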

* \startdocument
The \begin{document} and \end{document} statements are contained in the parent segment only. It follows that the child segments cannot be processed separately by LATEX without modification. However, they can be processed separately by LATEX2HTML, provided it is told where the end of the LATEX preamble is; this is the function of the \startdocument directive. It substitutes for \begin{document} in child segments, but is otherwise ignored by both LATEX and LATEX2HTML.
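
A child segment might then be laid out roughly as follows. This is a sketch under stated assumptions, not a prescribed layout: the file name intro.tex, the prefix me and the label name are hypothetical, and wrapping the preamble in the htmlonly environment of the html package is assumed here as the way to hide it from LATEX when the parent inputs the file.

```latex
% intro.tex: hypothetical child segment
\begin{htmlonly}
% Seen by LATEX2HTML only; skipped when LATEX inputs this file.
\documentclass{article}
\usepackage{html}
\input{intro.ptr}   % counters and \htmlhead written by \segment
\end{htmlonly}

\internal{me}       % labels from another segment (assumed prefix "me")

\startdocument      % marks the end of the preamble for LATEX2HTML

This text belongs to the introduction.
\label{sec:intro}   % hypothetical label that other segments could reference
```

Note that the file contains no \begin{document} or \end{document}; those stay in the parent segment.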

* \htmlhead{<sec-type>}{<heading>}
This command is generated automatically by a \segment command. It is not normally placed in the document at all; instead it facilitates information being passed from parent to child via the <file>.ptr file.
It identifies to LATEX2HTML that the current segment is a LATEX sectional unit of type <sec-type>, with the specified heading.
This command is ignored by LATEX. From version 97.1 onwards, it is possible to use this command to insert extra section-headings, for use in the HTML version only.

* \htmlnohead
When placed at the top of the preamble of a document segment, the \htmlnohead command discards everything that has already been placed on the current page. Usually this will be just the section-head, from the \htmlhead command in the .ptr file. Numbering and color information is unaffected.
This allows an alternative heading to be specified, or no heading at all in special circumstances; for example, when the page contains a single large table with a caption.
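
The single-table case could be sketched as below. The file name bigtable.tex and the table contents are hypothetical. Placing \htmlnohead directly after the .ptr file is read is an assumption, based on the description that the heading must already have been placed before it can be discarded; the htmlonly wrapper is likewise assumed.

```latex
% bigtable.tex: hypothetical child segment holding one large table
\begin{htmlonly}
\documentclass{article}
\usepackage{html}
\input{bigtable.ptr}  % supplies the section-head via \htmlhead
\htmlnohead           % assumed placement: drop that head from the HTML page
\end{htmlonly}

\startdocument

\begin{table}
  \caption{Results for all runs}  % the caption serves as the page heading
  \begin{tabular}{lr}
    Run & Score \\
    A   & 12    \\
    B   & 17    \\
  \end{tabular}
\end{table}
```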

* \segmentcolor{<model>}{<color>}
This command is generated automatically by a \segment command. It is not normally placed in the document at all; instead it facilitates information being passed from parent to child via the <file>.ptr file.
It specifies to LATEX2HTML that text in the document should have the color <color>.

* \segmentpagecolor{<model>}{<color>}
This command is generated automatically by a \segment command. It is not normally placed in the document at all; instead it facilitates information being passed from parent to child via the <file>.ptr file.
It specifies to LATEX2HTML that the background of the document should have the color <color>.

The use of the segmenting commands is best illustrated by the example below. You might want to check your segmented document for consistency using the -unsegment command-line option.


Ross Moore
1999-03-26