2 - Guidelines for HTML Source Files
HTMLDOC can process most HTML files. This chapter describes the
requirements an HTML file must meet to be processed correctly by HTMLDOC.
General Requirements
Because HTMLDOC is designed as a documentation generation program,
it expects source files to be organized into chapters and headings.
NOTE: If you are converting a generic web page, you must
select the "--webpage" option on the command line or choose "Web Page"
as the document type in the GUI.
Page Breaks
To force a page break, use the HR markup with the "BREAK" option:
<HR BREAK>
Chapters
All chapters start with a top-level heading (H1) markup. Any headings within
a chapter must be of a lower level (H2 to H7). Each chapter starts a new
page (the next odd-numbered page if duplexing is selected).
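As a minimal sketch of the chapter rule, the fragment below defines two chapters, each opened by an H1 and containing only lower-level headings. The chapter titles and paragraph text are hypothetical placeholders.

```html
<HTML>
<BODY>

<!-- First chapter: starts with H1, contains only H2 and lower -->
<H1>Introduction</H1>
<P>Opening text for the first chapter.</P>
<H2>Overview</H2>
<P>Text for the first section.</P>

<!-- Second chapter: the next H1 starts a new chapter on a new page -->
<H1>Installation</H1>
<P>Opening text for the second chapter.</P>
<H2>Requirements</H2>
<P>Text for the first section of the second chapter.</P>

</BODY>
</HTML>
```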
Headings
The headings you use within a chapter must start at level 2 (H2). If you
skip levels, the heading will be shown under the last level that was known.
For example, if you use the following hierarchy of headings:
<H1>Chapter Heading</H1>
...
<H2>Section Heading 1</H2>
...
<H2>Section Heading 2</H2>
...
<H3>Sub-Section Heading 1</H3>
...
<H4>Sub-Sub-Section Heading 1</H4>
...
<H4>Sub-Sub-Section Heading 2</H4>
...
<H3>Sub-Section Heading 2</H3>
...
<H2>Section Heading 3</H2>
...
<H4>Sub-Sub-Section Heading 3</H4>
...
the table-of-contents that is generated will show:
Chapter Heading
- Section Heading 1
- Section Heading 2
  - Sub-Section Heading 1
    - Sub-Sub-Section Heading 1
    - Sub-Sub-Section Heading 2
  - Sub-Section Heading 2
    - Sub-Sub-Section Heading 3
- Section Heading 3
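The H4 that follows "Section Heading 3" in the example skips the H3 level, so its entry is listed out of place. One way to avoid this, sketched below, is to add the missing H3 before the H4. "Sub-Section Heading 3" is a hypothetical heading added for illustration and is not part of the example above.

```html
<H2>Section Heading 3</H2>
<P>Section text.</P>

<!-- Hypothetical H3 added so that no heading level is skipped -->
<H3>Sub-Section Heading 3</H3>
<P>Sub-section text.</P>

<H4>Sub-Sub-Section Heading 3</H4>
<P>Sub-sub-section text.</P>
```

With no level skipped, the H4 should be expected to appear under its own H3 and H2 in the table of contents.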
Unsupported or Restricted HTML Features
The following HTML features are either not supported or have limited support
in this release of HTMLDOC.
Embedded Objects
The EMBED tag supports only embedded HTML files.
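A minimal sketch of embedding another HTML file follows. It assumes the file is named through the SRC attribute, and "appendix.html" is a hypothetical filename.

```html
<H1>Appendix</H1>

<!-- Assumed usage: pull in the contents of another HTML file.
     "appendix.html" is a placeholder name. -->
<EMBED SRC="appendix.html">
```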
Fonts
Limited typeface specification is currently supported. The "Arial" typeface
is mapped to "Helvetica" to ensure portability across platforms and for
older PostScript printers. All other unrecognized typefaces are silently
ignored.
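The fragment below sketches both cases, assuming the typeface is requested with the FACE attribute of the FONT tag. "Hypothetical Typeface" is a made-up name standing in for any typeface HTMLDOC does not recognize.

```html
<!-- "Arial" is mapped to "Helvetica" in the output -->
<P><FONT FACE="Arial">This paragraph is set in Helvetica.</FONT></P>

<!-- An unrecognized typeface is silently ignored, so this
     paragraph keeps the typeface already in effect -->
<P><FONT FACE="Hypothetical Typeface">This paragraph is unchanged.</FONT></P>
```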
Forms
Forms are not yet supported when generating PostScript and PDF files.
Frames
HTMLDOC does not support frames.
Image Maps
Image maps are not exported to HTML or PDF files.
Links
External URL links are fully supported for HTML and PDF output. Internal
links are supported in HTML and PDF output.
When generating PDF files, links of the form:
<A HREF="file:filename.pdf">Link Text</A>
will be converted to external file links for the PDF viewer instead of
URL links.
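As an illustration of the three kinds of links described above, here is a minimal sketch. The anchor name, link text, URL, and filename are all hypothetical placeholders.

```html
<!-- Internal link: a named anchor and a reference to it -->
<H2><A NAME="install">Installation</A></H2>
<P>See <A HREF="#install">Installation</A> for details.</P>

<!-- External URL link -->
<P>Visit <A HREF="http://www.example.com/">the example site</A>.</P>

<!-- External file link in PDF output -->
<P>See the <A HREF="file:reference.pdf">reference manual</A>.</P>
```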
Scripts and Applets
All scripts and applets are silently stripped from the output.
Style Sheets
Style sheets are not yet supported.
Tables
Currently only the HTML 3.2 variant of tables is supported. The CAPTION,
THEAD, TFOOT, and TBODY tags are ignored.
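A table that stays within these limits can be written with TABLE, TR, TH, and TD alone. The sketch below uses hypothetical column names and cell values.

```html
<!-- Uses only TABLE, TR, TH, and TD; no CAPTION, THEAD,
     TFOOT, or TBODY, since those tags are ignored -->
<TABLE BORDER="1" CELLPADDING="2">
  <TR>
    <TH>Option</TH>
    <TH>Description</TH>
  </TR>
  <TR>
    <TD>First option</TD>
    <TD>Description of the first option</TD>
  </TR>
  <TR>
    <TD>Second option</TD>
    <TD>Description of the second option</TD>
  </TR>
</TABLE>
```

Because CAPTION is ignored, a table title is better placed in an ordinary paragraph or heading before the table.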