4 - Generating Documents from the Command-Line

This chapter describes how to generate one or more document files from a given set of HTML "source" files using the HTMLDOC software from the command-line. If you are converting web pages from HTML to PostScript or PDF format, be sure to look at the Converting Web Pages section.

Generating a Single File

To generate a single file containing the entire document, run HTMLDOC with the desired output file and your HTML source files as arguments. The "outfile.html", "outfile.pdf", and "outfile.ps" arguments name the desired output file; "infile1.html", "infile2.html", etc. are your HTML source files.
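The command line itself is not reproduced above, so the following is only a minimal sketch. It assumes that HTMLDOC's "-f" option names the output file (that option is not mentioned in the text) and it prints one command per output format instead of running anything.

```shell
# Dry run: build and print the command lines; htmldoc is not invoked.
# Assumption: "-f" names the output file; all file names are placeholders.
cmds=""
for ext in html pdf ps; do
  cmds="${cmds}htmldoc -f outfile.${ext} infile1.html infile2.html
"
done
printf '%s' "$cmds"
```

Pasting any one of the printed lines into a shell would perform that conversion.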

By default HTMLDOC looks at the extension of the output file to determine the output format. Files ending in ".ps" select Level 2 Adobe® PostScript™ output. For Level 1 PostScript see "Forcing Level 1 Output" below.

Generating Multiple Files

To generate multiple files for the document, run HTMLDOC with an output directory instead of an output file. The "outdir" argument is the desired output directory. The "-t html" argument selects HTML output, while "-t ps1" and "-t ps2" select Level 1 and Level 2 PostScript output, respectively. "infile1.html", "infile2.html", etc. are your HTML source files.

A separate HTML or PostScript file (doc####.html or doc####.ps) will be created for each chapter (H1 heading) in the document, as well as a table of contents file (index.html or index.ps). For HTML output, all local image files that are referenced in the document will be copied to the output directory as well.
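As a hedged illustration of the multi-file form, the sketch below prints one command per output type. The "-t" values come from the text; the "-d" option for naming the output directory is an assumption, since the text does not show it.

```shell
# Dry run: print the commands; htmldoc is not invoked.
# Assumption: "-d" names the output directory. The "-t" values come from the text.
for fmt in html ps1 ps2; do
  cmd="htmldoc -t ${fmt} -d outdir infile1.html infile2.html"
  echo "$cmd"
done
```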

Multiple output files are currently not supported for PDF output.

General Options

The following options apply to all output formats.

Numbering the Headings

Some types of documents require paragraph or heading numbers. To enable automatic heading numbering, use the "--numbered" option.
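A minimal sketch of a numbered-heading run might look like this. The "-f" output option and the file names are assumptions, not part of the text above.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumption: "-f" names the output file; file names are placeholders.
cmd="htmldoc --numbered -f outfile.pdf infile1.html infile2.html"
echo "$cmd"
```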

Adding a Logo Image

To include a document logo, use the "--logo" option to HTMLDOC. The logo image is optionally displayed in the page heading of PostScript and PDF output and at the top of the navigation bar along the left side of the page in HTML output. The logo file can be of any supported image file type (GIF, JPEG, PNG).
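The following sketch shows how a logo might be supplied. It assumes the option takes the image file name as its argument; "logo.gif" is a hypothetical file, and "-f" is assumed to name the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: "--logo" takes a file name; "logo.gif" is hypothetical; "-f" names the output.
cmd="htmldoc --logo logo.gif -f outfile.pdf infile1.html infile2.html"
echo "$cmd"
```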

Adding a Title Image

The title image is displayed on the title page. To include a title page image, use the "--title" option to HTMLDOC. The title image file can be of any supported image file type (GIF, JPEG, PNG).
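A sketch of the title image option as this chapter describes it follows. It assumes the option takes the image file name as its argument; "title.gif" is hypothetical, and "-f" is assumed to name the output file. Option names can differ between HTMLDOC releases, so it is worth checking the help output of the installed version.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: "--title" takes a file name as described in this chapter;
# "title.gif" is hypothetical; "-f" names the output file.
cmd="htmldoc --title title.gif -f outfile.pdf infile1.html infile2.html"
echo "$cmd"
```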

Converting Web Pages

To convert unstructured HTML documents such as web pages, use the "--webpage" option to HTMLDOC. This is equivalent to using the "--no-title" and "--no-toc" options.
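The equivalence described above can be sketched as two command lines that should produce the same kind of output. The page file names are hypothetical, and "-f" is assumed to name the output file.

```shell
# Dry run: print both forms; htmldoc is not invoked.
# Assumptions: "-f" names the output file; page1.html and page2.html are placeholders.
cmd_short="htmldoc --webpage -f outfile.pdf page1.html page2.html"
cmd_long="htmldoc --no-title --no-toc -f outfile.pdf page1.html page2.html"
echo "$cmd_short"
echo "$cmd_long"
```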

Setting the Table of Contents Depth

To set the number of heading levels shown in the table of contents, use the "--toclevels" option to HTMLDOC. The default depth is three levels (H1 to H3). To turn the table of contents off, use the "--no-toc" option.
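As an illustration, the sketch below prints one command that limits the table of contents to two levels and one that disables it. The depth of 2 is an arbitrary example, and "-f" is assumed to name the output file.

```shell
# Dry run: print the commands; htmldoc is not invoked.
# Assumptions: "-f" names the output file; the depth of 2 is an arbitrary example.
cmd_depth="htmldoc --toclevels 2 -f outfile.pdf infile1.html infile2.html"
cmd_notoc="htmldoc --no-toc -f outfile.pdf infile1.html infile2.html"
echo "$cmd_depth"
echo "$cmd_notoc"
```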

Disabling the Title Page

The title page is normally generated for all HTML, PostScript, and PDF output. To turn the title page off, use the "--no-title" option.
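A short sketch of a run without a title page, again assuming "-f" names the output file:

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumption: "-f" names the output file; file names are placeholders.
cmd="htmldoc --no-title -f outfile.pdf infile1.html infile2.html"
echo "$cmd"
```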

Changing the Body (Background) Color

Use the "--bodycolor" option to change the background color. The color can be any of the basic named colors (black, red, green, yellow, blue, magenta, cyan, or white) or a specific red-green-blue value.
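The sketch below shows a named color and a red-green-blue value. The text does not say how the RGB value is written, so the "#rrggbb" hexadecimal form used here is an assumption; it is quoted because "#" starts a comment in most shells. "-f" is assumed to name the output file.

```shell
# Dry run: print the commands; htmldoc is not invoked.
# Assumptions: RGB values are written as #rrggbb; "-f" names the output file.
cmd_named="htmldoc --bodycolor white -f outfile.html infile1.html"
cmd_rgb="htmldoc --bodycolor '#ffffcc' -f outfile.html infile1.html"
echo "$cmd_named"
echo "$cmd_rgb"
```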

Changing the Body (Background) Image

Use the "--bodyimage" option to change the background image. The image file can be any PNG, GIF, or JPEG image.
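A sketch of a background image setting follows. It assumes the option takes the image file name as its argument; "background.png" is hypothetical, and "-f" is assumed to name the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: "--bodyimage" takes a file name; "background.png" is hypothetical.
cmd="htmldoc --bodyimage background.png -f outfile.html infile1.html"
echo "$cmd"
```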

HTML-Specific Options

The following options apply to HTML output.

Changing the Navigation Bar Color

Use the "--barcolor" option to match the navigation bar color to your logo image. The color can be any of the basic named colors (black, red, green, yellow, blue, magenta, cyan, or white) or a specific red-green-blue value.
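For multi-file HTML output with a logo, a matching bar color might be requested as sketched here. The "-d" output directory option and the "logo.gif" file are assumptions.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: "-d" names the output directory; "logo.gif" is hypothetical.
cmd="htmldoc -t html --logo logo.gif --barcolor blue -d outdir infile1.html infile2.html"
echo "$cmd"
```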

PostScript-Specific Options

The following options apply to PostScript output.

Forcing Grayscale Output

To force all output to be in grayscale, use the "--gray" option. This option is necessary for all black-and-white Level 1 PostScript printers.
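A sketch of grayscale output for a black-and-white Level 1 printer, combining "--gray" with the "-t ps1" type described in this chapter. "-f" is assumed to name the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumption: "-f" names the output file; file names are placeholders.
cmd="htmldoc -t ps1 --gray -f outfile.ps infile1.html infile2.html"
echo "$cmd"
```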

Using JPEG Compression

To use JPEG compression for large images, use the "--jpeg" option. The default JPEG quality is 90. A different quality can be requested by giving the option a quality value, where quality is the standard JPEG quality level from 1 to 100.

JPEG compression is not available on Level 1 PostScript printers.
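The exact form for passing a quality value is not shown above. The sketch below assumes it is written as "--jpeg=quality" and checks the 1 to 100 range before building the command; the value 75 is an arbitrary example, and "-f" is assumed to name the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: the quality is passed as "--jpeg=N"; "-f" names the output file.
quality=75   # arbitrary example; the text allows 1 to 100
if [ "$quality" -ge 1 ] && [ "$quality" -le 100 ]; then
  cmd="htmldoc --jpeg=${quality} -f outfile.ps infile1.html"
else
  cmd=""
  echo "quality must be between 1 and 100" >&2
fi
echo "$cmd"
```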

Forcing Level 1 Output

To force Level 1 PostScript output, use the "-t ps1" option. This option is necessary for all Level 1 PostScript printers.
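Because a ".ps" extension selects Level 2 output by default, Level 1 has to be requested explicitly. A minimal sketch, assuming "-f" names the output file:

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumption: "-f" names the output file; file names are placeholders.
cmd="htmldoc -t ps1 -f outfile.ps infile1.html infile2.html"
echo "$cmd"
```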

Requesting Double-Sided Output

The "--duplex" option specifies double-sided output. Note that this does not select duplexing on the printer but merely adjusts the formatting so that the left and right margins are swapped on the back side and chapters start on an odd-numbered page. You must still select duplexing in your printer driver or on the printer itself.
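A sketch of a double-sided layout request, assuming "-f" names the output file. As noted above, the printer still has to be told to duplex.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumption: "-f" names the output file; file names are placeholders.
cmd="htmldoc --duplex -f outfile.ps infile1.html infile2.html"
echo "$cmd"
```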

Setting the Page Size

The "--size" option specifies the output page size. The "WIDTH" and "HEIGHT" arguments can be in points (no units specified), inches, centimeters, or millimeters. The default page size is Universal (8.27x11in or 210x279mm), which is the minimum of the US and European standard sizes (Letter and A4, respectively).

Note that this does not select a media size on the printer but merely adjusts the formatting so that the text and images appear within the given page area. You must still select the appropriate media size in your printer driver or on the printer itself.
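To make the unit handling concrete, the sketch below expresses US Letter (8.5 by 11 inches) both in points and in inches, using the fact that one inch is 72 points. The "WIDTHxHEIGHT" form with a trailing unit follows the way sizes are written in the text; "-f" is assumed to name the output file.

```shell
# Dry run: print the commands; htmldoc is not invoked.
# Assumptions: sizes are written WIDTHxHEIGHT with an optional unit suffix;
# "-f" names the output file.
width_pt=$((85 * 72 / 10))   # 8.5 inches in points
height_pt=$((11 * 72))       # 11 inches in points
cmd_pt="htmldoc --size ${width_pt}x${height_pt} -f outfile.ps infile1.html"
cmd_in="htmldoc --size 8.5x11in -f outfile.ps infile1.html"
echo "$cmd_pt"
echo "$cmd_in"
```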

Setting the Page Margins

The "--left", "--right", "--top", and "--bottom" options control the page margins of the output. The defaults are 1 inch (25mm) for the left and 0.5 inches (12mm) for the right, top, and bottom margins.
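The sketch below spells out the default margins explicitly. The text does not show how margin values are written, so the unit suffix used here (mirroring the page size syntax) is an assumption, as is "-f" for the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: margins accept a value with a unit suffix such as "in";
# "-f" names the output file.
cmd="htmldoc --left 1in --right 0.5in --top 0.5in --bottom 0.5in -f outfile.ps infile1.html"
echo "$cmd"
```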

Setting the Default Font Typeface and Size

The default font size, spacing, and typefaces are controlled by the "--fontsize", "--fontspacing", "--bodyfont", and "--headingfont" options. The typefaces for "--bodyfont" and "--headingfont" can be "courier", "times", or "helvetica".
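A hedged sketch of the font options follows. The typeface names come from the text; the numeric values (a size of 11 and a spacing of 1.2) are arbitrary examples whose units the text does not specify, and "-f" is assumed to name the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: the size and spacing values are arbitrary examples;
# "-f" names the output file.
cmd="htmldoc --fontsize 11 --fontspacing 1.2 --bodyfont times --headingfont helvetica -f outfile.ps infile1.html"
echo "$cmd"
```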

Customizing the Page Headers and Footers

The "--header" and "--footer" options allow you to customize the headers and footers used for the document body. Each option requires a three-character string that specifies the left, middle, and right fields:
Char  Description
.     A period indicates that the field should be blank.
t     A "t" indicates that the field should contain the document title.
h     An "h" indicates that the field should contain the current heading.
l     A lowercase L indicates that the field should contain the logo image.
1     The number 1 indicates that the field should contain the current page number in decimal format (1, 2, 3, ...).
i     A lowercase I indicates that the field should contain the current page number in lowercase roman numerals (i, ii, iii, ...).
I     An uppercase I indicates that the field should contain the current page number in uppercase roman numerals (I, II, III, ...).

The "--tocheader" and "--tocfooter" options control the header and footer on table-of-contents pages.
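The three-character strings can be read field by field, as in this sketch. The particular strings are examples chosen for illustration, not documented defaults, and "-f" is assumed to name the output file. The loop only checks that each string has exactly three characters.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: the strings below are illustrative; "-f" names the output file.
header=".t."      # blank left, document title in the middle, blank right
footer="h.1"      # current heading left, blank middle, decimal page number right
tocfooter="..i"   # lowercase roman page number on the right of contents pages
for field in "$header" "$footer" "$tocfooter"; do
  [ "${#field}" -eq 3 ] || { echo "need exactly three characters: $field" >&2; exit 1; }
done
cmd="htmldoc --header $header --footer $footer --tocfooter $tocfooter -f outfile.pdf infile1.html"
echo "$cmd"
```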

Setting the Header and Footer Font

The "--headfootsize" and "--headfootfont" options set the size and typeface of the font used for the page headers and footers.
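The accepted values are not listed above. The sketch below assumes that the typeface names match those given for the body and heading fonts and that the size is a plain number; both are assumptions, as is "-f" for the output file.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumptions: "helvetica" is accepted as a header/footer typeface, the size
# is a plain number, and "-f" names the output file.
cmd="htmldoc --headfootsize 11 --headfootfont helvetica -f outfile.ps infile1.html"
echo "$cmd"
```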

PDF-Specific Options

The following options apply to PDF output.

Forcing Grayscale Output

To force all output to be in grayscale, use the "--gray" option.

Using JPEG Compression

To use JPEG compression for large images, use the "--jpeg" option. The default JPEG quality is 90. A different quality can be requested by giving the option a quality value, where quality is the standard JPEG quality level from 1 to 100.

Requesting Double-Sided Output

The "--duplex" option specifies double-sided output. Note that this does not select duplexing on the printer but merely adjusts the formatting so that the left and right margins are swapped on the back side and chapters start on an odd-numbered page. You must still select duplexing in your printer driver or on the printer itself.

Setting the Page Size

The "--size" option specifies the output page size. The "WIDTH" and "HEIGHT" arguments can be in points (no units specified), inches, centimeters, or millimeters. The default page size is Universal (8.27x11in or 210x279mm), which is the minimum of the US and European standard sizes (Letter and A4, respectively).

Note that this does not select a media size on the printer but merely adjusts the formatting so that the text and images appear within the given page area. You must still select the appropriate media size in your printer driver or on the printer itself.

Setting the Page Margins

The "--left", "--right", "--top", and "--bottom" options control the page margins of the output. The defaults are 1 inch (25mm) for the left and 0.5 inches (12mm) for the right, top, and bottom margins.

Setting the Default Font Typeface and Size

The default font size, spacing, and typefaces are controlled by the "--fontsize", "--fontspacing", "--bodyfont", and "--headingfont" options. The typefaces for "--bodyfont" and "--headingfont" can be "courier", "times", or "helvetica".

Customizing the Page Headers and Footers

The "--header" and "--footer" options allow you to customize the headers and footers used for the document body. Each option requires a three-character string that specifies the left, middle, and right fields:
Char  Description
.     A period indicates that the field should be blank.
t     A "t" indicates that the field should contain the document title.
h     An "h" indicates that the field should contain the current heading.
l     A lowercase L indicates that the field should contain the logo image.
1     The number 1 indicates that the field should contain the current page number in decimal format (1, 2, 3, ...).
i     A lowercase I indicates that the field should contain the current page number in lowercase roman numerals (i, ii, iii, ...).
I     An uppercase I indicates that the field should contain the current page number in uppercase roman numerals (I, II, III, ...).

The "--tocheader" and "--tocfooter" options control the header and footer on table-of-contents pages.

Setting the Header and Footer Font

The "--headfootsize" and "--headfootfont" options set the size and typeface of the font used for the page headers and footers.

Disabling Document Compression

Normally each page in a PDF file is compressed using the Flate method (GZIP). Versions of Acrobat Reader prior to 3.0 do not understand Flate compression. To disable compression use the "--no-compression" option.
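For readers targeting older viewers, an uncompressed PDF might be requested as sketched here. "-f" is assumed to name the output file, and the file names are placeholders.

```shell
# Dry run: print the command; htmldoc is not invoked.
# Assumption: "-f" names the output file; file names are placeholders.
cmd="htmldoc --no-compression -f outfile.pdf infile1.html infile2.html"
echo "$cmd"
```

Uncompressed output will generally be larger, so this is best reserved for cases where compatibility with older readers matters.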