TclHTML

Contents

1  Introduction
1.1  Motivation
1.2  Document history
1.3  Version numbers
2  How it works
3  Tcl commands for HTML elements
3.1  Types of HTML element
3.2  Paragraph elements
3.3  In-line elements
3.4  Structure elements
3.5  Empty elements
3.6  Attributes
4  Higher-level commands
4.1  Metadata
4.2  Text commands
5  TclHTML framework
5.1  Do your own tags: push, emit and pop
5.2  Information about the TclHTML processor
5.3  TclHTML Options
5.4  XHTML and XML support
5.5  html, head, and body
6  Index

Using Tcl to generate HTML and XHTML files

Revision 5.1.4 (P. Damian Cugley, 2000-03-19), for TclHTML version 5.1

1.  Introduction

1.1.  Motivation

This handbook describes a system for generating HTML files using scripts written in the tool command language Tcl. I found this approach useful in practice; the examples later in this section show why.

I chose Tcl (John Ousterhout's Tool Command Language) as the basis for my system because it is the scripting language I'm most familiar with, and because I was inspired by Don Libes's approach to using Tcl to deliver CGI pages.

On my site, for example, the "navigation bar" aspects are generated using a Tcl script which recognizes where the current page falls within the site's structure, and adjusts the HTML code produced accordingly. What's more, when I got tired of its appearance, I wrote new procs which generated completely new layouts, thus changing all the pages on the site by altering one file.

As another little example, the revision number for this document is extracted from the RCS id embedded in the script generating this HTML page. At the start of the file some Tcl code (described later) extracts the file version and last-changed date into the Tcl variables fileVersion and fileDate. Later in the script there are these two lines:

h1 "Generating HTML files with TclHTML"
p "Revision $fileVersion (P. Damian Cugley, $fileDate)"

The first generates the heading, and the next uses the Tcl variables to supply the version information in the next paragraph.

1.2.  Document history

This is the first version of the TclHTML manual.

1.3.  Version numbers

The version of TclHTML is 5.1. The major version is 5 because I've had several goes at HTML generation in the past (none of which I have published in their own right). The plan is that this will be incremented in future when I make incompatible changes.

The minor number will be increased for future versions that are backward-compatible (apart from bugs). My plan is to follow the Linux convention of using odd minor numbers for the version I'm working on and even numbers for "stable" versions.

Finally, I am using CVS to insert version numbers into the script and documentation files, with CVS's revision number serving as the minor part of each file's version. For example, this file's CVS id is 1.4, so the version number we report is 5.1.4. (The script originally used the RCS id, and added an extra .1 when the file was checked out for editing. This does not apply now that I use CVS to store revisions.)

set id {$Id: tclhtml.th,v 1.4 2000/03/19 23:23:36 pdc Exp $}
set fileVersion 5.[lindex $id 2]
regsub -all / [lindex $id 3] - fileDate
if {[llength $id] > 8} {
    append fileVersion .1
    set fileDate [clock format [file mtime [info script]] -format %Y-%m-%d]
}

I use the regsub command to convert RCS dates to ISO 8601 format (2000-03-19 rather than 2000/03/19).
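
The same extraction can be wrapped in a proc so that it can be tried on a literal id string. This is only a sketch: the proc name parseRcsId is hypothetical and not part of TclHTML, and the branch for locked files (which needs the script's modification time) is left out.

```tcl
# Hypothetical helper, not part of TclHTML: split an RCS/CVS id string
# into a TclHTML-style file version and an ISO 8601 date.
proc parseRcsId {id} {
    # Word 2 of the id is the revision (e.g. 1.4); word 3 is the date.
    set fileVersion 5.[lindex $id 2]
    regsub -all / [lindex $id 3] - fileDate
    return [list $fileVersion $fileDate]
}

set id {$Id: tclhtml.th,v 1.4 2000/03/19 23:23:36 pdc Exp $}
puts [parseRcsId $id]   ;# prints: 5.1.4 2000-03-19
```

Returning both values as a list keeps the sketch free of global variables; the script above sets fileVersion and fileDate directly instead.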

2.  How it works

I use the file-name suffix .th for TclHTML scripts. The TclHTML compiler thc reads in some extra Tcl procs and then executes the .th file as a Tcl script, which is expected to generate the HTML file in the current directory. That is, it follows the usual conventions for compilers. In practice you probably will want to create a makefile that takes care of invoking this "TH compiler" automatically, and which has an install target for copying the files to your web server.

Here's a small sample TclHTML script:

# hello.th		-*-tcl-*-

beginDocument {
    title "Hello, World!"
}
h1 "Hello, World!"
p "Gxis revido!"
endDocument

This produces HTML code like the following:

<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN'
'DTD/xhtml1-transitional.dtd'>
<html xmlns="http://www.w3.org/1999/xhtml">
  <!-- Generated from hello.th on 2000-03-19 23:26 GMT -->
  <!-- with command: cd /home/pdc/devel/tclhtml/sourceforge;
  /home/pdc/bin/thc -s . hello.th -->
  <!-- htmlProcs.tcl version 5.1.4 -->
  <head>
    <title>Hello, World!</title>
    <meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
  </head>
  <body>
    <h1>Hello, World!</h1>
    <p>Gxis revido!</p>
  </body>
</html>

There aren't any hard-and-fast rules for the contents of .th files -- they can be arbitrary Tcl scripts. That said, the typical document file should go like this:

# name.th -- 					-*-tcl-*-
#  Generator for short description of the document

?version control information?
?source commands for any extra library files?

beginDocument ?-file name.html? ?attr...? {
    title "title"
    ?other metadata?
}

contents of the document

endDocument

The comments at the top are for the benefit of human readers. The incantation -*-tcl-*- is for the sake of Emacs, which uses that as a hint to edit the file using Tcl-mode.

My suggestion is that only the briefest definitions go before the beginDocument command, such as any version information, and invocations of the source and package require commands.

Next is the beginDocument command which tells us the name of the output file, its metadata, and the attributes (if any) of the body element. The signature of the beginDocument command is as follows:

beginDocument ?-file outFileName? ?--? ?attr...? script

The outFileName argument, if supplied, is the file name to write the HTML code to. Otherwise thc derives this from the input file name by replacing the .th suffix with .html. This option can be used to generate several HTML files in one script.
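
The default naming rule can be sketched with Tcl's built-in file rootname command. The proc name defaultOutFileName is hypothetical; thc's own code may do this differently (for instance, this sketch swaps any extension, not just .th).

```tcl
# Hypothetical helper, not thc's actual code: derive the default output
# file name by swapping the input file's extension for .html.
proc defaultOutFileName {inFileName} {
    return [file rootname $inFileName].html
}

puts [defaultOutFileName hello.th]          ;# prints: hello.html
puts [defaultOutFileName docs/tclhtml.th]   ;# prints: docs/tclhtml.html
```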

The attr arguments are attributes for the body element that will enclose this page's content, in the format described below. For example, bgcolor=#FFFFCC.

The script parameter is a Tcl script containing commands that specify the metadata for this document, such as title, link and meta. This is used to generate the head element of this document. Here's an example of beginDocument in action:

beginDocument bgcolor=#EEEEFF text=#000000 link=#990000 vlink=#330000 {
    title "Generating HTML files with TclHTML"
}

(I have to admit that the mixture of body attributes with the head content is inconsistent with the other block-element commands. I did it this way because there are no interesting attributes for head elements, and I did not want to have to have the entire body of the page enclosed in one long script.)

Finally, at the end of the document is the endDocument command, which has no parameters. This emits the close-tags for the open body and html elements and closes the output file stream.

3.  Tcl commands for HTML elements

3.1.  Types of HTML element

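The general approach we use is very similar to Libes's. We have Tcl procs corresponding to the HTML elements. These elements can be divided into three categories: paragraph elements, which are blocks containing text; in-line elements, which appear within that text; and structure elements, which are blocks containing other blocks.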

This list is slightly different from the official HTML recommendations. The HTML specification groups both sorts of block element together, for example, and allows character data to appear within structure elements, whereas I restrict them to containing other blocks. Here's a concrete example:

(a)
<body>
  <h1>Foo</h1>
  Bar
  <p>Baz
</body>

(b)
<body>
  <h1>Foo</h1>
  <p>Bar</p>
  <p>Baz</p>
</body>

I like to think of fragment (a) as equivalent to fragment (b). It might seem that the DTD for HTML could be tweaked to make this work, through the magic of SGML's minimization rules. In fact this is not the case -- the HTML 4 specification makes it explicit that the undecorated Bar is not a paragraph. On the other hand, in the absence of style sheets, all browsers display the two examples identically (so far as I know). You see the difference only when you use style sheets to change the appearance of p elements.

To avoid this sort of confusion, I try to always enclose text in p tags (like HTML fragment (b)), which means that in my documents, body elements contain only block elements, and no character data. (Sometimes getting HTML to work in various browsers requires that this rule be broken. The TclHTML system allows for this.) Thus my division of block elements into paragraphs and structures.

We use different styles of proc in each of the three cases.

3.2.  Paragraph elements

For paragraph elements we use functions taking arguments in this pattern:

tagName ?attr...? text

Here tagName is the HTML tag name, in lower case (such as em, p), the attrs are attributes for that element (like class=var, align=right), and the text is the content of the element. For example, consider this Tcl command:

p "Hello, world!"

Here the text Hello, world! will be written to the output file as the content of a paragraph (p) element, producing the following HTML fragment:

<p>Hello, world!</p>

In this case the paragraph fitted on one line; when there are newlines in the text string, a slightly different format is used. Here's an example (the Tcl command, followed by the HTML it produces):

Tcl:
p  "Now is the time for all good men
to come to the aid of the party."

HTML:
<p>
  Now is the time for all good men
  to come to the aid of the party.
</p>

This doesn't make any difference to how the HTML is ultimately displayed, but it does make things a little easier for people reading the HTML file.

The following Tcl commands generate paragraph-like block elements:

blockquote, dd, dt, h1, h2, h3, h4, h5, h6, li, p, pre, td, th, and title

The blockquote command has a corresponding structural version called blockquote*, for when you want to have a quotation containing paragraphs, lists, etc.

The title element goes within a head element, not in the body of the document.

The command pre is special because its contents are written verbatim -- that is, the indentation of the lines is not adjusted, and newlines and whitespace at the start and end are not trimmed. It also takes an optional flag -encode which causes the characters special to HTML (&, < and >) to be encoded as character entities (&amp;, &lt;, &gt;). This is useful when you want to pass a string that is to be reproduced literally. For example, suppose you want to include a source file verbatim in a document. This should do the trick:

set in [open creat.c r]
pre -encode -- width=80 [read $in]
close $in

Here's the full syntax of the pre command:

pre ?-encode? ?--? ?attr...? text
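
The encoding step that -encode is described as performing can be sketched with Tcl's string map command. The proc name encodeHtml is hypothetical and is not claimed to be how TclHTML implements the flag.

```tcl
# Hypothetical helper illustrating what -encode is described as doing.
proc encodeHtml {text} {
    # string map makes a single pass over the input, so the & in a
    # freshly inserted &lt; is not encoded a second time.
    return [string map [list & "&amp;" < "&lt;" > "&gt;"] $text]
}

puts [encodeHtml {if (a < b && b > c) return 0;}]
# prints: if (a &lt; b &amp;&amp; b &gt; c) return 0;
```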

3.3.  In-line elements

We use quotes rather than braces to surround the text argument passed to paragraph-like elements. This means it can contain Tcl commands enclosed in square brackets [...]. Obviously this allows any Tcl script to be used to generate content. In particular, we use this notation to represent in-line HTML elements. For each element, there is a Tcl command. This command returns the appropriate HTML fragment as its result, instead of emitting it directly to the output file. Here's an example (the Tcl command, the HTML code, and the resulting display):

Tcl:
p "Hello, [em world]!"

HTML:
<p>Hello, <em>world</em>!</p>

Result:
Hello, world!

The em command returns the HTML fragment for what it represents: <em>world</em>. Thus the argument passed to the p command is Hello, <em>world</em>!. The p command emits HTML code into the HTML file.

There are two ways in which text with multiple words can be represented as arguments to a Tcl command -- as one argument per word, or as a single argument. The former turns out to be more convenient in most cases, whereas the latter is occasionally required when you want to include attributes as well as content. In the end the compromise I came up with was to create two commands for each in-line element. The first command has this syntax:

tagName ?text...?

The text arguments are collected together with spaces between them and enclosed in matching tags, and the result returned as the value of the function. This is the version used in the example above. Here is an example with a multi-word argument:

Tcl:
p [em Hello, world!]

HTML:
<p><em>Hello, world!</em></p>
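
To make the "collect the words and return a fragment" behaviour concrete, here is a minimal stand-alone sketch of the simple form of an in-line command. It is a hypothetical re-implementation for illustration; the real TclHTML procs may differ.

```tcl
# Hypothetical re-implementation of the simple form of an in-line
# command; it returns the fragment rather than writing it anywhere.
proc em {args} {
    # Join the word arguments with single spaces and wrap them in tags.
    return "<em>[join $args]</em>"
}

puts "Hello, [em world]!"   ;# prints: Hello, <em>world</em>!
puts [em Hello, world!]     ;# prints: <em>Hello, world!</em>
```

Because the command returns a string, it can be nested inside the quoted text argument of any paragraph command.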

The second version has its name formed by adding an asterisk to the name of the HTML tag, and has this syntax:

tagName* ?attr...? text

Here's an example:

Tcl:
p [em* class=greeting "Hello, 
world!"]

HTML:
<p>
  <em class="greeting">Hello,
  world!</em>
</p>

The following Tcl commands produce in-line elements in this fashion:

b, big, code, dfn, em, font, i, s, small, span, strong, sub, sup, tt, var,
b*, big*, code*, dfn*, em*, font*, i*, s*, small*, span*, strong*, sub*, sup*, tt*, and var*

The element a is a slightly special case. The simple version has one required argument before the text arguments:

a uri text...

The first argument is the value to be used in the href attribute of this element. For example:

Tcl:
p "Jump to 
[a http://www.w3.org/ W3C]"

HTML:
<p>
  Jump to
  <a href="http://www.w3.org/">W3C</a>
</p>

This command is only useful for making source anchors. To make target anchors you must use the *-form of the command, a*:

Tcl:
h2 "[a* name=2 "Chapter 2."]
Frogs and Toads"

HTML:
<h2>
  <a name="2" id="2">Chapter 2.</a>
  Frogs and Toads
</h2>

To make things more interesting, the XHTML recommendation deprecates the attribute name in favour of the attribute id. In an attempt to be forwards and backwards compatible, TclHTML will add the attribute id in XHTML files, and replace name with id in XML files.

3.4.  Structure elements

Structure elements are those HTML elements which contain other block elements. This includes ul (which contains li elements), table, and body, for example.

The Tcl command we use for each of these elements has this syntax:

tagName ?attr...? script

As before, tagName is the same as the name of the tag, in lower case, and the attr parameters (if any) are HTML attributes to be included in the opening tag. The final argument, the script, is a Tcl script which is evaluated to produce the content of the element. Here's an example:

Tcl:
ul {
    li "First"
    li "Second"
}

HTML:
<ul>
  <li>First</li>
  <li>Second</li>
</ul>

Result:
  • First
  • Second

The Tcl fragment produces the HTML fragment shown after it. Here the ul command's script argument contains two li commands, which produce the li elements in the HTML. The thc program also takes care of producing systematic indentation, which is intended to make the resulting HTML code easier for humans to read.

Simple tables work the same way, with table and tr as structure commands and td as a paragraph command:

Tcl:
defaultAttrs tr align=right
table border {
    tr valign=baseline {
        th "[em x]"
        th "[em x][sup 2]"
    }
    tr {
        td 1; td 1
    }
    tr {
        td 2; td 4
    }
    tr {
        td 3; td 9
    }
    tr {
        td 4; td 16
    }
}

HTML:
<table border="border">
  <tr valign="baseline">
    <th><em>x</em></th>
    <th><em>x</em><sup>2</sup></th>
  </tr>
  <tr align="right">
    <td>1</td>
    <td>1</td>
  </tr>
  <tr align="right">
    <td>2</td>
    <td>4</td>
  </tr>
  <tr align="right">
    <td>3</td>
    <td>9</td>
  </tr>
  <tr align="right">
    <td>4</td>
    <td>16</td>
  </tr>
</table>

Result:
x    x²
1     1
2     4
3     9
4    16

The trick is what happens when you need to have table cells containing block elements -- such as when using tables to get a fancy page layout, or simply for tables containing paragraphs of text. It turns out that sometimes you need td to be a paragraph-like element, and sometimes you need it to be a structure. We get around this by defining a separate command td* which does structures. Here's an example:

Tcl:
defaultAttrs tr align=left
table border {
    tr {
        td* {
            ul {
                li One
                li Two
                li Three
            }
        }
        td* {
            ol {
                li Unu
                li Du
                li Tri
            }
        }
    }
}

HTML:
<table border="border">
  <tr align="left">
    <td>
      <ul>
        <li>One</li>
        <li>Two</li>
        <li>Three</li>
      </ul>
    </td>
    <td>
      <ol>
        <li>Unu</li>
        <li>Du</li>
        <li>Tri</li>
      </ol>
    </td>
  </tr>
</table>

Result (two cells, side by side):
  • One
  • Two
  • Three

  1. Unu
  2. Du
  3. Tri

As always with the *-variants on commands, the command without the * is the simpler version.

3.5.  Empty elements

Empty elements are those with no content, and hence no end tag. There are two sorts of empty element -- paragraph and in-line. They are implemented by commands like those for non-empty paragraph and in-line elements, except without the final parameter:

tagName ?attr...?

Empty paragraph elements are: link, meta, hr, and br*.

Empty in-line elements are: img and br.

The break element br is a bit of a tricky case, since sometimes it definitely belongs between paragraphs (often in forms like <br clear='all' />), and sometimes definitely within paragraphs (usually as a plain <br /> between lines of poetry, lines of an address, and so on). Hence the two commands: br* for the paragraph form and br for the in-line form.

You will notice that the syntax used for empty elements is a little peculiar: there is a space and a slash before the closing > character. This is a feature of the XHTML 1.0 recommendation of the W3C, and is designed to make the resulting file understandable to XML readers as well as HTML readers.

3.6.  Attributes

The Tcl commands we use to generate HTML elements all take optional arguments that form attributes in the emitted element. These arguments take one of three forms:

key=value
value
~key

The first two have their usual SGML/HTML meaning. The third form stands for the absence of an attribute, and is used to suppress a default attribute (described below).

You do not need to add quotation marks around the value part of the attribute, or to encode special characters like > and ". TclHTML will add quotes automatically. Here's an example:

Tcl:
table width=100% align=left {}

HTML:
<table width="100%" align="left">
</table>

TclHTML uses double quotes (") because this always works, even if the attribute value contains an apostrophe. (Quotation marks in the attribute value are not a problem because they get encoded as &quot;.)

The XHTML 1.0 recommendation mandates a peculiarly verbose syntax for those attributes we used to write without an equals sign. Instead of the old-fashioned <hr noshade> we use <hr noshade='noshade' />. This is tough on older HTML readers, which don't understand the long form of these attributes.
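
The quoting and expansion rules described in this section can be sketched as a small stand-alone proc. The name formatAttrs is hypothetical, and this is an illustration of the rules rather than the TclHTML implementation; it assumes XHTML-style output and simply skips ~key arguments.

```tcl
# Hypothetical helper: turn TclHTML-style attribute arguments into the
# text that would follow the tag name in a start tag.
proc formatAttrs {attrs} {
    set result ""
    foreach attr $attrs {
        # ~key means "no such attribute", so it emits nothing here.
        if {[string match ~* $attr]} { continue }
        set eq [string first = $attr]
        if {$eq < 0} {
            # A bare value such as noshade expands to noshade="noshade".
            set key $attr
            set value $attr
        } else {
            set key [string range $attr 0 [expr {$eq - 1}]]
            set value [string range $attr [expr {$eq + 1}] end]
        }
        # Encode characters that are special inside a quoted value.
        set value [string map [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;"] $value]
        append result " $key=\"$value\""
    }
    return $result
}

puts "<table[formatAttrs {width=100% align=left}]>"
# prints: <table width="100%" align="left">
puts "<hr[formatAttrs {size=1 noshade}] />"
# prints: <hr size="1" noshade="noshade" />
```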

TclHTML uses a special shorthand for those elements which have a required attribute whose value is a URI (a, img, link, and base, but not a*). In this case, the URI is the first argument of the Tcl command, and the href= (or src=) is unnecessary:

Tcl:
p [a install.html INSTALL]

HTML:
<p><a href="install.html">INSTALL</a></p>

For every tag it is possible to set default attributes, using the command defaultAttrs:

defaultAttrs tagName ?attr...?

If no attr arguments are included, then this command returns the existing list of default attributes for this tag. Otherwise these attributes are stored as the defaults for all elements generated with this tag name. Defaults are always overridden by attributes in the tag command line. For example:

Tcl:
defaultAttrs hr size=1 noshade
hr
hr size=2
hr ~noshade

HTML:
<hr size="1" noshade="noshade" />
<hr size="2" noshade="noshade" />
<hr size="1" />

The third example shows the use of ~ to suppress the inclusion of a default attribute altogether.
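
One way to picture how defaults, overrides and ~key interact is the following stand-alone sketch. The procs attrKey and mergeAttrs are hypothetical names, and the merge order is an assumption chosen to reproduce the hr example above; TclHTML's internal bookkeeping may differ.

```tcl
# Hypothetical helper: the key of an attribute argument, ignoring any
# leading ~ and any =value part.
proc attrKey {attr} {
    set attr [string trimleft $attr ~]
    set eq [string first = $attr]
    if {$eq < 0} { return $attr }
    return [string range $attr 0 [expr {$eq - 1}]]
}

# Hypothetical helper: combine a tag's default attributes with the
# attributes given on the command line.
proc mergeAttrs {defaults attrs} {
    array set explicit {}
    set order {}
    foreach attr $attrs {
        set key [attrKey $attr]
        set explicit($key) $attr
        lappend order $key
    }
    set merged {}
    # Keep each default unless an explicit attribute replaces or
    # suppresses it.
    foreach attr $defaults {
        set key [attrKey $attr]
        if {[info exists explicit($key)]} {
            if {![string match ~* $explicit($key)]} {
                lappend merged $explicit($key)
            }
            unset explicit($key)
        } else {
            lappend merged $attr
        }
    }
    # Append explicit attributes that had no default.
    foreach key $order {
        if {[info exists explicit($key)] && ![string match ~* $explicit($key)]} {
            lappend merged $explicit($key)
        }
    }
    return $merged
}

puts [mergeAttrs {size=1 noshade} {}]           ;# prints: size=1 noshade
puts [mergeAttrs {size=1 noshade} {size=2}]     ;# prints: size=2 noshade
puts [mergeAttrs {size=1 noshade} {~noshade}]   ;# prints: size=1
```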

Image tags (img) have automatically generated defaults for the attributes width and height. TclHTML examines the graphics file using the PBMPlus toolkit, and extracts the width and height of the image. This way you need only supply these attributes when you intend to resize the image.

p "[img tclhtml-80x40.gif alt=jam]
[img tclhtml-80x40.gif alt=jam width=160 height=80]
[img tclhtml-80x40.gif alt=jam ~width height=20]"

<p>
  <img src="tclhtml-80x40.gif" alt="jam" width="80" height="40" />
  <img src="tclhtml-80x40.gif" alt="jam" width="160" height="80" />
  <img src="tclhtml-80x40.gif" alt="jam" height="20" />
</p>

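[Rendered result: the image shown three times, at its natural 80 by 40 size, at 160 by 80, and scaled to a height of 20]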

The third example shows how, in order to not have a height or width attribute at all, you must use the ~attr convention.

4.  Higher-level commands

If all that TclHTML offered were the element-producing commands then it would not be much more than a different syntax for HTML files. In the course of building a site using TclHTML, you create Tcl commands for all the common elements, which expand into HTML code using the primitives described in §3. We have supplied a few simple shorthands in the TclHTML library:

4.1.  Metadata

stylesheet uri ?attr...?
    Names an external stylesheet for this document. Same as link href=uri rel=stylesheet type=text/css. Goes in the head element (or in the script argument of beginDocument).

keywords keyword...
    Supplies keywords for the sake of some search engines. This is equivalent to meta name=Keywords "content=keyword..."

description text...
    Supplies a short summary of the document, for the sake of some search engines. Equivalent to meta name=Description "content=text".

refresh seconds ?uri?
    This generates the meta element that causes this page to be reloaded automatically by clients that understand this "client pull" convention. The seconds argument is an integer giving the delay in seconds before the new page is loaded. The uri argument is the address of the page to replace it with. If it is omitted then the present page will be loaded again. This is all equivalent to the following command: meta http-equiv=Refresh "content=seconds; url=uri"
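
As an illustration of the refresh equivalence, the value of the content attribute could be assembled as follows. The proc name refreshContent is hypothetical, and the behaviour when uri is omitted (emitting only the delay) is an assumption, since the manual does not show that case.

```tcl
# Hypothetical helper: build the content value for a Refresh meta element.
# Assumption: with no uri, only the delay is emitted, which asks the
# client to reload the present page.
proc refreshContent {seconds {uri ""}} {
    if {$uri eq ""} {
        return $seconds
    }
    return "$seconds; url=$uri"
}

puts "content=[refreshContent 30 index.html]"   ;# prints: content=30; url=index.html
puts "content=[refreshContent 60]"              ;# prints: content=60
```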

4.2.  Text commands

q text...
    HTML 4 defines an in-line element q representing an in-line quotation. The user's browser is expected to supply the quotation marks. In practice, no browser that I know of supports this element, so as a medium-term work-around we have a Tcl command which inserts quotation marks itself:

    Tcl:
    p "Hello [q world]!"

    HTML:
    <p>Hello &quot;world&quot;!</p>

5.  TclHTML framework

Having touched on higher-level commands, we now plunge into the depths of the lower-level TclHTML primitives. You should not need to worry about the facilities described here for everyday use of TclHTML.

5.1.  Do your own tags: push, emit and pop

The structural tag commands (like ul and table) and paragraph tags (like p and td) are all implemented using primitive commands push, emit and pop. For example, the fragment

ul {
    li Foo
}

is more or less equivalent to this:

push ul
push li
emit Foo
pop
pop

Here are the details of these commands.

push tag ?attr...?
    Opens an HTML element with name tag and the specified attributes. "Open" means emitting the start tag and increasing the prevailing indentation. This command maintains a stack of open elements.

pop
    Closes the most recent element opened with push. This means emitting the close tag and decreasing the prevailing indentation. Returns the tag removed from the stack.

depth
    Returns the depth of the stack (that is, the number of open elements), as an integer.

isElementOpen tag
    Returns nonzero if and only if there is an open element named tag.

emit text
    Writes the text into the HTML file, adjusting the indentation of each line to match the prevailing indentation.

The general pattern for commands like tr, say, is first to check that a suitable containing element is open, using "if {[isElementOpen table]}..." and then use push tr $attrs to open a tr element. In the case of tr, the final arg is evaluated as a Tcl script, using the Tcl command uplevel; a command like td just emits its final argument.
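
The stack-and-indentation idea behind these primitives can be sketched in a few lines of stand-alone Tcl. This is not the TclHTML source: the variable names openTags and htmlText are hypothetical, attributes are omitted, and the output is collected in a string rather than written to a file.

```tcl
# A minimal sketch of push, emit and pop with an element stack.
set openTags {}   ;# stack of open element names
set htmlText ""   ;# accumulated output

proc emit {text} {
    global openTags htmlText
    # Indent each line by two spaces per open element.
    set indent [string repeat "  " [llength $openTags]]
    foreach line [split $text \n] {
        append htmlText $indent [string trim $line] \n
    }
}

proc push {tag} {
    global openTags
    emit "<$tag>"
    lappend openTags $tag
}

proc pop {} {
    global openTags
    set tag [lindex $openTags end]
    set openTags [lrange $openTags 0 end-1]
    emit "</$tag>"
    return $tag
}

push ul
push li
emit Foo
pop
pop
puts -nonewline $htmlText
```

Unlike the real li command, which writes <li>Foo</li> on one line, this sketch puts every tag on its own line; it is meant only to show how the stack drives the indentation.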

You can use these commands in TclHTML documents when you need to develop a paragraph piece by piece, say. Here's a trivial example:

Tcl:
push p
set max 12
emit "The first $max 
square numbers are:"
for {set i 1} {$i < $max} {incr i} {
    emit "[expr {$i * $i}],"
}
emit "and [expr {$max * $max}]."
pop

HTML:
<p>
  The first 12
  square numbers are:
  1,
  4,
  9,
  16,
  25,
  36,
  49,
  64,
  81,
  100,
  121,
  and 144.
</p>

Result:
The first 12 square numbers are: 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, and 144.

5.2.  Information about the TclHTML processor

Another TclHTML command, htmlInfo, gives access to information about the TclHTML system, in a fashion similar to the built-in Tcl command info. Its argument is one of the following keys:

id
    Returns the CVS id for the script defining the TclHTML procs. (Originally the RCS id, which is in the same format.) See also versionMajor and version. Value for this file: $Id: htmlProcs.tcl,v 1.4 2000/03/18 09:36:20 pdc Exp $

inFileName
    Returns the name of the .th file. Value for this file: tclhtml.th

outFileName
    Returns the name of the HTML file being written. Value for this file: tclhtml.html

version
    Returns the version number for the script containing the TclHTML definitions. Value for this file: 5.1.4

versionMajor
    Returns the major version number for the TclHTML package, which is 5. Value for this file: 5

5.3.  TclHTML Options

Some details of the way TclHTML operates are controlled by options set with the htmlOption command. The idea is similar to the Tk command option. The syntax for the command is as follows:

htmlOption key ?value?

If the optional argument value is supplied, then this command changes the value of the option named by key. Otherwise this command returns the current value of that option.
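
The get-or-set behaviour can be pictured with a small stand-alone sketch. The option names come from this manual, but storing them in a global array called htmlOptions is an assumption, and this sketch also returns the new value after setting it, which the manual does not specify.

```tcl
# Hypothetical sketch of a get/set option command.
array set htmlOptions {syntax xhtml subDir .}

proc htmlOption {key args} {
    global htmlOptions
    if {[llength $args] > 1} {
        error "usage: htmlOption key ?value?"
    }
    # With a value, store it; either way, return the current value.
    if {[llength $args] == 1} {
        set htmlOptions($key) [lindex $args 0]
    }
    return $htmlOptions($key)
}

puts [htmlOption syntax]   ;# prints: xhtml
htmlOption syntax html
puts [htmlOption syntax]   ;# prints: html
```

Taking the optional value through args, rather than a default of the empty string, lets an option be set to an empty value.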

Each entry below gives the option name, the corresponding thc flag (if any), a description, and the value of the option for this file.

out (no thc flag)
    The file channel identifier used by the emit command to write HTML text. By default this is set by the beginDocument command to the equivalent of "[open [htmlInfo outFileName] w]". Value for this file: file4

htmlRootDir (thc flag -r)
    The file name of the directory that corresponds to the root of your HTML document tree. Which directory you use as your "root" is up to you; it might be ~/public_html, corresponding to http://foo.com/~joepublic/. Value for this file: .

rootDir (no thc flag)
    The inverse of the subDir option: the root of the document tree, relative to the subDir option. If subDir is docs/manual, then this option will be ../.. (two levels up). Set automatically when subDir is set. Value for this file: .

subDir (thc flag -s)
    The directory this document is destined for, relative to the HTML document root. For example, if subDir is docs/manual and htmlRootDir corresponds to ~/public_html, then this document's URL would be ~/public_html/docs/manual/tclhtml.html. Used when searching for image files. Value for this file: .

syntax (no thc flag)
    Some details of the syntax of the HTML that is generated. See the discussion of XHTML and XML support. The permitted values are: html, html/4, xhtml, xhtml/1, and xml. Value for this file: xhtml
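
The relationship between subDir and rootDir can be sketched as follows. The proc name rootDirFor is hypothetical, and the sketch assumes subDir contains no .. components.

```tcl
# Hypothetical helper: compute the path back to the document root from
# a subdirectory given relative to that root.
proc rootDirFor {subDir} {
    set ups {}
    foreach part [split $subDir /] {
        # Ignore empty components and "." so that "." maps to ".".
        if {$part ne "" && $part ne "."} {
            lappend ups ..
        }
    }
    if {[llength $ups] == 0} {
        return .
    }
    return [join $ups /]
}

puts [rootDirFor docs/manual]   ;# prints: ../..
puts [rootDirFor .]             ;# prints: .
```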

The option htmlRootDir indicates where HTML files will be deposited when this document is installed; TclHTML does not copy the files there itself. You may want to use thmkmf to create a Makefile that takes care of copying the files for you.

5.4.  XHTML and XML support

The world of HTML and XML is (at the time of writing) in a state of transition from HTML, some of which is not XML-compatible, to XML, some of the syntax of which is not compatible with older HTML parsers.

XML is a more generic successor to HTML. It is simplified in the sense that some of what SGML calls minimization has been removed. For example, in HTML, border=1 and border='1' are equivalent. In XML, only the second of these is permitted. XML is more generic than HTML in that HTML has a fixed vocabulary of tags, whereas XML allows for any number of document-types to be defined, and even combined (using a feature called XML namespaces).

XHTML 1.0 is a recent (January 2000) reformulation of HTML 4.01 as an application of XML. This means that XHTML provides the same vocabulary of tags as HTML, using the XML syntax. Because the permitted XML syntax is a subset of HTML's, it follows that XHTML 1.0 documents should, by and large, be readable by HTML parsers (like your favourite web browser), while also being compatible with XML readers that don't understand the more messy HTML syntax. The XHTML recommendation thus forms a bridge between the confused HTML world of the 1990s and the futuristic world of XML.

One of the advantages of the TclHTML approach is that switching it to generating XHTML instead of HTML is pretty easy, and does not require changing any .th files at all. In the future, if it becomes necessary to tweak the syntax again, this should also be easy.

The syntax option tweaks some of the details of the format of the code produced by TclHTML. This may be necessary because there are some incompatibilities between XHTML and the older formulations of HTML. The chief offender is attributes written without the equals sign, like noshade discussed above. HTML 3.2 and HTML 4.0 parsers should have no trouble with these, but really old programs might. The only effect should be that they ignore the attribute, which is generally harmless but not as pretty. You can force a return to the old syntax by issuing the TclHTML command

htmlOption syntax html

before the beginDocument command. (The keyword html must appear in lower case.) There is another permitted value, xml. Its only effects at present are to make the document start with an XML declaration <?xml...?> and to change the treatment of name attributes.

                      html        html/4              xhtml, xhtml/1         xml
Empty elements        <hr>        <hr />              <hr />                 <hr/>
Minimized attributes  noshade     noshade='noshade'   noshade='noshade'      noshade='noshade'
name and id           name=xxx    name=xxx            name="xxx" id="xxx"    id="xxx"
XML declaration       --          --                  --                     <?xml version='1.0' encoding='UTF-8'?>

DOCTYPE
    <!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.0 Transitional//EN' 'http://www.w3.org/TR/REC-html40/loose.dtd'>
    <!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'DTD/xhtml1-transitional.dtd'>

5.5.  html, head, and body

We have already discussed the top-level beginDocument and endDocument commands. These commands take care of the HTML elements html, head and body for you. You can instead generate these elements "manually".

(to be continued...)

6.  Index

.th
a*
a
b*
b
base
beginDocument
big*
big
blockquote*
blockquote
body
br*
br
code*
code
dd
defaultAttrs
description
dfn*
dfn
dt
em*
em
emit
font*
font
h1
h2
h3
h4
h5
h6
height
hr
i*
i
id
img
keywords
li
link
meta
name
p
pop
pre
push
q
refresh
s*
s
small*
small
span*
span
strong*
strong
stylesheet
sub*
sub
sup*
sup
table
td
th
thc
tt*
tt
ul
var*
var
width
~
attribute
CVS
PBMPlus
quotation marks
SGML
URI
XHTML