It turns out that DocBook conversions have been written, though not
exactly as separate traditional programs. Instead, DocBook developers
have set up a joint open-source project at SourceForge with lots of
DocBook conversion tools. Rather than writing these tools in
traditional computer languages, they’ve built them on top of general
XML conversion engines (and older SGML conversion platforms, but we’re
sticking with XML). The conversion engines being used are called XSLT
processors.
An XSLT (Extensible Stylesheet Language Transformations) processor
takes an XML file as input, along with an XSLT style sheet (which is
another XML file, written in the XSLT vocabulary), and produces as
output whatever the style sheet specifies.
We want to perform these transformations, so we need at least the
XSLT style sheets and an XSLT processor. We’ll start by getting the
style sheets and putting them in a C:\Program Files\docbook directory:
Procedure 3.1. Initial docbook Directory Setup.
- Create a folder called docbook in C:\Program Files.
- Download the latest stable distribution of the DocBook XSL style
  sheets to a temporary directory. You’ll want the ZIP file version
  for Windows machines. The tar.gz version has the exact same
  contents, but would be harder to use on a typical Windows PC.
- Extract the files in the ZIP file you just downloaded to your
  C:\Program Files\docbook folder. This will create a folder named
  something like C:\Program Files\docbook\docbook-xsl-1.64.1 (the
  latest version when I wrote this note was 1.64.1).
- Rename the newly created folder to xsl. That way we can always just
  refer to the C:\Program Files\docbook\xsl folder without worrying
  about exactly which version we have.
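If you would rather script the extract-and-rename steps than do them by hand, here is a minimal Python sketch. It is only an illustration, not part of the DocBook distribution: the function name install_stylesheets is my own, and it assumes the ZIP file unpacks into a single versioned top-level folder, as the 1.64.1 distribution did.

```python
import zipfile
from pathlib import Path


def install_stylesheets(zip_path, docbook_dir):
    """Extract a DocBook XSL ZIP into docbook_dir and rename its folder to 'xsl'.

    Hypothetical helper: assumes the archive contains exactly one
    top-level folder (for example docbook-xsl-1.64.1).
    """
    docbook_dir = Path(docbook_dir)
    docbook_dir.mkdir(parents=True, exist_ok=True)   # create the docbook folder

    with zipfile.ZipFile(zip_path) as archive:
        # ZIP member names always use "/" as the separator.
        top_level = {name.split("/")[0] for name in archive.namelist()}
        if len(top_level) != 1:
            raise ValueError(f"expected one top-level folder, found {sorted(top_level)}")
        archive.extractall(docbook_dir)              # extract the distribution

    versioned = docbook_dir / top_level.pop()
    target = docbook_dir / "xsl"
    if target.exists():
        raise FileExistsError(f"{target} already exists; remove or rename it first")
    versioned.rename(target)                         # version-independent name
    return target


# Example call; the download location is an assumption, adjust it to
# wherever you saved the ZIP file:
# install_stylesheets(r"C:\Temp\docbook-xsl-1.64.1.zip", r"C:\Program Files\docbook")
```

The sketch refuses to overwrite an existing xsl folder, so upgrading to a newer release means removing or renaming the old folder first.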
You now have the style sheets available to use. If you look at
the folder you just created, you’ll see that it has a fairly complex
structure:
 Directory of C:\Program Files\docbook\xsl

12/19/2003  12:18 PM    <DIR>          .
12/19/2003  12:18 PM    <DIR>          ..
11/02/1999  09:18 AM               240 BUGS
12/19/2003  09:04 AM             7,871 ChangeLog
12/19/2003  12:18 PM    <DIR>          common
12/19/2003  12:18 PM    <DIR>          doc
12/19/2003  12:18 PM    <DIR>          docsrc
12/19/2003  12:18 PM    <DIR>          eclipse
12/19/2003  12:18 PM    <DIR>          extensions
12/19/2003  12:18 PM    <DIR>          fo
12/19/2003  12:18 PM    <DIR>          html
12/19/2003  12:18 PM    <DIR>          htmlhelp
12/19/2003  12:18 PM    <DIR>          images
12/19/2003  12:18 PM    <DIR>          javahelp
12/19/2003  12:18 PM    <DIR>          lib
12/19/2003  12:18 PM    <DIR>          manpages
12/19/2003  12:18 PM    <DIR>          params
12/19/2003  12:18 PM    <DIR>          profiling
10/23/2002  07:00 AM             3,803 README
12/19/2003  09:00 AM            44,884 RELEASE-NOTES.html
12/19/2003  08:50 AM            33,104 RELEASE-NOTES.xml
12/19/2003  12:18 PM    <DIR>          template
04/02/2001  08:44 AM                70 TODO
12/19/2003  12:18 PM    <DIR>          tools
12/17/2003  09:26 AM             2,900 VERSION
12/19/2003  09:06 AM            12,004 WhatsNew
12/19/2003  12:18 PM    <DIR>          xhtml
               8 File(s)        104,876 bytes
The stylesheet to convert DocBook to HTML is in the
html folder. Other subfolders have stylesheets
for other target formats, or to help in using those stylesheets.
In order to use a stylesheet, you’ll also need an XSLT processor.
There are a lot to choose from, and I’ve tried several. For example,
under Windows, you might want to use
MSXSL.
It’s described and available for download on Microsoft’s website (if
Microsoft moves the page, search for msxsl site:microsoft.com to find
it again).
Once you’ve downloaded and installed MSXSL,
you can convert your DocBook files to HTML with a simple command
(the command is all on one line; I’ve used a trailing \ to mark where
the line is continued, so don’t type the \ and just keep typing on
the same line):
msxsl myfile.docbook \
      "c:\Program Files\docbook\xsl\html\docbook.xsl"
The HTML file will be sent to standard output. You can either
redirect that output to a file, or use the
-o parameter to specify an
output file:
msxsl -o myfile.html myfile.docbook \
      "c:\Program Files\docbook\xsl\html\docbook.xsl"
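The same two invocations can also be driven from a script. The following Python sketch is an illustration only: the names msxsl_command and convert are mine, and it assumes msxsl is on your PATH and the style sheets live in the xsl folder set up earlier. Passing the arguments as a list lets Python handle the quoting of the space in "Program Files".

```python
import subprocess

# Assumed location, matching the folder layout set up earlier.
STYLESHEET = r"C:\Program Files\docbook\xsl\html\docbook.xsl"


def msxsl_command(source, output=None, stylesheet=STYLESHEET):
    """Build the msxsl argument list used in the commands above.

    With output=None the HTML goes to standard output; otherwise
    the -o parameter names the output file.
    """
    command = ["msxsl"]
    if output is not None:
        command += ["-o", output]
    command += [source, stylesheet]
    return command


def convert(source, output=None):
    """Run msxsl (assumed to be on PATH) and raise if it reports an error."""
    subprocess.run(msxsl_command(source, output), check=True)


# Example: write myfile.html next to myfile.docbook.
# convert("myfile.docbook", "myfile.html")
```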
These commands are a bit unwieldy, so we will create batch files
for them. Before doing that, however, we will install a different
XSLT processor for our use. It turns out that some output formats
require the XSLT processor to have capabilities that aren’t yet
standard, and MSXSL doesn’t have all
the ones we need. Next time we will install and use
Saxon, an XSLT processor that does
seem to meet all DocBook requirements.