From AsciiDoc/DocBook to PDF, ePub and mobi - part 1

First of all, it was obvious to me that I would use AsciiDoc to write books. The rest was only a matter of implementation. ;)

In this series of blog posts I will describe in detail how I used AsciiDoc, DocBook & co. to generate a book in various formats.

Disclaimer

I'm not an AsciiDoc guru, nor a DocBook guru, nor an xsltproc, fop or a2x guru. I just happened to find the right combination of these tools to generate what I want. I would be happy to hear your comments (there are surely some things to improve in my toolchain!), but I can't promise to answer your questions. I think you would be better off posting to the appropriate mailing lists.

And BTW, since I created my toolchain (around Q1 2012) things have changed. New projects have emerged (e.g. Asciidoctor) and it is possible that you can now do some things better or more easily because of this. Still, this is what I use (and it works!), so I think it is worth sharing.

Links

A few links to get you started:

The Task

The task was to create a nicely formatted book in HTML, PDF (different page sizes), ePub and mobi (Kindle) formats from plain text files written in AsciiDoc.

The Result

You can see the proof that it all went well on this website: http://practicalunittesting.com. :)

All versions (PDF, ePub, mobi etc.) were generated using a simple bash script. I will explain in detail how it was done.

HTML

The HTML version is for me, and it is all about convenience and fast feedback. I use it a lot while writing. Some errors are reported during conversion to HTML, which means I can correct them quickly. The HTML conversion is not as picky as DocBook processing, but you can eliminate a lot of issues just by converting to HTML. The other benefit is that while writing I can see my changes very quickly by looking at the HTML output.

The script to generate HTML is very simple:

asciidoc -a icons -a iconsdir=images/icons_html -n -d book -a tabsize=4 -a toc2 -a toclevels=3 -o $TARGET/junit_book.html $MAIN_FILE

The toc2 attribute generates a table of contents (links to specific sections) on the left side, which is very convenient.
The icons attribute tells asciidoc to replace the TIP, NOTE and WARNING labels with images, and the iconsdir attribute tells it where to look for them (instead of the default images/icons). The reason is that for the HTML version (and later also for ePub/Kindle) I use different (smaller) icons than for the PDFs.
The tabsize attribute sets the tab width, and hence the indentation, of code listings (I use a different value for ePub/Kindle).
As you probably guessed, MAIN_FILE is the source txt file written in AsciiDoc.
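
For illustration, here is a minimal sketch of how the command could be wrapped in a bash function. The variable names TARGET and MAIN_FILE come from the command above, but their default values (target, book.txt) and the function name build_html are only assumptions made for this example, not what my real script uses.

```shell
#!/usr/bin/env bash
# Sketch only: the default values and the function name are hypothetical.
TARGET="${TARGET:-target}"          # output directory (assumed default)
MAIN_FILE="${MAIN_FILE:-book.txt}"  # main AsciiDoc source file (assumed default)

build_html() {
    # Make sure the output directory exists before asciidoc writes to it.
    mkdir -p "$TARGET" || return 1

    # Same options as in the command above; variables are quoted so that
    # paths containing spaces do not break the call.
    asciidoc -a icons -a iconsdir=images/icons_html \
        -n -d book -a tabsize=4 -a toc2 -a toclevels=3 \
        -o "$TARGET/junit_book.html" "$MAIN_FILE"
}
```

Calling build_html after setting the two variables should then produce $TARGET/junit_book.html, assuming asciidoc is installed and on the PATH.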

One additional thing you need to do is copy all images to the $TARGET directory so they are reachable from the output HTML file.
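
The copying step could look roughly like the sketch below. It assumes that the images live in an images directory next to the source file (which matches the iconsdir=images/icons_html value used above) and that they should keep the same relative path under $TARGET. The function name copy_images and the default value of TARGET are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: directory layout, default value and function name are assumptions.
TARGET="${TARGET:-target}"   # output directory (assumed default)

copy_images() {
    # Recreate the images directory under the output directory ...
    mkdir -p "$TARGET/images" || return 1
    # ... and copy its whole content (including icons_html) recursively.
    cp -R images/. "$TARGET/images/"
}
```

Keeping the same relative path means that the image references in the generated HTML should resolve without any further changes.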

txt

Once the HTML is ready I can create a nice txt version out of it. I haven't used this txt output a lot, but it was handy at times. For example, in some cases I compared two versions of the book by diffing their txt representations.

html2text -nobs -style pretty $TARGET/junit_book.html > $TARGET/junit_book.txt
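
The diff-ing mentioned above can be sketched as a small helper. The function name compare_txt and the file names in the usage note are hypothetical; the only tool it relies on is the standard diff command.

```shell
#!/usr/bin/env bash
# Sketch only: compare_txt is an illustrative helper, not part of my script.
compare_txt() {
    old_txt="$1"   # txt output of the older version
    new_txt="$2"   # txt output of the newer version

    # diff exits with 0 when the files are identical and non-zero otherwise;
    # -u prints the changes in unified format.
    if diff -u "$old_txt" "$new_txt"; then
        echo "No differences in text output."
    else
        return 1
    fi
}
```

For example, compare_txt old/junit_book.txt $TARGET/junit_book.txt would print a unified diff of the two versions, assuming an older copy of the txt output was kept in a directory called old.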

To be Continued...

Ok, that would be it for now. In the next installment I will dive deep into some fancy DocBook stuff. Stay tuned!

This used to be my blog. I moved to http://tomek.kaczanowscy.pl a long time ago.