From AsciiDoc/DocBook to PDF, ePub and mobi - part 3

We already know how to generate HTML, TXT and PDF, but since everyone is using smartphones/tablets nowadays we need to go mobile. This time we will look into ePub and mobi (Kindle) formats.

This is the third post of the series devoted to the technical side of self-publishing using AsciiDoc and DocBook. Look here to find the rest of them.

ePub

ePub is simply HTML in disguise. I have used a2x to generate the ePub version of my books (it is also a one-liner but reformatted for improved readability):

a2x -k -f epub -a docinfo \
    --attribute tabsize=2 \
    --icons \
    -d book \
    --xsltproc-opts \
        "--param local.l10n.xml \
            document\(\'$SRCDIR/resources/custom-format.xml\'\) \
        --stringparam admon.graphics.path images/icons_epub/ \
        --stringparam callout.graphics.path images/icons_epub/callouts/ \
        --stringparam navig.graphics.path images/icons_epub/" \
    --fop $MAIN_FILE \
    -v -D $TARGET

As you can see, I use a smaller tabsize than before (for PDFs it was 4, now it is only 2). The icons and callouts are also different - I will talk about them later.

The cover is handled by the book-docinfo.xml file (watch out - the name must match the name of the file you convert, so in your case it might be my-book-about-my-grandma-docinfo.xml). It contains the following declaration of a mediaobject with the role attribute set to cover:

<mediaobject role="cover">
  <imageobject>
     <imagedata
        fileref="img/practical-unit-testing-with-testng-and-mockito.jpg"
        format="JPG"/>
  </imageobject>
  <textobject>
    <phrase>Practical Unit Testing with TestNG and Mockito</phrase>
  </textobject>
</mediaobject>

AFAIK this is ignored by PDF but taken into consideration when generating ePub.
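The naming rule for the docinfo file is easy to get wrong, so here is a minimal sketch that derives the expected name from the source file name. The function name docinfo_name and the .txt file names are made up for illustration; the only thing taken from the text above is the <name>-docinfo.xml pattern.

```shell
#!/bin/sh
# Derive the docinfo file name expected next to a given source file:
# strip the last extension and append "-docinfo.xml".
# docinfo_name is a hypothetical helper, not part of AsciiDoc.
docinfo_name() {
    printf '%s\n' "${1%.*}-docinfo.xml"
}

# Hypothetical source file names, for illustration only.
docinfo_name "book.txt"
docinfo_name "src/my-book-about-my-grandma.txt"
```

A check such as [ -f "$(docinfo_name "$MAIN_FILE")" ] before calling a2x would catch a misnamed file early.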

Ah, and one more thing. In general, tables do not look good when viewed on smartphones in ePub format. Avoid them.

Kindle / mobi

The next thing I want to share with you is the creation of mobi (Kindle) files from AsciiDoc. Here is what I have used.

NOTE: I haven't tried this: http://manual.calibre-ebook.com/cli/ebook-convert.html but maybe you should?

I'm not an expert on Kindle, so feel free to comment if I say something wrong.

I used Calibre (http://calibre-ebook.com/) to convert the ePub into a mobi file.
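For those who prefer the command line, the conversion could probably be scripted with Calibre's ebook-convert tool mentioned in the note above. This is only a sketch and not what I actually ran: it assumes ebook-convert is on the PATH, and the helper names (mobi_name, epub_to_mobi), the DRY_RUN switch and the file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: convert an ePub to mobi with Calibre's ebook-convert.
# Assumptions: ebook-convert is installed; file names are hypothetical.

# Derive the output name by swapping the .epub extension for .mobi.
mobi_name() {
    printf '%s\n' "${1%.epub}.mobi"
}

# Convert one file; with DRY_RUN=1 only print the command.
epub_to_mobi() {
    epub=$1
    mobi=$(mobi_name "$epub")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'ebook-convert %s %s\n' "$epub" "$mobi"
    else
        ebook-convert "$epub" "$mobi"
    fi
}

# Preview the command for a hypothetical output file.
DRY_RUN=1
epub_to_mobi "target/book.epub"
```

ebook-convert picks the output format from the extension of the second argument, so no format flag is needed.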

When I browsed the book on a Kindle for the first time I was surprised by the lack of table borders. It seems to be a missing feature - at least for this version of Kindle (see http://www.mobileread.com/forums/showthread.php?t=122612). One more reason to avoid tables.

The icons I use to mark some important parts of the text (note, tip, warning) were too big for Kindle. I scaled them from 96x96 to 64x64. Fortunately, the icons I use are available also as SVG so that was not a problem.
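If the icons are available as SVG, rendering them at a new size can be scripted. The sketch below only prints the commands so they can be reviewed first. It assumes rsvg-convert (from librsvg) as the renderer, which is my choice for the example and not necessarily the tool to use; icon_cmd and the icons_svg directory are hypothetical, and images/icons_epub is simply the admon.graphics.path value from the a2x calls.

```shell
#!/bin/sh
# Sketch: render SVG icons as 64x64 PNGs.
# Assumptions: rsvg-convert (librsvg) is installed; icons_svg is a
# hypothetical source directory.
SIZE=64

# Print the rsvg-convert command for one icon instead of running it.
icon_cmd() {
    svg=$1
    out_dir=$2
    name=$(basename "$svg" .svg)
    printf 'rsvg-convert -w %s -h %s -o %s/%s.png %s\n' \
        "$SIZE" "$SIZE" "$out_dir" "$name" "$svg"
}

icon_cmd "icons_svg/note.svg" "images/icons_epub"
```

Calling icon_cmd in a loop over icons_svg/*.svg and piping the result to sh would run the commands, as long as the file names contain no spaces.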

The a2x execution looks like this:

a2x -k -f epub \
    -a docinfo \
    --attribute tabsize=2 \
    --icons -d book \
    --stylesheet=resources/kindle.css \
    --xsltproc-opts \
        "--param local.l10n.xml \
            document\(\'$SRCDIR/resources/custom-format.xml\'\) \
        --stringparam admon.graphics.path images/icons_epub/ \
        --stringparam callout.graphics.path images/icons_epub/callouts/ \
        --stringparam navig.graphics.path images/icons_kindle/" \
    --fop $MAIN_FILE \
    -v -D $TARGET

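As you can see, it is almost exactly what I used for ePub. The differences are the added --stylesheet option and the navig.graphics.path parameter, which points to images/icons_kindle/ instead of images/icons_epub/.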

The main problem was with listings. The fonts were too big, and the left margin unnecessarily wide. Because of this, the listings were barely readable - all lines were broken, and it looked very bad. I improved the situation by tweaking the CSS stylesheet used to generate the ePub version (which was later transformed by Calibre to mobi). My CSS-fu is not very strong, but strong enough to implement such a simple change.

The kindle.css looks like this:

.programlisting, .screen {
  border: 1px solid silver;
  background: #f4f4f4;
  margin: 0.5em 0 0.5em 0;
  padding: 0.5em 1em;
}

There was a formatting issue with two-level bulleted (and numbered) lists. The culprit was (probably?) a2x (see here: https://groups.google.com/forum/#!topic/asciidoc/pqQRYOpNWS8), which is used to convert AsciiDoc to ePub. It inserts an unnecessary <p class="simpara"> for the second level of bulleted or numbered lists.
I fixed this by changing the HTML files with sed (http://www.gnu.org/software/sed/manual/sed.html). As you probably know, an ePub is simply a set of HTML files zipped together, so it was a matter of simple text processing. First I had to find the offending places in the generated HTML files, and then create the appropriate sed commands.

For example, this fragment of HTML contains an unnecessary <p> tag:

</li><li class="listitem"><p class="simpara">
Share your knowledge and experience (or lack thereof) with others. You can do it in various ways:
</p>

To render well, the tag has to be removed so that the output is:

</li><li class="listitem">
Share your knowledge and experience (or lack thereof) with others. You can do it in various ways:

I wrote the following sed commands to get rid of the unwanted <p> tags:

sed -i '/<p class="simpara">/{ N; s/<p class="simpara">\nShare your/\nShare your/ }' \
    $TARGET/book.epub.d/OEBPS/apd.html
sed -i '/various ways:/{ N; s/various ways:\n<\/p>/various ways:\n/ }' \
    $TARGET/book.epub.d/OEBPS/apd.html

Basically, this script removes the unnecessary <p class="simpara"> and its closing </p> tag.
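To see what the two expressions do without touching a real book, they can be tried on a throwaway file. The sketch below puts the three-line HTML fragment from above into a temporary file and runs the same two sed expressions on it, writing to a new file instead of using -i. It assumes GNU sed, which understands \n in the replacement part; the file names are made up.

```shell
#!/bin/sh
# Demo on a throwaway copy; assumes GNU sed (\n in the replacement).
workdir=$(mktemp -d)
sample="$workdir/sample.html"
fixed="$workdir/fixed.html"

# The three-line fragment with the unwanted <p> tag.
cat > "$sample" <<'EOF'
</li><li class="listitem"><p class="simpara">
Share your knowledge and experience (or lack thereof) with others. You can do it in various ways:
</p>
EOF

# Same two expressions as above, chained in a pipeline.
# N joins the matching line with the next one, so the substitution
# can match across the line break.
sed '/<p class="simpara">/{ N; s/<p class="simpara">\nShare your/\nShare your/ }' "$sample" |
    sed '/various ways:/{ N; s/various ways:\n<\/p>/various ways:\n/ }' > "$fixed"

cat "$fixed"
```

The result keeps the <li> line and the text line; the line that held </p> is left empty.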

Similarly for numbered lists:

sed -i '/<p class="simpara">/{ N; s/<p class="simpara">\nPlease tell me/\nPlease tell me/ }' \
    $TARGET/book.epub.d/OEBPS/ch10.html
sed -i '/myself):/{ N; s/myself):\n<\/p>/myself):\n/ }' \
    $TARGET/book.epub.d/OEBPS/ch10.html
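The sed commands above change the unpacked files in book.epub.d, not the .epub archive itself. One way to get the edited files back into an archive is to zip the directory again. The sketch below is an assumption about how this could be done, not a record of my exact steps: repack_epub is a hypothetical helper, it relies on Info-ZIP's zip, and it expects the usual mimetype, META-INF and OEBPS entries in the directory.

```shell
#!/bin/sh
# Sketch: rebuild an .epub from an unpacked directory such as the
# book.epub.d kept by a2x -k. Assumes Info-ZIP's zip is installed.
# Usage: repack_epub <unpacked-dir> <absolute-path-to-output.epub>
repack_epub() {
    src_dir=$1
    out_file=$2
    (
        cd "$src_dir" || exit 1
        rm -f "$out_file"
        # The mimetype entry must come first and be stored
        # uncompressed (-0), without extra file attributes (-X).
        zip -X0 "$out_file" mimetype &&
        # The remaining content can be compressed normally.
        zip -Xr9D "$out_file" META-INF OEBPS
    )
}

# Example call (paths are hypothetical):
# repack_epub "$TARGET/book.epub.d" "$PWD/book-fixed.epub"
```

The mimetype entry goes in first and uncompressed because ePub readers and validators expect to find it at the very start of the archive.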

When creating mobi with Calibre you also need to make sure that the cover is visible. It all boils down to using some options when converting. See this bug thread to find out the details: https://bugs.launchpad.net/calibre/+bug/1012119

P.S. There are other (better?) ways to create mobi files. Read https://github.com/akosmasoftware/eBook-Template and http://www.designtechstuff.com/2012/04/how-to-self-publish-kindle-book-s... for some inspiration.

Ok, that is it for now. The next (and probably the last) part will be about images.

 
 
 
This used to be my blog. I moved to http://tomek.kaczanowscy.pl long time ago.