Showing posts from February, 2013

Projects for programmatically creating EPUBs on the fly

EeePub: "EeePub is a Ruby ePub generator."
EBook-EPUB: "EBook::EPUB perl module for generating EPUB document"
PHPePub: "PHPePub allows a php script to generate ePub Electronic books on the fly, and send them to the user as downloads."
python-epub-builder: "This is a small python library for programmatically building EPUB books. It provides API for easily constructing and packaging EPUBs from text content."
Building ePub with PHP and Markdown

Translating EPUB3 metadata into JSON

From the EPUB3 spec ...  The following example shows how the complex title "The Great Cookbooks of the World: Mon premier guide de cuisson, un Mémoire. The New French Cuisine Masters, Volume Two. Special Anniversary Edition" could be classified. <metadata xmlns:dc="">     <dc:title id="t1" xml:lang="fr">Mon premier guide de cuisson, un Mémoire</dc:title>     <meta refines="#t1" property="title-type">main</meta>     <meta refines="#t1" property="display-seq">2</meta>       <dc:title id="t2">The Great Cookbooks of the World</dc:title>     <meta refines="#t2" property="title-type">collection</meta>     <meta refines="#t2" property="display-seq">1</meta>       <dc:title id="t3">The New French Cuisine Masters</dc:title&g
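The refinement pattern in the excerpt above (each `meta` element points back at a `dc:title` through `refines`) can be folded into a flat JSON structure. What follows is a minimal sketch, not the post's own method: the function name `titles_to_json`, the output keys `titles` and `text`, and the sample namespaces are assumptions added for illustration, and the sample uses only the two titles that appear in full above.

```python
import json
import xml.etree.ElementTree as ET

# Namespaces assumed for the sample; the excerpt above shows an empty xmlns:dc.
DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """
<metadata xmlns="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="t1" xml:lang="fr">Mon premier guide de cuisson, un Mémoire</dc:title>
  <meta refines="#t1" property="title-type">main</meta>
  <meta refines="#t1" property="display-seq">2</meta>
  <dc:title id="t2">The Great Cookbooks of the World</dc:title>
  <meta refines="#t2" property="title-type">collection</meta>
  <meta refines="#t2" property="display-seq">1</meta>
</metadata>
"""


def titles_to_json(xml_text):
    """Merge each title with the meta elements that refine it (hypothetical layout)."""
    root = ET.fromstring(xml_text)

    # Group refinements by the id they point at, e.g. "#t1" -> "t1".
    refinements = {}
    for meta in root.iter(f"{{{OPF}}}meta"):
        target = meta.get("refines", "").lstrip("#")
        if target:
            refinements.setdefault(target, {})[meta.get("property")] = meta.text

    titles = []
    for title in root.iter(f"{{{DC}}}title"):
        entry = {"text": title.text}
        if title.get(XML_LANG):
            entry["lang"] = title.get(XML_LANG)
        entry.update(refinements.get(title.get("id"), {}))
        titles.append(entry)

    # Order titles by their display-seq refinement.
    titles.sort(key=lambda t: int(t.get("display-seq", 0)))
    return {"metadata": {"titles": titles}}


if __name__ == "__main__":
    print(json.dumps(titles_to_json(SAMPLE), ensure_ascii=False, indent=2))
```

Keeping the refinements as plain keys on each title drops the `id`/`refines` indirection, which is one possible reading of what "translating" the metadata into JSON could mean.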

Going glocal with metadata: JSON and the Book of the Future

This article is an extension (and evolution) of earlier posts on this topic: see ' Why JSON as an article and book format? ' and ' A list of rules for a JSON book - a manifesto of sorts '. Going glocal with metadata Let's suppose we have our book (in JSON format) and it is single-authored. We can place the author within the top level metadata object, because, whichever language it is in, the author remains the same. {     "metadata": {         "author": "Jim Smith",         "imprint": "self"     },     "en": [],     "fr": [],     "de": [] } But it is likely there will be a translator specific (local) to each language version, and the English version in this instance has no translator at all. Further, the same imprint might publish the book, or languages might have their own imprints. Here there is a global imprint that is overridden by the local instance of the book in
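The global-then-local lookup described above can be sketched in a few lines. This is an illustration under assumptions, not the post's format: the excerpt does not show where local metadata lives, so the sketch supposes each language entry is an object with its own `metadata` key, and the translator and imprint values are invented placeholders.

```python
# Hypothetical book: global metadata at the top, optional local metadata per language.
BOOK = {
    "metadata": {"author": "Jim Smith", "imprint": "self"},
    "en": {"content": []},
    "fr": {
        "metadata": {"translator": "Placeholder Translator", "imprint": "Placeholder Imprint"},
        "content": [],
    },
}


def resolve_metadata(book, lang):
    """Start from the global metadata, then let the language's local values override it."""
    resolved = dict(book.get("metadata", {}))
    resolved.update(book.get(lang, {}).get("metadata", {}))
    return resolved


if __name__ == "__main__":
    print(resolve_metadata(BOOK, "en"))  # global values only, no translator
    print(resolve_metadata(BOOK, "fr"))  # local imprint replaces the global one
```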

Thinking ebooks outside the HTML/CSS box

Whenever a new technology is released it is, out of necessity, hyped to the maximum to ensure adoption, and if you're not taking this into account when thinking about ebooks, then you'd be forgiven for believing that HTML5, CSS3, and EPUB3 are the now and the future of ebooks. But I'm going to let you into a secret: there's plenty that these formats still omit, and for which they aren't perfect, and there'll be future iterations and updates that at times change things at a code level and will require the renewal of files to meet new standards. 'Drat!' I hear you say, 'I thought I'd finished messing around and could get on with new books not fixing old ones.' Well, if we took a step outside the HTML/CSS box, then perhaps we could stop all the fiddling around and get on with the work of publishing. (I'm talking here about treating the book as data and delivering via an API; see earlier posts on this subject ). How so? Imagine havi

Vive la difference: Why all digital books should look the same

I started my publishing career at Taylor & Francis when they were much smaller than they are today. Back then they used to have a system for books, particularly conference proceedings, that they didn't expect to sell very well. The book editors would supply the books as CRC (camera-ready copy) and try as well as they could to have contributors print (type?) the chapters in fonts and type sizes that most closely matched one another. All we did with the text, in the production department, was send the contributors glossy paper and then briefly check before passing on to print. As you can imagine these things were a hotchpotch of referencing, formatting and so on. They went against the grain, because our work was all about consistency, and it broke our hearts to see them go out. What has that got to do with now? The type of inconsistency we see today inside ereaders capable of displaying rich content mirrors these CRC books. I'm not talking about the internals of indivi

Why JSON as an article and book format?

When we create an article in JSON we're not simply stripping the closing tags from everything. We can also, for example, remove tags such as the <p> tags required in blockquotes, because the insertion of these can be implied. If the parser we use to process the JSON finds a single array within a blockquote it knows there are not multiple paragraphs: {"blockquote":["Single paragraph."]} If the parser introspects and finds an array of arrays, it knows that each is a paragraph: {"blockquote":[["First paragraph."], ["Second paragraph."]]} Either can have emphasised text, citations, footnotes, etc. { "blockquote": [ [ "First paragraph ", { "i": "with italic text" }, "." ], [ "Second paragraph with a footnote.", { "fn": "This is the footnote." } ] ] } The equivalent to the above in HTML5/EPUB3 would
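The implied-paragraph rule above (a single array is one paragraph, an array of arrays is several) can be sketched as a small renderer. This is only an illustration: the function names are hypothetical, and rendering `fn` as a `span` with a `footnote` class is an assumption, since the excerpt does not show the HTML equivalent.

```python
from html import escape

# Hypothetical mapping from inline JSON keys to HTML wrappers.
INLINE_TAGS = {
    "i": "<i>{}</i>",
    "fn": '<span class="footnote">{}</span>',
}


def render_inline(items):
    """Render a paragraph's mix of plain strings and single-key inline objects."""
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append(escape(item))
        else:
            for key, value in item.items():
                parts.append(INLINE_TAGS[key].format(escape(value)))
    return "".join(parts)


def render_blockquote(node):
    """Insert the implied <p> tags: one per inner array, or one for a flat array."""
    body = node["blockquote"]
    is_multi = bool(body) and all(isinstance(part, list) for part in body)
    paragraphs = body if is_multi else [body]
    inner = "".join(f"<p>{render_inline(p)}</p>" for p in paragraphs)
    return f"<blockquote>{inner}</blockquote>"


if __name__ == "__main__":
    print(render_blockquote({"blockquote": ["Single paragraph."]}))
    print(render_blockquote(
        {"blockquote": [["First paragraph."], ["Second paragraph."]]}
    ))
```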

if ( publishing == web ) { return future_book; }

'What are you looking so nervous about?' said the surgeon. She was cutting open my wife to deliver our third child by Caesarian section. When Jeff Jaffe declared to the world at TOC 2013 that web = publishing and publishing = web, he expected that the world of publishing would echo back his phrase only moments later. But the world of publishing doesn't move so quickly. It's not because we're slow to catch on. We understand the concept, but we are also aware of what such a statement leaves out, or has the potential to leave out if we shake hands before the terms and conditions of the deal are agreed. I remember working in publishing production departments in the 1990s and being frustrated by the computing departments taking for granted what we needed when it came to tech and being dismissive of the need for us to be able to read, for example, Mac-formatted disks and Zip disks received from designers, copy-editors and typesetters. This feeling of being ridden

Hashtags not Hyperlinks: The index of the future

The present (hyperlinking) At the moment there are two typical approaches to hyperlinking indexes inside ebooks. The first is to mark each "page" as it corresponds with the printed book (or PDF) and then link to these page markers from the index. The alternative is to link to specific paragraphs or even words. How the hashtag would work An index built with JSON, and controlled with JavaScript and RegEx, could be built in a number of ways. The first would be to have the indexer compile a list of words, then for these words to be searched for across the entire book when tapped, and for an in-place list of extracts to be returned. Spot the difference This first method is little different from the reader typing a word in the search box, the difference being that the words are known to exist in the book and so the reader doesn't have to guess at them. Refining the search The disadvantage of the first method is
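The first method above (search the whole book for an indexed word and return extracts in place) might look something like the sketch below. It is written in Python rather than the JavaScript the post mentions, and the chapter layout, the function name `find_extracts`, and the `context` parameter are all assumptions made for illustration.

```python
import re

# Hypothetical book: chapter short titles mapped to their text.
CHAPTERS = {
    "Chapter 1": "The printing press changed publishing for good.",
    "Chapter 2": "Ebooks arrived several centuries later.",
}


def find_extracts(chapters, term, context=20):
    """Return an extract around every whole-word match of an indexed term."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    extracts = []
    for title, text in chapters.items():
        for match in pattern.finditer(text):
            # Clamp the extract window to the bounds of the chapter text.
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            extracts.append({"chapter": title, "extract": text[start:end]})
    return extracts


if __name__ == "__main__":
    for hit in find_extracts(CHAPTERS, "press"):
        print(hit["chapter"], "->", hit["extract"])
```

Because the term list is compiled by the indexer, every tap is known to produce at least one extract, which is the difference from free-text search noted above.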

Go forth and API: Three Steps to Monetising the Book of the Future

1. Enable developers to access, via an API key, preview versions of every book you publish for their (web) apps These previews will include metadata and index, all chapter headings, a short amount of text, etc. The index will allow online discovery of further content, enough for the potential purchaser to evaluate the book. The API key will help track and identify the developer, so that you can record the level of traffic and revenue they generate. 2. Allow the user of a developer's app to purchase the entire text through signing into the publisher's website within the (web) app Doing so through the publisher's website will mean that not only will the user have access to the book within any app they use that is linked to the API, but the developer will receive a percentage of the purchase price. An alternative to this system would be one in which the developer can choose to be billed for user purchases; this would be useful in situations such as in-app purchase where t

A list of rules for a JSON book - a manifesto of sorts

This post breaks down at a code level a structure for JSON books and chapters. For a clear explanation of why a JSON book and article format would be a good thing see Why JSON as an article and book format? 1. The root values of a book written using JavaScript Object Notation (JSON) should be the language(s) {     "EN": {},     "FR": {},     "DE": {} } 2. Inside the languages will be the editions {   "EN": {       "1980": {},       "2002": {}   },   "FR": {       "2002": {}   },   "DE": {       "1970": {},       "2009": {}   } } 3. Inside each edition will be the short titles of the chapters, which will be an array – this array can hold objects {}, strings "", and arrays [] {     "EN": {         "1980": {             "Chapter 1": [],             "Chapter 2": []         },
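Rules 1 to 3 above (languages at the root, editions inside languages, chapter arrays inside editions) can be checked mechanically. The sketch below is an assumption-laden illustration: the helper names and the sample chapter content are hypothetical, and it covers only the three rules visible in this excerpt.

```python
# Hypothetical book following rules 1-3: language -> edition -> chapter array.
BOOK = {
    "EN": {
        "1980": {"Chapter 1": ["Opening paragraph."], "Chapter 2": []},
        "2002": {},
    },
    "FR": {"2002": {}},
}


def list_editions(book, lang):
    """Return the edition keys available for one language, sorted."""
    return sorted(book.get(lang, {}))


def validate_book(book):
    """Return a list of rule violations; an empty list means the book conforms."""
    problems = []
    for lang, editions in book.items():
        if not isinstance(editions, dict):
            problems.append(f"{lang}: editions must be an object")
            continue
        for edition, chapters in editions.items():
            if not isinstance(chapters, dict):
                problems.append(f"{lang}/{edition}: chapters must be an object")
                continue
            for title, content in chapters.items():
                # Rule 3: each chapter is an array of objects, strings, or arrays.
                if not isinstance(content, list):
                    problems.append(f"{lang}/{edition}/{title}: chapter must be an array")
    return problems


if __name__ == "__main__":
    print(list_editions(BOOK, "EN"))
    print(validate_book(BOOK))
```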