When is a page break not a page break? EPUB 3, page-list and pagebreak


Returning to ebook production after a break can be tricky: getting your head back around the concepts, and around the difficult-to-navigate IDPF documentation, can induce moments of confusion. At one such moment this week, while styling some prelims, I reached out on Twitter to clarify a point about page breaks. I think it may confuse others too, because the type names are misleading.

Explaining the confusion

It's standard practice to have a separate XHTML file for each chapter in an EPUB, and most apps break pages between these files, but on occasion you might want to force a page break inside a single XHTML file. For example, you might want a single file for the prelims but want the half-title page, title page, copyright page, etc. to sit on separate pages, as they would in a print book.

At this point it might seem logical to turn to epub:type="pagebreak", but this is not an instruction to break the page in the EPUB. Instead, it identifies where the pages of the print book begin within the electronic one, in the hope that reading apps will use these page numbers for the benefit of readers.

The pagebreak type does not actually insert a page break because an electronic page varies in length with screen and text size; breaking at every print page would create a lot of unnecessary white space and interrupt the flow of a "reflowable" EPUB. But the question that follows is: why not call the type page-number instead of pagebreak? I have no idea; maybe they just wanted to add some confusion.

Identifying page numbers within an EPUB

If you wish to use the pagebreak type in order to identify page numbers within the XHTML of your EPUB, use the following code (at the start of the numbered page):
<span epub:type="pagebreak" title="3" id="p3"></span>
In addition, you must then add a page-list to your EPUB Navigation Document (which is most commonly saved with the filename toc.xhtml):
"If an EPUB publication includes page break markers it also requires a page list — a special navigation component in the EPUB navigation document that provides links to all the page break locations."
You can see the practical use of a pagebreak in this IDPF sample file and likewise there is also a page-list sample.

The page-list is one of the optional types contained in the EPUB Navigation Document, alongside a list of tables, identified with epub:type="lot", and landmarks. Note: the only type that the navigation document must contain is a toc (table of contents) type.
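
To make the relationship concrete, here is a minimal sketch of a navigation document that holds both the required toc and an optional page-list. The file names (prelims.xhtml, chapter01.xhtml), the headings and page 4 are made up for illustration; only the p3 id matches the marker shown earlier.

```html
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
<head>
  <title>Navigation</title>
</head>
<body>
  <!-- Required: the table of contents -->
  <nav epub:type="toc" id="toc">
    <h1>Contents</h1>
    <ol>
      <!-- Hypothetical file names, for illustration only -->
      <li><a href="prelims.xhtml">Prelims</a></li>
      <li><a href="chapter01.xhtml">Chapter 1</a></li>
    </ol>
  </nav>

  <!-- Optional: one link per pagebreak marker in the content files -->
  <nav epub:type="page-list" id="page-list">
    <h1>Pages</h1>
    <ol>
      <!-- Each href ends in the id of a pagebreak marker, e.g. id="p3" -->
      <li><a href="prelims.xhtml#p3">3</a></li>
      <li><a href="prelims.xhtml#p4">4</a></li>
    </ol>
  </nav>
</body>
</html>
```

Each fragment identifier in the page-list has to match the id on a pagebreak marker in the target file, so the two are best generated or checked together.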

So how do I really insert a page break within an EPUB?

I'm glad you asked. Page breaks can be inserted into EPUBs with regular CSS like:
p.pagebreak {
    page-break-after: always;
}
The page-break-after property is among the CSS properties that the IDPF expects all EPUB readers to support, and the accompanying XHTML is equally simple:
<p class="pagebreak"></p>
Alternatively, you could add page-break-before or page-break-after to the CSS for a specific style that should always come at the beginning or end of a page, for example the book title.
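
As a rough sketch of that second approach, the prelims example from earlier could look something like this. The class name prelim and the placeholder text are my own inventions, and I have put the CSS in a style element only to keep the example in one file; in a real EPUB it would more likely sit in your stylesheet.

```html
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Prelims</title>
  <style type="text/css">
    /* Hypothetical class: every prelim section starts on a new page */
    section.prelim {
      page-break-before: always;
    }
  </style>
</head>
<body>
  <!-- Half-title page -->
  <section class="prelim">
    <h1>Book Title</h1>
  </section>

  <!-- Title page -->
  <section class="prelim">
    <h1>Book Title</h1>
    <p>Author Name</p>
  </section>

  <!-- Copyright page -->
  <section class="prelim">
    <p>Copyright notice goes here.</p>
  </section>
</body>
</html>
```

Because the break is attached to the section style itself, no empty placeholder elements are needed in the markup.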

While it would be tempting to simply add the pagebreak class to the print page markers (epub:type="pagebreak") wherever an electronic page break is actually required, you must consider the difference between block-level and inline HTML tags. A print page may begin in the middle of a paragraph (as may a page in the automatic flow of an EPUB), but you are unlikely to want to force a page break in the middle of a paragraph of HTML. You will most likely be placing forced page breaks before paragraphs and other block-level elements, so the electronic page break itself should really be a block-level element.
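
To illustrate the distinction, here is a fragment (not a complete file) showing one way the two cases might be marked up. The page numbers, ids and heading are invented, and it assumes the CSS selector is written as .pagebreak or div.pagebreak rather than p.pagebreak so that it matches the div. How reading apps treat a break on an empty element may vary, so test on the apps you care about.

```html
<!-- Case 1: the print page turns mid-paragraph.
     Inline marker only, no class, so no page break is forced. -->
<p>The last words of print page 2
<span epub:type="pagebreak" title="3" id="p3"></span>
run straight on into print page 3.</p>

<!-- Case 2: the print page turns between blocks, and a forced break is wanted.
     A block-level element carries both the print page marker and the
     (assumed) pagebreak class that applies page-break-after: always. -->
<div class="pagebreak" epub:type="pagebreak" title="15" id="p15"></div>
<h2>Chapter 2</h2>
```

The inline span records where the print page begins without disturbing the paragraph, while the div sits between blocks, where a forced break does no harm to the text.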

Closing remarks

Whether to use pagebreak and page-list at all is up to you, as is whether to mark every print page or only a few significant ones. The idea is to increase accessibility and overall utility. Having print page references available is particularly useful for academic books and other texts that will be cited, or read in group situations, across print and electronic formats.
