EPUBCheck – The official conformance checker for ePub publications

  • I worked for a publisher for about 10 years as a typesetter and ebook developer. There are a lot of things about the publishing industry that are antiquated, especially at non-technical publishing companies. Unfortunately, it's a low-margin business.

    Most authors are only familiar with Microsoft Word, so on the front end you often have to take a messily styled Word document and manually caress it into a structured document that can be used for ebooks and print.

    For print, a majority of non-technical publishers use Adobe InDesign and/or InCopy. Editors edit manuscripts in InCopy and typesetters style documents for print. PDFs are generally exported and sent to printers via FTP.

    For ebooks, every publisher seems to have their own bespoke system. You _can_ export books in epub format from InDesign, but getting a clean ebook that way is difficult, to say the least, since InDesign was primarily designed for print publications. Generally, you end up structuring books for the lowest common denominator among ebook platforms (epub, kindle, etc.) unless you are creating something like a children's book or a poetry book, where you might do something more custom.

    Many publishers use ebook distribution platforms where you upload epub, mobi, cover images via FTP. They use an XML standard called ONIX for distributing metadata that's unique to say the least...

  • Cutting and pasting an old 2019 comment https://news.ycombinator.com/item?id=19944627 :

    > There are similar problems with uploading to publishers in ePub format. The last time I was bashing my head against ebook publishing, about a couple of years ago, many (most? all?) of the sites were validating ePub uploads using an old version of the ePub suite which rejected some ebooks which were valid per the up-to-date validator. Which version they were using was ofc not documented, and you were lucky to even get to see an error message. And of course tech support was largely unhelpful. (Especially kobo.com's.) The people working on the ePub spec seemed to be largely unbothered by the fragmentation/noncompliance and hideous experience for those authoring and uploading in the format, too.

    > Which is a pity, because aside from this and some other bugs and pitfalls EPUB 2.0 has some attractive features and is nice to work with for anyone who doesn't mind bashing out a good old directory tree of HTML docs by hand.

    Maybe things are a lot better by now. Here's hoping!

  • Why is this linked now? There's no new release or milestone at this time.

    Citing my comment from when this was new about eight months ago:

    > Even more unfortunate is that this change has already spilled to derived standards such as EPUB3 which hence makes existing EPUB3 content using compound headings going back to 2011 invalid, and EPUB3 writers lacking a tool for actually verifying what readers can support (epubcheck was blindly updated without consideration for the installed base).

    See also the blog post [1] about W3C's most recent HTML spec. The lack of HTML backward compatibility, the wholesale import of all of CSS without profiles or paged-media requirements, the de-emphasis of long-standing EPUB mechanisms in favor of CSS and JS, and the general impression of a low-effort, merely editorial revision make EPUB's move to W3C questionable. Nobody seems to care anyway; people are sticking with EPUB 2 and 3.1 (which is also what Calibre recommends as the target format for conversion).

    [1]: https://sgmljs.net/blog/blog2303.html

  • One can also $(brew install epubcheck) if so inclined https://formulae.brew.sh/formula/epubcheck#default
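
Once installed, the checker can be driven from a script. The following is a minimal sketch, assuming the install puts an `epubcheck` executable on the PATH and that it takes the path of the EPUB as its argument. `build_command` and `run_epubcheck` are hypothetical helper names, and treating a non-zero return code as "problems found" is an assumption rather than documented behavior.

```python
import shutil
import subprocess


def build_command(epub_path, executable="epubcheck"):
    """Build the argument list for a plain validation run."""
    return [executable, epub_path]


def run_epubcheck(epub_path, executable="epubcheck"):
    """Run the checker and return (return_code, combined_output).

    Returns None when the executable cannot be found on the PATH,
    so callers can tell "not installed" apart from "validation failed".
    """
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        build_command(epub_path, executable),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    outcome = run_epubcheck("book.epub")  # "book.epub" is a placeholder path
    if outcome is None:
        print("epubcheck not found on PATH")
    else:
        code, output = outcome
        print(output)
        # Assumption: a non-zero code means the checker reported problems.
        print("looks clean" if code == 0 else "problems reported")
```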

  • Last commit was in 2023: https://github.com/w3c/epubcheck/commits/main/

    85 open bugs; 9 open PRs.

  • I'm not sure why, but when I download some epubs and try to send them to my Kindle, it fails. Only after using an online converter to convert them from epub to epub does it work.
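
An epub-to-epub conversion rebuilds the ZIP container, so one thing worth checking in a file that fails is the container itself. This is only a guess at a possible cause, not a diagnosis of the problem described above. The sketch below uses Python's standard `zipfile` module to test three basic container rules: the `mimetype` entry comes first, it is stored uncompressed, and it contains exactly `application/epub+zip`; it also checks that `META-INF/container.xml` exists. `check_epub_container` is a hypothetical helper name, and this covers far less than EPUBCheck does.

```python
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_epub_container(path):
    """Return a list of problems found in the ZIP container of an EPUB file."""
    if not zipfile.is_zipfile(path):
        return ["not a ZIP archive"]

    problems = []
    with zipfile.ZipFile(path) as zf:
        entries = zf.infolist()
        # The 'mimetype' entry has to be the very first one in the archive.
        if not entries or entries[0].filename != "mimetype":
            problems.append("'mimetype' is not the first entry in the archive")
        else:
            first = entries[0]
            # It must be stored as-is, without compression.
            if first.compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' is compressed; it must be stored")
            # Its content must match exactly, with no trailing newline.
            if zf.read(first) != EPUB_MIMETYPE:
                problems.append(
                    "'mimetype' does not contain exactly 'application/epub+zip'"
                )
        # container.xml points reading systems at the package document.
        if "META-INF/container.xml" not in zf.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```

An empty list means only that these container checks passed; content-level errors still need a full validator run.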

  • I tried to publish an epub to the German platform Tolino. I tried everything, including an external service agency, and tinkered with the file in Calibre, with no success: they didn't accept the epub.

    Printed PDF for decades at print shops, never had a problem.

    Why is this such a problem? Because of the HTML? The JS?

  • Genuinely thankful for this tool. I use it for both sides of my non-fiction book project: research (verifying converted or misbehaving epubs for use on Remarkable, iPad, Calibre and Kindle; I know, I know...) as well as typesetting and review.