epub formatter might need to resolve some named character references #2111

Open
@adamwight

If a markdown file in "extras" contains named entities like &nbsp;, then the output XHTML fails epubcheck with a fatal error:

FATAL(RSC-016): ./ymlr.epub/OEBPS/content/benchmark.xhtml(70,45): Fatal Error while parsing file: The entity "nbsp" was referenced, but not declared.

The W3C validator agrees that this XHTML is invalid, reporting:

Fatal Error: reference to undeclared general entity nbsp

I can't make out how all the different specifications intersect here, but it makes sense that EPUB reading systems might only support a limited subset of all named character references. XML with no entity modules has only a handful of named entities (&amp;, &lt;, &gt;, &quot; and &apos;), &nbsp; is defined in HTMLlat1, and at the other end of the scale, current recommendations are to pull in the htmlmathml-f entities.

Since EPUB can't reach out to external resources, it seems that to support named entities a fully compliant EPUB container would have to either inline this entity set (if this is allowed?) or declare only the entities that are actually used.
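
To make the failure and the "declare only what is needed" option concrete, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. It only checks XML well-formedness with a generic parser, not epubcheck's rules, so it does not answer whether EPUB permits an internal DTD subset. The sample documents and the helper name `parses` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A fragment that references nbsp without declaring it anywhere.
UNDECLARED = "<p>a&nbsp;b</p>"

# The same fragment, with nbsp declared in an internal DTD subset.
DECLARED = '<!DOCTYPE p [ <!ENTITY nbsp "&#160;"> ]>\n<p>a&nbsp;b</p>'


def parses(doc: str) -> bool:
    """Return True if a plain XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("undeclared:", parses(UNDECLARED))
    print("declared:  ", parses(DECLARED))
```

The undeclared case is rejected for the same underlying reason epubcheck reports RSC-016: an XML parser knows nothing about HTML's entity names unless a DTD declares them.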

A simpler alternative to shipping named entities would be to resolve all named character references into their numeric form, i.e. &nbsp; -> &#160;.
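
A rough sketch of that conversion in Python, assuming the formatter can post-process the generated XHTML as a string. The function name `named_to_numeric` is hypothetical and not part of any existing formatter code. The lookup table is the standard library's html.entities.html5, and the five entities predefined by XML are left untouched.

```python
import re
from html.entities import html5

# The five entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &nbsp; but not numeric ones like &#160;.
_NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(xhtml: str) -> str:
    """Rewrite HTML named character references as decimal numeric ones."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        expansion = html5.get(name + ";")
        if expansion is None:
            # Unknown name: leave it alone so the validator still flags it.
            return match.group(0)
        # A few HTML entities expand to more than one code point.
        return "".join(f"&#{ord(ch)};" for ch in expansion)

    return _NAMED_REF.sub(replace, xhtml)


if __name__ == "__main__":
    print(named_to_numeric("a&nbsp;b &amp; c"))
```

This is a plain text substitution, so it would also rewrite matches inside comments or CDATA sections. Doing the replacement in the markdown-to-XHTML step, before serialization, would avoid that.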
