Text reference format #20

@qnga

Context: I'm implementing a read aloud navigator for Android that speaks the publication content independently of any visual rendition. Besides separation of concerns, this enables background playback without any view. As HTML parsing is not trivial and possibly time-consuming, I expect the guided navigation documents to be the only source of content available to the component, so they should contain all the data it needs.

We are used to a variety of location formats in the Readium toolkits, each used in different contexts, and some are still emerging within the scope of the annotation specification. In the case of read aloud (specifically TTS), we currently, and quite successfully, use a combination of a CSS selector and textual context. The processing algorithm is as follows:

  • locate the first DOM element matching the CSS selector
  • within the scope of this DOM element, identify the text chunk matching the textual context
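
The two steps above can be sketched as follows. This is a minimal illustration rather than the Readium implementation: it assumes well-formed XHTML, handles only `#id` selectors (a real navigator would use a full CSS selector engine), and assumes the textual context has a before/highlight/after shape. The names `find_by_id` and `locate` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Sample content: "The cat sat" appears in both paragraphs.
XHTML = """<html><body>
<p id="p1">The cat sat. The dog ran.</p>
<p id="p2">The cat slept. The cat sat on the mat.</p>
</body></html>"""


def find_by_id(root: ET.Element, selector: str) -> Optional[ET.Element]:
    """Step 1: first element matching the selector (only '#id' is handled here)."""
    if not selector.startswith("#"):
        raise ValueError("only '#id' selectors are handled in this sketch")
    wanted = selector[1:]
    for element in root.iter():  # iter() walks the tree in document order
        if element.get("id") == wanted:
            return element
    return None


def locate(root: ET.Element, selector: str,
           before: str, highlight: str, after: str) -> Optional[Tuple[int, int]]:
    """Step 2: offsets of `highlight` within the text of the matched element."""
    element = find_by_id(root, selector)
    if element is None:
        return None
    text = "".join(element.itertext())
    # The surrounding context disambiguates repeated occurrences of `highlight`.
    start = text.find(before + highlight + after)
    if start < 0:
        return None
    start += len(before)
    return start, start + len(highlight)


root = ET.fromstring(XHTML)
span = locate(root, "#p2", before="The cat slept. ", highlight="The cat sat", after=" on")
```

Because the text search is scoped to a single element, the occurrence of "The cat sat" in `p1` is never considered when the selector points at `p2`.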

Like any navigator, the read aloud navigator has to expose locations to enable features such as synchronization with, and highlighting within, a visual rendition, bookmarking, etc.

The guided navigation spec is far more restrictive regarding the location format than what we're used to. A node rendered with TTS contains two kinds of data: a text object, containing plain text or SSML, and a textref URL. Currently, a URL can carry two kinds of location data for text publications: element fragments (pointing to HTML elements with IDs) and text fragments. According to the URL Fragment Text Directive spec, the latter take precedence over the former, so this is not the same algorithm as the one we're using in Readium, and the two are not compatible. Besides, general CSS selectors are not supported, nor are DOM ranges or any other format.
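
To make the two kinds of fragment concrete, here is a small sketch that splits a textref URL into its element fragment and its text directives, following the `#element:~:text=[prefix-,]start[,end][,-suffix]` syntax of the text directive spec. The example URL and the names `TextRef`, `TextDirective` and `parse_textref` are hypothetical, and the sketch only parses the URL; it does not resolve anything against a document.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import unquote, urlsplit


@dataclass
class TextDirective:
    start: str
    end: Optional[str] = None
    prefix: Optional[str] = None
    suffix: Optional[str] = None


@dataclass
class TextRef:
    resource: str
    element_id: Optional[str]
    text_directives: List[TextDirective]


def parse_text_directive(value: str) -> TextDirective:
    """Parse the value of one 'text=' directive."""
    parts = value.split(",")
    prefix = suffix = None
    if parts[0].endswith("-"):  # 'prefix-' comes first
        prefix = unquote(parts.pop(0)[:-1])
    if parts and parts[-1].startswith("-"):  # '-suffix' comes last
        suffix = unquote(parts.pop()[1:])
    if not 1 <= len(parts) <= 2:
        raise ValueError(f"malformed text directive: {value!r}")
    start = unquote(parts[0])
    end = unquote(parts[1]) if len(parts) == 2 else None
    return TextDirective(start, end, prefix, suffix)


def parse_textref(url: str) -> TextRef:
    """Split a textref into resource path, element fragment and text directives."""
    split = urlsplit(url)
    # ':~:' separates the element fragment from the fragment directives.
    element_part, _, directive_part = split.fragment.partition(":~:")
    directives = [
        parse_text_directive(item[len("text="):])
        for item in directive_part.split("&")
        if item.startswith("text=")
    ]
    return TextRef(split.path, unquote(element_part) or None, directives)


ref = parse_textref("text/chapter1.xhtml#para-3:~:text=The%20cat-,sat,-on%20the")
```

Here `ref.element_id` is `"para-3"` and the single directive has `start="sat"`, `prefix="The cat"` and `suffix="on the"`. Nothing in the URL says that the text must be searched for inside `para-3` only, which is the scoping the Readium algorithm relies on.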

Are we comfortable with moving towards text fragments only? I can think of two possible issues with them: performance and copyright. It's far more efficient to locate a DOM element by its ID and then try to match text inside that element only. Concerning copyright, locations output by the read aloud navigator could be persisted or exported as bookmarks, which would make it possible to export the whole text.

If we're not, textref could contain structured JSON to allow the use of our own custom formats independently of the evolution of the URL fragment standards.
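
As a purely illustrative sketch of what a structured textref could look like, the snippet below serializes a CSS selector plus textual context to JSON and reads it back. Every key name here (`href`, `cssSelector`, `text`, `before`, `highlight`, `after`) is an assumption made for the example, as are the function names; none of this is defined by the guided navigation spec.

```python
import json
from typing import Any, Dict


def make_textref(href: str, css_selector: str,
                 before: str, highlight: str, after: str) -> str:
    """Serialize a hypothetical structured textref to a JSON string."""
    payload: Dict[str, Any] = {
        "href": href,                 # resource containing the text
        "cssSelector": css_selector,  # step 1: element to search in
        "text": {                     # step 2: context matched inside that element
            "before": before,
            "highlight": highlight,
            "after": after,
        },
    }
    return json.dumps(payload, ensure_ascii=False)


def read_textref(raw: str) -> Dict[str, Any]:
    """Parse the JSON back and check the fields this sketch relies on."""
    data = json.loads(raw)
    for key in ("href", "cssSelector", "text"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
    return data


raw = make_textref("text/chapter1.xhtml", "#para-3", "The cat ", "sat", " on the")
textref = read_textref(raw)
```

A structure like this keeps the selector and the textual context as separate fields, so a consumer can apply them in the order it chooses instead of relying on the precedence rules of URL fragments.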
