
hyperref and hyperxmp metadata improvements #2365

Open
xworld21 wants to merge 4 commits into brucemiller:master from xworld21:hyperxmp
@xworld21
Contributor

Add the remaining \hypersetup keywords, and add xml:lang when pdfmetalang or pdflang is specified (LaTeXML should probably also use pdflang as the document language if it is not already specified elsewhere). Note that this also fixes the wrong mapping of pdfsubject: it should be dcterms:description, not dcterms:subject.
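For readers who want a concrete picture of the behaviour described above, here is a minimal Python sketch. It is not LaTeXML's binding code (which is Perl), and the function and table names are hypothetical. Only the pdfsubject entry comes from this PR description. The precedence of pdfmetalang over pdflang is an assumption made for the sketch.

```python
from typing import Optional

# Hypothetical illustration only; LaTeXML's real bindings are written in Perl.
# Only the pdfsubject entry is taken from the PR description; a real table
# would cover many more \hypersetup keywords.
KEYWORD_TO_PROPERTY = {
    "pdfsubject": "dcterms:description",  # previously mapped to dcterms:subject
}


def metadata_language(options: dict) -> Optional[str]:
    """Return the xml:lang value, or None when neither key was given.

    Assumption: pdfmetalang takes precedence over pdflang.
    """
    return options.get("pdfmetalang") or options.get("pdflang")


def to_rdf_properties(options: dict) -> list:
    """Return (property, content, xml:lang) triples for the mapped keywords."""
    lang = metadata_language(options)
    return [
        (KEYWORD_TO_PROPERTY[key], value, lang)
        for key, value in options.items()
        if key in KEYWORD_TO_PROPERTY
    ]
```

Under these assumptions, `to_rdf_properties({"pdfsubject": "A demo", "pdflang": "en"})` yields a single `dcterms:description` triple tagged with the language `en`.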

The metadata could then be used elsewhere (e.g. JATS #2354, EPUB), but this PR is only about preserving the data in the XML output.

Note that the implementation is incomplete: certain properties (such as authors) are implemented as lists by hyperxmp (<rdf:Seq><rdf:li>..., sometimes <rdf:Bag><rdf:li>..., or <rdf:Alt><rdf:li> for alternative languages). I haven't touched any of that as LaTeXML implements RDF essentially as strings.
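To illustrate the gap mentioned above, the following Python sketch contrasts an `rdf:Seq` list with a flat string. It uses only the standard library; the helper names are hypothetical, the choice of `dc:creator` for authors follows common XMP practice, and the comma-joined string is only an assumption about what a string-based representation might look like, not a description of LaTeXML's actual output.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def creator_as_seq(authors):
    """Build <dc:creator><rdf:Seq><rdf:li>...</rdf:li></rdf:Seq></dc:creator>."""
    creator = ET.Element(f"{{{DC_NS}}}creator")
    seq = ET.SubElement(creator, f"{{{RDF_NS}}}Seq")
    for name in authors:
        ET.SubElement(seq, f"{{{RDF_NS}}}li").text = name
    return creator


def creator_as_string(authors):
    """Flat alternative (assumed): one string, so the list structure is lost."""
    return ", ".join(authors)


element = creator_as_seq(["A. Author", "B. Author"])
xml_text = ET.tostring(element, encoding="unicode")
```

The ordered list keeps each author as a separate `rdf:li` item, whereas the flat string leaves consumers to guess where one name ends and the next begins.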

@brucemiller
Owner

This all looks plausible; I don't use XMP myself, but this should probably help those who do. @dginev do you have any thoughts on this? (other than not being an XMP fan, as I recall :> )

@dginev
Collaborator

dginev commented May 29, 2024

I have a classic comment: it would be nice to have a test, which will have the additional effect of self-documenting how these XMP enhancements can be used by authors. I haven't used hyperxmp.sty myself, so I trust @xworld21 knows more than me here.

@dginev
Collaborator

dginev commented May 29, 2024

I am also new to PRISM, but so far I find it palatable - there is a W3C Member Submission with a reasonable W3C Team Comment.

One observation from quickly skimming it: the PRISM Namespaces section lists the URI added in this PR under the basic: prefix, and there appear to be about 10 other PRISM-related namespace URIs. Maybe we want to qualify that in LaTeXML as prismbasic: / pbasic: or similar?

@xworld21
Contributor Author

xworld21 commented Jun 1, 2024

> it would be nice to have a test

Excellent idea, and it's making me discover details I'd missed (e.g. only some properties respect pdfmetalang, others support a language by prefixing [en]).

Is it OK if I include the sample document from the hyperxmp documentation? Would there be license issues with that?

xworld21 force-pushed the hyperxmp branch 5 times, most recently from 418c639 to 1fe9d9a, on June 1, 2024 20:08
Collaborator

@dginev dginev left a comment

Looks good on my end.

dginev added a commit to arXiv/LaTeXML that referenced this pull request Dec 13, 2025

3 participants