Encode omissions as gap with unit & quantity #369

Merged
joewiz merged 2 commits into master from add-gap-pm on Aug 12, 2025

@joewiz joewiz commented Jul 1, 2025

As reported by @awmarrs, there are blank gaps immediately before footnotes 3 and 5 in https://history.state.gov/historicaldocuments/frus1981-88v38/d70. The gaps are present in the PDF:

[Image: Screenshot 2025-07-01 at 16 56 29]

... but were not encoded in the XML, leading to omission of the space.

This PR fixes these 2 instances, using our ODD's existing support for <gap> with @unit and @quantity. I specified @reason="lacuna", following C. M. Sperberg-McQueen's suggestion on TEI-L (Feb 20, 2019) to indicate that the gap was in the original.
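For illustration, a gap encoded this way might look like the fragment in the sketch below. The `unit` and `quantity` values and the surrounding sentence are placeholders, not the values committed in this PR; only `reason="lacuna"` comes from the description above. The Python wrapper just parses the fragment and reads the attributes back.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: unit, quantity, and the surrounding text are
# placeholders, not copied from the document changed in this PR.
SAMPLE = (
    f'<p xmlns="{TEI_NS}">The delegation met with '
    '<gap unit="char" quantity="20" reason="lacuna"/> '
    "before the session.</p>"
)

para = ET.fromstring(SAMPLE)

# Element names are namespace-qualified in ElementTree; attributes are not.
gap = para.find(f"{{{TEI_NS}}}gap")
gap_attrs = dict(gap.attrib)
print(gap_attrs)
```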

At some point we should tidy up the ~300 existing instances of <gap> in the FRUS corpus to conform to the @unit values listed in the Guidelines and align with the tei.enumerated class for @reason: https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-gap.html.
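A rough sketch of how such a cleanup could start: count the <gap> elements whose @unit is missing or not in an allowed list. The allowed set below is an assumption, taken to be the values the TEI Guidelines suggest for @unit; it should be replaced with whatever the FRUS ODD actually permits. The function name `audit_gaps` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Assumption: the @unit values suggested by the TEI Guidelines.
# Swap in the list from the project's ODD before relying on the results.
ALLOWED_UNITS = frozenset({"cm", "mm", "in", "line", "char"})


def audit_gaps(xml_text, allowed_units=ALLOWED_UNITS):
    """Count <gap> elements whose @unit is missing or not allowed."""
    root = ET.fromstring(xml_text)
    problems = Counter()
    for gap in root.iter(f"{{{TEI_NS}}}gap"):
        unit = gap.get("unit")
        if unit is None:
            problems["(missing)"] += 1
        elif unit not in allowed_units:
            problems[unit] += 1
    return problems


# Hypothetical sample: one conforming gap, one odd unit, one with no unit.
SAMPLE = (
    f'<div xmlns="{TEI_NS}">'
    '<p>a <gap unit="char" quantity="20" reason="lacuna"/> b</p>'
    '<p>c <gap unit="chars" quantity="5"/> d</p>'
    '<p>e <gap reason="declassification"/> f</p>'
    "</div>"
)

report = audit_gaps(SAMPLE)
for unit, count in sorted(report.items()):
    print(unit, count)
```

Running this over each volume file and summing the counters would give a quick inventory of the values in use before deciding how to normalize them.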

Corresponding updates to the FRUS processing model in hsg-shell are in HistoryAtState/hsg-shell#528.

@joewiz joewiz changed the title from "Encode whitespace as gap with unit & quantity" to "Encode omissions as gap with unit & quantity" Jul 1, 2025
@joewiz joewiz merged commit 0a66bf3 into master Aug 12, 2025
@joewiz joewiz deleted the add-gap-pm branch August 12, 2025 17:49
joewiz commented Aug 12, 2025

🎉 This PR is included in version 0.8.3 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
