
Add tagged PDF infrastructure for Section 508 accessibility #67

Open
craigmcnamara wants to merge 4 commits into prawnpdf:master from mes-amis:section-508-accessability

@craigmcnamara

Summary

Adds the low-level foundation for generating tagged (accessible) PDFs compliant with GSA Section 508 guidelines.

  • PDF::Core::MarkedContent module — emits BMC/BDC/EMC operators in content streams for marking content sequences
  • PDF::Core::StructureTree class — manages the StructTreeRoot, structure elements (/StructElem), ParentTree (number tree), and per-page MCID allocation. Finalizes automatically via before_render callback.
  • ObjectStore — accepts marked: true option, sets /MarkInfo << /Marked true >> on the Catalog
  • DocumentState — threads the marked option through to ObjectStore
  • Renderer — includes MarkedContent, creates StructureTree when marked, sets minimum PDF version to 1.7, exposes structure_tree and marked? accessors
  • Supports /Alt, /ActualText, /Lang, and /Scope attributes on structure elements
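
To illustrate what the marked-content operators in the first bullet look like in a content stream, here is a minimal standalone sketch. It does not use pdf-core: `SketchContentStream`, `marked_content`, and `text` are hypothetical names chosen for this example, and the PR's actual method names and signatures may differ. Only the emitted operator syntax (`BMC`, `BDC` with an `/MCID` property, `EMC`) follows the PDF specification.

```ruby
# Hypothetical stand-in for a content stream; NOT pdf-core's actual API.
class SketchContentStream
  attr_reader :buffer

  def initialize
    @buffer = +''
  end

  # Wraps whatever the block emits in a marked-content sequence.
  # With an MCID, uses BDC and an inline property dictionary;
  # without one (e.g. artifacts), uses the plain BMC form.
  def marked_content(tag, mcid: nil)
    @buffer << (mcid ? "/#{tag} <</MCID #{mcid}>> BDC\n" : "/#{tag} BMC\n")
    yield
    @buffer << "EMC\n"
  end

  # Simplified text-showing operator; a real stream would also select a
  # font with Tf and escape parentheses in the string.
  def text(string)
    @buffer << "BT (#{string}) Tj ET\n"
  end
end

stream = SketchContentStream.new
stream.marked_content(:P, mcid: 0) { stream.text('Hello') }
stream.marked_content(:Artifact) { stream.text('Page 1') }
puts stream.buffer
```

The paragraph is tagged `/P` with MCID 0 so a structure element can refer to it, while the page number is marked as an artifact, which assistive technology is expected to skip.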

This is the foundation layer that prawnpdf/prawn and prawnpdf/prawn-table build on for high-level accessibility APIs. See companion PRs there.

Addresses prawnpdf/prawn-table#78

Test plan

  • 26 new specs covering marked content operators, structure tree creation, MCID allocation, element nesting, artifact marking, ParentTree building, and full render round-trip
  • All 139 specs pass (113 existing + 26 new)
  • Generate a tagged PDF and verify in Adobe Acrobat (Tagged PDF: Yes, structure tree visible in Tags panel)
  • Run PAC 2024 accessibility checker
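
The per-page MCID allocation and ParentTree building that the specs above exercise can be pictured with a small standalone sketch. Everything here is an assumption made for illustration: `SketchStructureTree`, `add_element`, and `parent_tree` are hypothetical names, and the page index stands in for the page's `/StructParents` key. It is not the PR's `PDF::Core::StructureTree` implementation.

```ruby
# Hypothetical model of MCID allocation; NOT PDF::Core::StructureTree itself.
class SketchStructureTree
  Element = Struct.new(:tag, :page, :mcid)

  def initialize
    @next_mcid = Hash.new(0) # page index => next unused MCID on that page
    @elements = []
  end

  # Allocates the next MCID for the given page and records the element.
  def add_element(tag, page:)
    mcid = @next_mcid[page]
    @next_mcid[page] += 1
    element = Element.new(tag, page, mcid)
    @elements << element
    element
  end

  # Models the ParentTree number tree as a hash:
  # page key => elements ordered so that array index == MCID.
  def parent_tree
    @elements.group_by(&:page).transform_values { |els| els.sort_by(&:mcid) }
  end
end

tree = SketchStructureTree.new
heading = tree.add_element(:H1, page: 0)
para    = tree.add_element(:P, page: 0)
other   = tree.add_element(:P, page: 1)

tree.parent_tree.each do |page, elements|
  puts "page #{page}: #{elements.map { |e| "#{e.tag}=#{e.mcid}" }.join(', ')}"
end
```

MCIDs restart at 0 on each page because they only need to be unique within one content stream; the ParentTree is what lets a reader map a page and MCID back to the owning structure element.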

🤖 Generated with Claude Code

craigmcnamara and others added 3 commits March 25, 2026 15:22
Adds marked content operators (BMC/BDC/EMC) and structure tree support
to pdf-core, enabling accessible/tagged PDF generation. This is the
foundation layer that Prawn and Prawn::Table will build on.

New modules:
- PDF::Core::MarkedContent — emits BMC/BDC/EMC operators in content streams
- PDF::Core::StructureTree — manages StructTreeRoot, structure elements,
  ParentTree, and MCID allocation

Modified:
- ObjectStore: accepts marked: true, sets /MarkInfo on Catalog
- DocumentState: threads marked option through to ObjectStore
- Renderer: includes MarkedContent, creates StructureTree when marked,
  registers before_render callback for structure tree finalization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Allows setting /ActualText on structure elements so screen readers
announce replacement text instead of reading visual characters literally
(e.g., "required" instead of "asterisk" for *, "selected" for X).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
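
As a rough illustration of the /ActualText commit above, the following standalone sketch serializes a structure element dictionary carrying replacement text. The helpers `pdf_literal` and `struct_elem` are hypothetical and written only for this example; they are not part of pdf-core. The sketch assumes ASCII-only values and omits the `/P` (parent) and `/K` (kids) entries that a real structure element also carries.

```ruby
# Hypothetical helpers for illustration; NOT pdf-core's serializer.

# Formats an ASCII string as a PDF literal string, escaping the
# backslash and parenthesis characters.
def pdf_literal(string)
  "(#{string.gsub(/[\\()]/) { |ch| "\\#{ch}" }})"
end

# Builds a simplified /StructElem dictionary with text-string attributes
# such as /ActualText, /Alt, or /Lang.
def struct_elem(tag, attributes = {})
  entries = ['/Type /StructElem', "/S /#{tag}"]
  attributes.each { |key, value| entries << "/#{key} #{pdf_literal(value)}" }
  "<< #{entries.join(' ')} >>"
end

required_marker = struct_elem(:Span, { ActualText: 'required' })
puts required_marker
```

With this element attached to the marked content that draws the `*`, a screen reader would be expected to announce "required" rather than the literal character.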
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>