Fix CFF CID-keyed font subsetting: missing .notdef glyph and incorrect Charset encoding #116

Open
55728 wants to merge 1 commit into prawnpdf:master from 55728:fix-cff-fd-selector-off-by-one

55728 commented Mar 23, 2026

Problem

When subsetting CID-keyed CFF fonts (e.g., NotoSerifCJK.ttc), the generated PDF contains corrupted glyph data, causing certain CJK characters to not render in PDF viewers. The issue was originally reported in prawnpdf/prawn#1105.

While the original crash (NoMethodError on glyf table) was resolved in ttfunk 1.8.0, the generated PDFs still contain structural errors in the CFF data that cause glyph rendering failures.

Root Cause

Three related bugs in the CFF subsetting code:

1. CharstringsIndex#encode_items — missing .notdef charstring

When the charmap does not include a mapping for GID 0 (.notdef), the encoded CharstringsIndex omits the .notdef charstring. The CFF specification requires .notdef to always be present at index 0. This causes all charstring indices to be off by one.

2. FdSelector#encode — missing .notdef FD entry

Similarly, the encoded FD selector omits the entry for GID 0, causing a mismatch between charstring indices and their Font Dict assignments. Glyphs end up referencing the wrong Font Dict's local subroutines, producing corrupted outlines.

3. Charset#encode — incorrect range encoding for unsorted SIDs

In CID-keyed fonts, SIDs (String IDs) in new GID order are not necessarily in ascending order. BinUtils.rangify assumes sorted input and groups values where b - a <= 1 into ranges. When SIDs decrease (e.g., [1549, 1509]), the difference is negative, which satisfies <= 1, causing unrelated SIDs to be incorrectly merged into a single range.
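The merging behaviour can be reproduced in isolation. The following is a minimal sketch, not the ttfunk source: rangify_sketch is a hypothetical stand-in written from the description above (values stay in one range while b - a <= 1, and each range is emitted as a [first, n_left] pair), and expand is a hypothetical helper that decodes those pairs back into a flat SID list.

```ruby
# Hypothetical stand-in for the grouping rule described above (not copied
# from ttfunk): consecutive values stay in one range while b - a <= 1.
def rangify_sketch(values)
  values
    .slice_when { |a, b| b - a > 1 }
    .map { |span| [span.first, span.length - 1] }
end

# Hypothetical decoder: turns [first, n_left] pairs back into a flat SID list.
def expand(ranges)
  ranges.flat_map { |first, n_left| (first..first + n_left).to_a }
end

sorted_sids   = [1509, 1510, 1549]
unsorted_sids = [1549, 1509]

p expand(rangify_sketch(sorted_sids))   # round-trips to the input
p expand(rangify_sketch(unsorted_sids)) # decodes to 1549, 1550: SID 1509 is lost
```

With the unsorted input the two SIDs collapse into the single pair [1549, 1], so a reader of the Charset sees 1550 where 1509 was intended.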

Fix

  • CharstringsIndex: Prepend the .notdef charstring (items[0]) when the charmap does not include GID 0.
  • FdSelector: Prepend the .notdef FD entry ([0, self[0]]) when the charmap does not include GID 0.
  • Charset: Check if SIDs are in ascending order before calling rangify. If not, fall back to array format encoding.
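A rough sketch of the three guards, using plain arrays rather than the real ttfunk classes. All names here (with_notdef, with_notdef_fd, ascending?, charset_format) are hypothetical, the inputs are assumed to list the kept old GIDs in new GID order, and the strict a < b comparison is an assumption about what "ascending" means in the patch.

```ruby
# Hypothetical: items is the original charstring list indexed by old GID,
# old_gids is the list of old GIDs kept in the subset, in new GID order.
# Ensures the .notdef charstring (items[0]) sits at index 0 of the result.
def with_notdef(items, old_gids)
  kept = old_gids.map { |gid| items[gid] }
  old_gids.include?(0) ? kept : [items[0], *kept]
end

# Hypothetical: fd_for maps old GID -> Font Dict index.
# Returns [new_gid, fd_index] pairs, with .notdef forced to new GID 0.
def with_notdef_fd(fd_for, old_gids)
  gids = old_gids.include?(0) ? old_gids : [0, *old_gids]
  gids.each_with_index.map { |old_gid, new_gid| [new_gid, fd_for[old_gid]] }
end

# Assumption: "ascending" means strictly increasing.
def ascending?(sids)
  sids.each_cons(2).all? { |a, b| a < b }
end

# Use range encoding only when it is safe; otherwise fall back to array format.
def charset_format(sids)
  ascending?(sids) ? :range_format : :array_format
end

items  = %w[notdef a b c]
fd_for = [5, 1, 2, 3]
p with_notdef(items, [2, 3])
p with_notdef_fd(fd_for, [2, 3])
p charset_format([1549, 1509])
```

Because both helpers prepend the GID 0 entry under the same condition, charstring index n and FD selector entry n keep describing the same glyph.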

Verification

  • Tested with NotoSerifCJK.ttc (CID-keyed CFF font with 65,535 glyphs and 18 Font Dicts)
  • All CJK characters render correctly in Chrome after the fix
  • CFF structure validated with Python fonttools: all charstrings decompile and draw without errors
  • FdSelector GID→FD mappings verified correct for all glyphs
  • Charset CID names verified correct for all glyphs

Reproduction

require 'prawn'

Prawn::Document.new {
  font_families.update('NotoSerifCJK' => {
    normal: { file: '/path/to/NotoSerifCJK.ttc', font: 10 }
  })
  font 'NotoSerifCJK', size: 20
  text 'こんにちは世界テスト'
}.render_file 'output.pdf'

Before this fix, some characters (e.g., こ, 世, テ) are invisible in the output PDF. After this fix, all characters render correctly.

When subsetting CID-keyed CFF fonts (e.g., NotoSerifCJK.ttc), the
encoded CFF data contains structural errors that cause certain glyphs
to not render in PDF viewers.

Three bugs are fixed:

1. CharstringsIndex#encode_items omits the .notdef charstring when
   the charmap has no mapping for GID 0. The CFF spec requires
   .notdef at index 0; its absence shifts all charstring indices by
   one.

2. FdSelector#encode similarly omits the .notdef entry, causing a
   mismatch between charstring indices and Font Dict assignments.
   Glyphs end up referencing the wrong Font Dict's local
   subroutines, producing corrupt outlines.

3. Charset#encode passes unsorted SIDs to BinUtils.rangify, which
   assumes sorted input. In CID-keyed fonts SIDs ordered by new GID
   are not necessarily ascending, so rangify merges unrelated SIDs
   into incorrect ranges. The fix falls back to array format when
   SIDs are not sorted.

Tested with NotoSerifCJK.ttc (65,535 glyphs, 18 Font Dicts).
Verified correct rendering and CFF structure with fonttools.

Ref: prawnpdf/prawn#1105