Fix CFF CID-keyed font subsetting: missing .notdef glyph and incorrect Charset encoding #116
Open
55728 wants to merge 1 commit into prawnpdf:master
When subsetting CID-keyed CFF fonts (e.g., NotoSerifCJK.ttc), the encoded CFF data contains structural errors that cause certain glyphs to not render in PDF viewers. Three bugs are fixed:

1. CharstringsIndex#encode_items omits the .notdef charstring when the charmap has no mapping for GID 0. The CFF spec requires .notdef at index 0; its absence shifts all charstring indices by one.
2. FdSelector#encode similarly omits the .notdef entry, causing a mismatch between charstring indices and Font Dict assignments. Glyphs end up referencing the wrong Font Dict's local subroutines, producing corrupt outlines.
3. Charset#encode passes unsorted SIDs to BinUtils.rangify, which assumes sorted input. In CID-keyed fonts, SIDs ordered by new GID are not necessarily ascending, so rangify merges unrelated SIDs into incorrect ranges. The fix falls back to array format when SIDs are not sorted.

Tested with NotoSerifCJK.ttc (65,535 glyphs, 18 Font Dicts). Verified correct rendering and CFF structure with fonttools.

Ref: prawnpdf/prawn#1105
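The off-by-one effect of a missing .notdef entry can be illustrated with a small sketch. This is not the ttfunk code: the method names, the string placeholders standing in for binary charstrings, and the sample charmap are all hypothetical, and only the indexing behaviour is modelled.

```ruby
# Hypothetical stand-ins: real charstrings are binary strings, and the real
# charmap comes from the subsetter. Only the indexing behaviour is modelled.
ORIGINAL_CHARSTRINGS = %w[notdef_cs a_cs b_cs c_cs d_cs].freeze

# Maps old GID => new GID. GID 0 (.notdef) is deliberately absent.
CHARMAP_WITHOUT_NOTDEF = { 2 => 1, 4 => 2 }.freeze

# Buggy shape: emits only the mapped glyphs, so index 0 is not .notdef
# and every later glyph sits one slot too early.
def encode_without_notdef(charstrings, charmap)
  charmap.sort_by { |_old, new| new }.map { |old, _new| charstrings[old] }
end

# Fixed shape: put .notdef first whenever the charmap has no entry for GID 0.
def encode_with_notdef(charstrings, charmap)
  items = encode_without_notdef(charstrings, charmap)
  charmap.key?(0) ? items : [charstrings[0], *items]
end

buggy = encode_without_notdef(ORIGINAL_CHARSTRINGS, CHARMAP_WITHOUT_NOTDEF)
fixed = encode_with_notdef(ORIGINAL_CHARSTRINGS, CHARMAP_WITHOUT_NOTDEF)

puts buggy[1] # => d_cs (new GID 1 should be b_cs)
puts fixed[1] # => b_cs
```

The same reasoning applies to the FD selector: if its entries are shifted by one relative to the charstrings, a glyph is paired with its neighbour's Font Dict.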
This was referenced Mar 24, 2026
Problem
When subsetting CID-keyed CFF fonts (e.g., NotoSerifCJK.ttc), the generated PDF contains corrupted glyph data, causing certain CJK characters to not render in PDF viewers. The issue was originally reported in prawnpdf/prawn#1105.
While the original crash (NoMethodError on glyf table) was resolved in ttfunk 1.8.0, the generated PDFs still contain structural errors in the CFF data that cause glyph rendering failures.

Root Cause
Three related bugs in the CFF subsetting code:
1. CharstringsIndex#encode_items: missing .notdef charstring

When the charmap does not include a mapping for GID 0 (.notdef), the encoded CharstringsIndex omits the .notdef charstring. The CFF specification requires .notdef to always be present at index 0. This causes all charstring indices to be off by one.

2. FdSelector#encode: missing .notdef FD entry

Similarly, the encoded FD selector omits the entry for GID 0, causing a mismatch between charstring indices and their Font Dict assignments. Glyphs end up referencing the wrong Font Dict's local subroutines, producing corrupted outlines.

3. Charset#encode: incorrect range encoding for unsorted SIDs

In CID-keyed fonts, SIDs (String IDs) in new GID order are not necessarily in ascending order. BinUtils.rangify assumes sorted input and groups values where b - a <= 1 into ranges. When SIDs decrease (e.g., [1549, 1509]), the difference is negative, which satisfies <= 1, causing unrelated SIDs to be incorrectly merged into a single range.

Fix
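1. CharstringsIndex#encode_items: include the .notdef charstring (items[0]) when the charmap does not include GID 0.
2. FdSelector#encode: include the .notdef FD entry ([0, self[0]]) when the charmap does not include GID 0.
3. Charset#encode: check whether the SIDs are sorted before calling rangify. If not, fall back to array format encoding.

Verification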
Reproduction
Before this fix, some characters (e.g., こ, 世, テ) are invisible in the output PDF. After this fix, all characters render correctly.
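The range-merging failure described under Root Cause can also be reproduced in isolation. The sketch below uses a simplified stand-in for a rangify-style helper that follows the grouping rule quoted above (b - a <= 1); it is not copied from ttfunk, and `rangify_sketch`, `ascending?`, and `choose_charset_format` are hypothetical names introduced only for illustration.

```ruby
# Simplified stand-in for a rangify-style helper: neighbouring values stay in
# one group while b - a <= 1, and each group becomes [first, count - 1].
def rangify_sketch(values)
  values
    .slice_when { |a, b| b - a > 1 }
    .map { |span| [span.first, span.length - 1] }
end

# True when every value is greater than or equal to its predecessor.
def ascending?(values)
  values.each_cons(2).all? { |a, b| a <= b }
end

# Hypothetical guard: use range encoding only for ascending input,
# otherwise keep the SIDs as a plain array.
def choose_charset_format(sids)
  ascending?(sids) ? [:range, rangify_sketch(sids)] : [:array, sids]
end

# Sorted input yields the expected ranges.
p rangify_sketch([1, 2, 3, 10])         # => [[1, 2], [10, 0]]

# Unsorted input: 1509 - 1549 is negative, so both SIDs land in one "range"
# that would decode as 1549, 1550 instead of 1549, 1509.
p rangify_sketch([1549, 1509])          # => [[1549, 1]]

# With the guard, the unsorted SIDs are preserved as an array.
p choose_charset_format([1549, 1509])   # => [:array, [1549, 1509]]
```

Checking the order before range encoding keeps the compact format for the common ascending case and gives up only a few bytes when the SIDs are out of order.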