Fix font subset tag generation according to PDF spec #107
matsud224 wants to merge 1 commit into prawnpdf:master
According to PDF spec section 5.5.3, font subset tags must be 6 uppercase letters. This implements proper BASE25 encoding (using letters B-Z with A for padding) similar to Apache PDFBox, instead of the flawed character-by-character modulo approach in PR prawnpdf#107.

This fixes both Adobe Acrobat (issue prawnpdf#102) and Adobe Illustrator warnings while maintaining proper text encoding and ToUnicode mapping.

Fixes: prawnpdf#102
Related: prawnpdf#107
Made-with: Cursor
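The base-25 scheme described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names `BASE25_ALPHABET`, `TAG_LENGTH`, and `base25_subset_tag` are hypothetical, and the input is assumed to be any non-negative integer (for example, one derived from a hash of the subset contents).

```ruby
# Hypothetical sketch: encode a non-negative integer as a 6-letter tag.
# Digits come from the 25 letters B-Z; 'A' is used only as left padding.
BASE25_ALPHABET = ('B'..'Z').to_a.freeze
TAG_LENGTH = 6

def base25_subset_tag(number)
  num = number
  letters = []
  loop do
    num, remainder = num.divmod(25)
    letters << BASE25_ALPHABET[remainder]
    # Stop when the number is exhausted or the tag is full.
    break if num.zero? || letters.length == TAG_LENGTH
  end
  # Pad short results on the left with 'A' so the tag is always 6 letters.
  letters.join.rjust(TAG_LENGTH, 'A')
end
```

Because every output character is drawn from `A`-`Z`, the result can never contain digits or lowercase letters, whatever the input value.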
According to PDF spec section 5.5.3, font subset tags must be 6 uppercase letters. This converts the hex digest (0-9a-f) to uppercase letters:

- Digits 0-9 map to letters A-J
- Hex a-f map to letters K-P

This maintains 1-to-1 deterministic mapping while satisfying the PDF spec requirement for 6 uppercase letters. Fixes both Adobe Acrobat (issue prawnpdf#102) and Adobe Illustrator warnings.

Fixes: prawnpdf#102
Related: prawnpdf#107
Made-with: Cursor
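The digit-to-letter mapping listed above could look something like the sketch below. The method name `hex_digest_to_tag` is hypothetical, and taking the first six characters of the digest is an assumption; the commit message does not say which part of the digest is used.

```ruby
# Hypothetical sketch: map each hex character to one uppercase letter.
# '0'-'9' become 'A'-'J' and 'a'-'f' become 'K'-'P' (16 symbols each side).
def hex_digest_to_tag(hex_digest)
  # Assumption: the tag is built from the first six hex characters.
  hex_digest.downcase[0, 6].tr('0-9a-f', 'A-P')
end
```

For example, `hex_digest_to_tag("09af3c")` gives `"AJKPDM"`. The mapping is one-to-one, so two different six-character hex prefixes never collapse into the same tag.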
Hi @matsud224, thanks for raising this issue! I noticed @diaconu-andrei has been working on a more comprehensive fix in their fork (1afbda4, 4e6177c) that addresses both Acrobat and Illustrator compatibility, including platform-aware PostScript name records and Mac Roman cmap subtable generation. @diaconu-andrei, would you consider opening a PR for your changes? They look like a solid improvement over the current approach. (For context, I'm working on CFF CID-keyed font subsetting fixes in #116, and proper subset tags would complement those changes nicely.)
According to the PDF 1.7 specification, section 5.5.3 "Font Subsets", the tag of a subset font should be 6 uppercase letters (e.g. ABCDEF).
https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf
In the current implementation, it is possible for tags to contain numbers and lowercase letters.
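As a rough illustration of the requirement, a tag can be checked against the six-uppercase-letter rule with a simple pattern. The names `SUBSET_TAG_PATTERN` and `valid_subset_tag?` are hypothetical and are not part of the library.

```ruby
# Hypothetical check: a subset tag must be exactly six uppercase ASCII letters.
SUBSET_TAG_PATTERN = /\A[A-Z]{6}\z/

def valid_subset_tag?(tag)
  SUBSET_TAG_PATTERN.match?(tag)
end
```

With this check, `valid_subset_tag?("ABCDEF")` is true, while a tag such as `"a1b2c3"` containing digits or lowercase letters is rejected.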