Commit d010815

markbruce and zhang_xiaoning authored
fix: use dynamic count for beginbfrange declaration (#1660)
* fix: use dynamic count for beginbfrange declaration

  Fixes garbled text copying in Chrome/Edge for PDFs with >256 unique characters

* Add changelog line

  Addressed an issue with garbled text copying in Chrome/Edge for PDFs containing more than 256 unique characters.

* test: add tests for beginbfrange count declaration

  Add test cases to verify that the beginbfrange count declaration in ToUnicode CMap matches the actual number of bfrange entries.

  - Test for fonts with >256 characters (multiple ranges)
  - Test for fonts with <=256 characters (single range)

  These tests ensure the fix for the beginbfrange count bug is correct and prevent regression.

  Related to #1659

* Revert "test: add tests for beginbfrange count declaration"

  This reverts commit dda6f4a.

* Rewrite tests for beginbfrange count declaration. Fix code style issue.

* fix(tests): remove unused variables in font.spec.js

  Replace for loops with unused match variables with spread operator to directly get array length, fixing ESLint no-unused-vars errors.

  - Replace loop counting with [...rangeMatches].length
  - Fixes ESLint errors at lines 152 and 197 in tests/unit/font.spec.js
  - All tests pass successfully

---------

Co-authored-by: zhang_xiaoning <[email protected]>
1 parent 070f275 commit d010815

3 files changed: +105 -1 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions

@@ -2,6 +2,8 @@
 
 ### Unreleased
 
+- Fix garbled text copying in Chrome/Edge for PDFs with >256 unique characters (#1659)
+
 ### [v0.17.2] - 2025-08-30
 
 - Fix rendering lists that spans across pages

lib/font/embedded.js

Lines changed: 1 addition & 1 deletion

@@ -274,7 +274,7 @@ begincmap
 1 begincodespacerange
 <0000><ffff>
 endcodespacerange
-1 beginbfrange
+${ranges.length} beginbfrange
 ${ranges.join('\n')}
 endbfrange
 endcmap
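
The one-line change above matters because the template emits one bfrange line per chunk of glyph mappings, so the declared count has to follow the number of lines actually written. A minimal sketch of that idea is given here, assuming (as the test comments in this commit state) chunks of 256 characters. `buildBfrangeSection` and its parameters are hypothetical names used for illustration, not the library's implementation, and the sketch only handles code points that fit in four hex digits.

```javascript
// Hypothetical helper (not the library's code): builds the bfrange section of
// a ToUnicode CMap. `codePoints[i]` is assumed to be the Unicode code point
// mapped to glyph code `i`. Only BMP code points (<= 0xffff) are handled.
function buildBfrangeSection(codePoints, chunkSize = 256) {
  const toHex = (n) => n.toString(16).padStart(4, '0');
  const ranges = [];

  // Emit one "<start> <end> [ ... ]" line per chunk of glyph codes.
  for (let start = 0; start < codePoints.length; start += chunkSize) {
    const chunk = codePoints.slice(start, start + chunkSize);
    const end = start + chunk.length - 1;
    const entries = chunk.map((cp) => `<${toHex(cp)}>`).join(' ');
    ranges.push(`<${toHex(start)}> <${toHex(end)}> [${entries}]`);
  }

  // The declared count is derived from the data instead of being fixed at 1.
  return [`${ranges.length} beginbfrange`, ...ranges, 'endbfrange'].join('\n');
}
```

With 300 mapped glyphs this sketch writes two range lines and declares `2 beginbfrange`; a hard-coded `1` would under-declare the block, which is the mismatch the commit message associates with garbled copying in Chrome/Edge.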

tests/unit/font.spec.js

Lines changed: 102 additions & 0 deletions

@@ -98,6 +98,108 @@ describe('EmbeddedFont', () => {
 
       expect(glyphs).toBe(398 + 1);
     });
+
+    test('beginbfrange count should match actual number of ranges', () => {
+      const doc = new PDFDocument({ compress: false });
+      const font = PDFFontFactory.open(
+        doc,
+        'tests/fonts/Roboto-Regular.ttf',
+        undefined,
+        'F1099',
+      );
+
+      // Generate more than 256 unique characters to trigger multiple bfrange entries
+      // Each chunk is 256 characters, so we need >256 to get multiple ranges
+      const chars = [];
+
+      // Add ASCII characters (0-127)
+      for (let i = 32; i < 127; i++) {
+        chars.push(String.fromCharCode(i));
+      }
+
+      // Add extended Latin characters (128-255)
+      for (let i = 160; i < 256; i++) {
+        chars.push(String.fromCharCode(i));
+      }
+
+      // Add additional Unicode characters to exceed 256
+      const additionalChars =
+        'ÁÀÂÄÅÃÆÇÐÉÈÊËÍÌÎÏÑÓÒÔÖÕØŒÞÚÙÛÜÝŸáàâäãåæçðéèêëíìîïıñóòôöõøœßþúùûüýÿĀĂĄĆČĎĐĒĖĘĚĞĢĪĮİĶŁĹĻĽŃŅŇŌŐŔŖŘŠŚŞȘŢȚŤŪŮŰŲŽŹŻāăąćčďđēėęěğģīįķłĺļľńņňōőŕŗřšśşșţțťūůűųžźż';
+
+      const allChars = chars.join('') + additionalChars;
+      font.encode(allChars);
+
+      const docData = logData(doc);
+      font.toUnicodeCmap();
+      const text = docData.map((d) => d.toString('utf8')).join('');
+
+      // Extract the count declaration from "N beginbfrange"
+      const beginbfrangeMatch = text.match(/(\d+)\s+beginbfrange/);
+      expect(beginbfrangeMatch).not.toBeNull();
+      const declaredCount = parseInt(beginbfrangeMatch[1], 10);
+
+      // Count actual bfrange entries
+      let actualRangeCount = 0;
+      const bfrangeBlockMatch = text.match(
+        /beginbfrange\n((?:.|\n)*?)\nendbfrange/,
+      );
+      if (bfrangeBlockMatch) {
+        const bfrangeContent = bfrangeBlockMatch[1];
+        // Match each bfrange line: <start> <end> [entries]
+        const rangeMatches = bfrangeContent.matchAll(
+          /^<([0-9a-f]+)>\s+<([0-9a-f]+)>\s+\[/gm,
+        );
+        actualRangeCount = [...rangeMatches].length;
+      }
+
+      // The declared count must match the actual number of ranges
+      expect(declaredCount).toBe(actualRangeCount);
+      expect(actualRangeCount).toBeGreaterThan(1); // Should have multiple ranges when >256 chars
+    });
+
+    test('beginbfrange count should be 1 for fonts with <=256 characters', () => {
+      const doc = new PDFDocument({ compress: false });
+      const font = PDFFontFactory.open(
+        doc,
+        'tests/fonts/Roboto-Regular.ttf',
+        undefined,
+        'F1099',
+      );
+
+      // Generate exactly 256 characters
+      const chars = [];
+      for (let i = 0; i < 256; i++) {
+        chars.push(String.fromCharCode(i + 32)); // Start from space (32) to avoid control chars
+      }
+      font.encode(chars.join(''));
+
+      const docData = logData(doc);
+      font.toUnicodeCmap();
+      const text = docData.map((d) => d.toString('utf8')).join('');
+
+      // Extract the count declaration
+      const beginbfrangeMatch = text.match(/(\d+)\s+beginbfrange/);
+      expect(beginbfrangeMatch).not.toBeNull();
+      const declaredCount = parseInt(beginbfrangeMatch[1], 10);
+
+      // Count actual bfrange entries
+      let actualRangeCount = 0;
+      const bfrangeBlockMatch = text.match(
+        /beginbfrange\n((?:.|\n)*?)\nendbfrange/,
+      );
+      if (bfrangeBlockMatch) {
+        const bfrangeContent = bfrangeBlockMatch[1];
+        const rangeMatches = bfrangeContent.matchAll(
+          /^<([0-9a-f]+)>\s+<([0-9a-f]+)>\s+\[/gm,
+        );
+        actualRangeCount = [...rangeMatches].length;
+      }
+
+      // For <=256 characters, should have exactly 1 range
+      expect(declaredCount).toBe(1);
+      expect(actualRangeCount).toBe(1);
+      expect(declaredCount).toBe(actualRangeCount);
+    });
   });
 });
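
Both tests above repeat the same extraction steps: read the number in front of `beginbfrange`, then count the `<start> <end> [` lines inside the block. As a hedged illustration only, the shared logic could be factored into one function. `countBfranges` is a hypothetical name and is not part of this commit; the regular expressions mirror the ones in the tests.

```javascript
// Hypothetical helper (not part of the commit): returns the declared and the
// actual number of bfrange entries found in uncompressed CMap text, or null
// when no bfrange block is present.
function countBfranges(cmapText) {
  const declaredMatch = cmapText.match(/(\d+)\s+beginbfrange/);
  const blockMatch = cmapText.match(/beginbfrange\n([\s\S]*?)\nendbfrange/);
  if (!declaredMatch || !blockMatch) return null;

  // One entry per line of the form: <start> <end> [ ... ]
  const entries = blockMatch[1].match(/^<[0-9a-f]+>\s+<[0-9a-f]+>\s+\[/gm) || [];
  return {
    declared: parseInt(declaredMatch[1], 10),
    actual: entries.length,
  };
}
```

A test could then compare `declared` and `actual` on the returned object, which keeps the regular expressions in a single place if more CMap assertions are added later.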