Commit 08ba10e

A whole bunch of CFF fixes
## Corrupted CFF index data

There was a subtle bug in the CFF Index implementation that resulted in data corruption. In certain circumstances some items didn't get properly encoded. This happened when items had not previously been accessed.

This resulted, for instance, in missing glyphs. But only sometimes, because indexes might still have contained data that shouldn't have been there. In combination with incorrect encoding (see below) this resulted in some glyphs still being rendered, sometimes even correctly.

Along with the fix, a rather large API change landed, which makes for quite a big diff.

## Incorrect CFF encoding in subsets

TTFunk used to reuse the encoding from the original font. This mapping was incorrect for subset fonts, which use not just a subset of glyphs but also a different encoding.

A separate issue was that some fonts have an empty CFF encoding. This incorrect mapping resulted in an encoding that mapped all codes to glyph 0.

This had an impact on Prawn in particular. The PDF spec explicitly says that CFF encoding is not to be used in OpenType fonts; the `cmap` table should directly index charstrings in the CFF table. Despite this, PDF renderers still use CFF encoding to retrieve glyphs, so TTFunk has to discard the original CFF encoding and supply its own.
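
As a rough illustration of "supplying its own" encoding, the sketch below derives the list of character codes for a subset, ordered by new glyph ID. The shape of `charmap` (character code mapped to a hash with `:old` and `:new` glyph IDs) is inferred from the diff in this commit; the sample codes and glyph IDs are made up.

```ruby
# Hypothetical subset charmap:
# character code => { old: glyph ID in the original font, new: glyph ID in the subset }
charmap = {
  0x41 => { old: 36, new: 1 },
  0x43 => { old: 38, new: 3 },
  0x7A => { old: 91, new: 0 }, # code that ends up on glyph 0 (.notdef)
  0x42 => { old: 37, new: 2 }
}

# Glyph 0 is implicit in a CFF encoding, so codes mapped to it are dropped.
# The remaining codes are listed in new glyph ID order.
codes = charmap
  .reject { |_code, mapping| mapping[:new].zero? }
  .sort_by { |_code, mapping| mapping[:new] }
  .map { |(code, _mapping)| code }

p codes # => [65, 66, 67]
```

The point of the sketch is that nothing from the original font's encoding is consulted: the codes come entirely from the subset's own character map.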
1 parent acddf19 commit 08ba10e

26 files changed: 401 additions and 232 deletions
CHANGELOG.md

Lines changed: 35 additions & 0 deletions
@@ -7,6 +7,41 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/).

 ## [Unreleased]

+### Fixed
+
+* Corrupted CFF index data
+
+  there was a subtle bug in cff index implementation that resulted in
+  a data corruption. in certain circumstances some items didn't get
+  properly encoded. this happened when items were not previously accessed.
+
+  this resulted, for instance, in missing glyphs. but only sometimes
+  because indexes might've still contain data that shouldn't've been
+  there. in combination with incorrect encoding (see further) this
+  resulted in some glyphs still being rendered, sometimes even correctly.
+
+  along with the fix a rather large api change landed. this resulted in
+  quite a big diff.
+
+  Alexander Mankuta
+
+* Incorrect CFF encoding in subsets
+
+  TTFunk used to reuse encoding from the original font. This mapping was
+  incorrect for subset fonts which used not just a subset of glyphs but
+  also a different encoding.
+
+  A separate issue was that some fonts have empty CFF encoding. This
+  incorrect mapping resulted in encoding that mapped all codes to glyph 0.
+
+  This had impact on Prawn in particular. PDF spec explicitly says that
+  CFF encoding is not to be used in OpenType fonts. `cmap` table should
+  directly index charstrings in the CFF table. Despite this PDF renderers
+  still use CFF encoding to retrieve glyphs. So TTFunk has to discard the
+  original CFF encoding and supply its own.
+
+  Alexander Mankuta
+
 ## 1.7.0

 ### Changes

lib/ttfunk/otf_encoder.rb

Lines changed: 1 addition & 10 deletions
@@ -27,7 +27,7 @@ def base_table
     end

     def cff_table
-      @cff_table ||= original.cff.encode(new_to_old_glyph, old_to_new_glyph)
+      @cff_table ||= original.cff.encode(subset)
     end

     def vorg_table
@@ -48,14 +48,5 @@ def optimal_table_order
       (tables.keys - ['DSIG'] - OPTIMAL_TABLE_ORDER) +
         ['DSIG']
     end
-
-    def collect_glyphs(glyph_ids)
-      # CFF top indexes are supposed to contain only one font, although they're
-      # capable of supporting many (no idea why this is true, maybe for CFF
-      # v2??). Anyway it's cool to do top_index[0], don't worry about it.
-      glyph_ids.each_with_object({}) do |id, h|
-        h[id] = original.cff.top_index[0].charstrings_index[id]
-      end
-    end
   end
 end

lib/ttfunk/subset/code_page.rb

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ def initialize(original, code_page, encoding)

       def to_unicode_map
         self.class.unicode_mapping_for(encoding)
+          .select { |codepoint, _unicode| @subset[codepoint] }
       end

       def use(character)
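
The added `select` narrows the code page's full Unicode mapping down to the codes the subset actually uses. A minimal sketch of that filter follows; both data structures are hypothetical stand-ins (the real `@subset` and `unicode_mapping_for` are not shown in this diff), assumed only to be indexable by code and to return `nil` for unused codes.

```ruby
# Hypothetical stand-in for unicode_mapping_for(encoding):
# code page byte => Unicode code point.
unicode_mapping = { 0x41 => 0x0041, 0x42 => 0x0042, 0x80 => 0x20AC }

# Hypothetical stand-in for @subset: indexable by code page byte,
# nil for bytes that were never used.
subset = Array.new(256)
subset[0x41] = 0x0041
subset[0x80] = 0x20AC

# Same filter as in the diff: keep only entries whose code is in the subset.
to_unicode_map = unicode_mapping.select { |codepoint, _unicode| subset[codepoint] }

p to_unicode_map.keys # => [65, 128]
```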

lib/ttfunk/table/cff.rb

Lines changed: 6 additions & 6 deletions
@@ -31,18 +31,18 @@ def tag
         TAG
       end

-      def encode(new_to_old, old_to_new)
+      def encode(subset)
         EncodedString.new do |result|
-          sub_tables = [
+          result.concat(
             header.encode,
             name_index.encode,
-            top_index.encode(&:encode),
+            top_index.encode,
             string_index.encode,
             global_subr_index.encode
-          ]
+          )

-          sub_tables.each { |tb| result << tb }
-          top_index[0].finalize(result, new_to_old, old_to_new)
+          charmap = subset.new_cmap_table[:charmap]
+          top_index[0].finalize(result, charmap)
         end
       end


lib/ttfunk/table/cff/charset.rb

Lines changed: 18 additions & 13 deletions
@@ -35,7 +35,7 @@ def strings_for_charset_id(charset_id)
         end

         attr_reader :entries, :length
-        attr_reader :top_dict, :format, :count, :offset_or_id
+        attr_reader :top_dict, :format, :items_count, :offset_or_id

         def initialize(top_dict, file, offset_or_id = nil, length = nil)
           @top_dict = top_dict
@@ -44,15 +44,15 @@ def initialize(top_dict, file, offset_or_id = nil, length = nil)
           if offset
             super(file, offset, length)
           else
-            @count = self.class.strings_for_charset_id(offset_or_id).size
+            @items_count = self.class.strings_for_charset_id(offset_or_id).size
           end
         end

         def each
           return to_enum(__method__) unless block_given?

           # +1 adjusts for the implicit .notdef glyph
-          (count + 1).times { |i| yield self[i] }
+          (items_count + 1).times { |i| yield self[i] }
         end

         def [](glyph_id)
@@ -73,13 +73,18 @@ def offset
           end
         end

-        # mapping is new -> old glyph ids
-        def encode(mapping)
+        def encode(charmap)
           # no offset means no charset was specified (i.e. we're supposed to
           # use a predefined charset) so there's nothing to encode
           return '' unless offset

-          sids = mapping.keys.sort.map { |new_gid| sid_for(mapping[new_gid]) }
+          sids =
+            charmap
+              .values
+              .reject { |mapping| mapping[:new].zero? }
+              .sort_by { |mapping| mapping[:new] }
+              .map { |mapping| sid_for(mapping[:old]) }
+
           ranges = TTFunk::BinUtils.rangify(sids)
           range_max = ranges.map(&:last).max

@@ -138,7 +143,7 @@ def find_string(sid)

             idx = sid - 390

-            if idx < file.cff.string_index.count
+            if idx < file.cff.string_index.items_count
               file.cff.string_index[idx]
             end
           else
@@ -153,23 +158,23 @@ def parse!

           case format_sym
           when :array_format
-            @count = top_dict.charstrings_index.count - 1
-            @length = count * element_width
+            @items_count = top_dict.charstrings_index.items_count - 1
+            @length = @items_count * element_width
             @entries = OneBasedArray.new(read(length, 'n*'))

           when :range_format8, :range_format16
             # The number of ranges is not explicitly specified in the font.
             # Instead, software utilizing this data simply processes ranges
             # until all glyphs in the font are covered.
-            @count = 0
+            @items_count = 0
             @entries = []
             @length = 0

-            until count >= top_dict.charstrings_index.count - 1
+            until @items_count >= top_dict.charstrings_index.items_count - 1
               @length += 1 + element_width
               sid, num_left = read(element_width, element_format)
-              entries << (sid..(sid + num_left))
-              @count += num_left + 1
+              @entries << (sid..(sid + num_left))
+              @items_count += num_left + 1
             end
           end
         end
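
The new `Charset#encode` builds its SID list from the subset charmap instead of a new-to-old glyph mapping. The sketch below walks through the same chain on made-up data. `sid_for` here is a hypothetical lookup table standing in for the method of the same name, and the run-collapsing step is only an illustration of why consecutive SIDs matter for the range formats; it is not the implementation or the return format of `TTFunk::BinUtils.rangify`.

```ruby
# Hypothetical stand-in for Charset#sid_for: old glyph ID => string ID (SID).
sid_for = { 36 => 34, 37 => 35, 91 => 89 }

# Hypothetical subset charmap: code => { old: glyph ID, new: glyph ID }.
charmap = {
  0x41 => { old: 36, new: 1 },
  0x7A => { old: 91, new: 3 },
  0x42 => { old: 37, new: 2 },
  0x20 => { old: 5,  new: 0 } # mapped to glyph 0, which the charset never lists
}

# Same chain as in the diff: drop glyph 0, order by new glyph ID, look up SIDs.
sids = charmap
  .values
  .reject { |mapping| mapping[:new].zero? }
  .sort_by { |mapping| mapping[:new] }
  .map { |mapping| sid_for.fetch(mapping[:old]) }

# Illustration only: collapse consecutive SIDs into [first, last] runs.
runs = sids
  .slice_when { |a, b| b != a + 1 }
  .map { |run| [run.first, run.last] }

p sids # => [34, 35, 89]
p runs # => [[34, 35], [89, 89]]
```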

lib/ttfunk/table/cff/charstring.rb

Lines changed: 0 additions & 4 deletions
@@ -91,10 +91,6 @@ def render(x: 0, y: 0, font_size: 72)
           )
         end

-        def encode
-          raw
-        end
-
         private

         def parse!

lib/ttfunk/table/cff/charstrings_index.rb

Lines changed: 9 additions & 9 deletions
@@ -11,21 +11,21 @@ def initialize(top_dict, *remaining_args)
           @top_dict = top_dict
         end

-        def [](index)
-          entry_cache[index] ||= TTFunk::Table::Cff::Charstring.new(
+        private
+
+        def decode_item(index, _offset, _length)
+          TTFunk::Table::Cff::Charstring.new(
             index, top_dict, font_dict_for(index), super
           )
         end

-        # gets passed a mapping of new => old glyph ids
-        def encode(mapping)
-          super() do |_entry, index|
-            self[mapping[index]].encode if mapping.include?(index)
-          end
+        def encode_items(charmap)
+          charmap
+            .reject { |code, mapping| mapping[:new].zero? && !code.zero? }
+            .sort_by { |_code, mapping| mapping[:new] }
+            .map { |(_code, mapping)| items[mapping[:old]] }
         end

-        private
-
         def font_dict_for(index)
           # only CID-keyed fonts contain an FD selector and font dicts
           if top_dict.is_cid_font?
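
Note the asymmetric `reject` in `encode_items`: unlike the charset and encoding, the charstrings index has to keep glyph 0. A small sketch of that selection follows, using strings as stand-ins for raw charstring data. It assumes the charmap carries an entry for code 0 that maps to glyph 0, which the `!code.zero?` guard suggests but this diff does not show.

```ruby
# Stand-in for the index's raw items, indexed by old glyph ID.
items = %w[notdef_cs a_cs b_cs c_cs]

# Hypothetical subset charmap; code 0 => glyph 0 is an assumption.
charmap = {
  0    => { old: 0, new: 0 },
  0x43 => { old: 3, new: 2 },
  0x41 => { old: 1, new: 1 },
  0x42 => { old: 2, new: 0 } # falls back to glyph 0, so its charstring is dropped
}

# Same chain as in the diff: drop codes that merely fall back to glyph 0,
# but keep the code 0 entry so the .notdef charstring itself is emitted first.
encoded_items = charmap
  .reject { |code, mapping| mapping[:new].zero? && !code.zero? }
  .sort_by { |_code, mapping| mapping[:new] }
  .map { |(_code, mapping)| items[mapping[:old]] }

p encoded_items # => ["notdef_cs", "a_cs", "c_cs"]
```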

lib/ttfunk/table/cff/encoding.rb

Lines changed: 24 additions & 22 deletions
@@ -22,24 +22,26 @@ def codes_for_encoding_id(encoding_id)
           end
         end

-        attr_reader :top_dict, :format, :count, :offset_or_id
+        attr_reader :top_dict, :format, :items_count, :offset_or_id

         def initialize(top_dict, file, offset_or_id = nil, length = nil)
           @top_dict = top_dict
           @offset_or_id = offset_or_id || DEFAULT_ENCODING_ID

           if offset
             super(file, offset, length)
+            @supplemental = format >> 7 == 1
           else
-            @count = self.class.codes_for_encoding_id(offset_or_id).size
+            @items_count = self.class.codes_for_encoding_id(offset_or_id).size
+            @supplemental = false
           end
         end

         def each
           return to_enum(__method__) unless block_given?

           # +1 adjusts for the implicit .notdef glyph
-          (count + 1).times { |i| yield self[i] }
+          (items_count + 1).times { |i| yield self[i] }
         end

         def [](glyph_id)
@@ -62,16 +64,18 @@ def offset
           end
         end

-        def encode(new_to_old, old_to_new)
-          # no offset means no encoding was specified (i.e. we're supposed to
-          # use a predefined encoding) so there's nothing to encode
-          return '' unless offset
-          return encode_supplemental(new_to_old, old_to_new) if supplemental?
+        def encode(charmap)
+          # Any subset encoding is all but guaranteed to be different from the
+          # standard encoding so we don't even attempt to see if it matches. We
+          # assume it's different and just encode it anew.
+
+          return encode_supplemental(charmap) if supplemental?

           codes =
-            new_to_old.keys.sort.map do |new_gid|
-              code_for(new_to_old[new_gid])
-            end
+            charmap
+              .reject { |_code, mapping| mapping[:new].zero? }
+              .sort_by { |_code, mapping| mapping[:new] }
+              .map { |(code, _m)| code }

           ranges = TTFunk::BinUtils.rangify(codes)

@@ -95,18 +99,16 @@ def encode(new_to_old, old_to_new)

         def supplemental?
           # high-order bit set to 1 indicates supplemental encoding
-          @format >> 7 == 1
+          @supplemental
         end

         private

-        def encode_supplemental(_new_to_old, old_to_new)
+        def encode_supplemental(charmap)
           new_entries =
-            @entries.each_with_object({}) do |(code, old_gid), ret|
-              if (new_gid = old_to_new[old_gid])
-                ret[code] = new_gid
-              end
-            end
+            charmap
+              .reject { |_code, mapping| mapping[:new].zero? }
+              .transform_values { |mapping| mapping[:new] }

           result = [format_int(:supplemental), new_entries.size].pack('CC')
           fmt = element_format(:supplemental)
@@ -150,22 +152,22 @@ def parse!

           case format_sym
           when :array_format
-            @count = entry_count
+            @items_count = entry_count
             @entries = OneBasedArray.new(read(length, 'C*'))

           when :range_format
             @entries = []
-            @count = 0
+            @items_count = 0

             entry_count.times do
               code, num_left = read(element_width, element_format)
               @entries << (code..(code + num_left))
-              @count += num_left + 1
+              @items_count += num_left + 1
             end

           when :supplemental
             @entries = {}
-            @count = entry_count
+            @items_count = entry_count

             entry_count.times do
               code, glyph = read(element_width, element_format)
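
Two small pieces of this file are easy to check in isolation: the high-bit test that is now computed once in the constructor, and the code-to-new-glyph table built for supplemental encodings. The sketch below reproduces both on made-up values; the lambda name and the sample charmap are hypothetical.

```ruby
# The format byte's high-order bit marks a supplemental encoding.
# Same expression as in the diff, applied to a plain integer.
supplemental = ->(format_byte) { format_byte >> 7 == 1 }

p supplemental.call(0x00) # => false
p supplemental.call(0x01) # => false
p supplemental.call(0x80) # => true
p supplemental.call(0x81) # => true

# Hypothetical subset charmap: code => { old: glyph ID, new: glyph ID }.
charmap = {
  0x41 => { old: 36, new: 1 },
  0x42 => { old: 37, new: 0 }, # mapped to glyph 0, not listed
  0x43 => { old: 38, new: 2 }
}

# Same chain as encode_supplemental: code => new glyph ID, glyph 0 excluded.
new_entries = charmap
  .reject { |_code, mapping| mapping[:new].zero? }
  .transform_values { |mapping| mapping[:new] }

p new_entries.size # => 2
```

Caching the flag matters for the branch without an offset, where there is no parsed format byte to shift; there the constructor simply records `false`.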

lib/ttfunk/table/cff/fd_selector.rb

Lines changed: 10 additions & 10 deletions
@@ -12,7 +12,7 @@ class FdSelector < TTFunk::SubTable
         RANGE_ENTRY_SIZE = 3
         ARRAY_ENTRY_SIZE = 1

-        attr_reader :top_dict, :count, :entries, :n_glyphs
+        attr_reader :top_dict, :items_count, :entries, :n_glyphs

         def initialize(top_dict, file, offset, length = nil)
           @top_dict = top_dict
@@ -48,16 +48,16 @@ def [](glyph_id)
         def each
           return to_enum(__method__) unless block_given?

-          count.times { |i| yield self[i] }
+          items_count.times { |i| yield self[i] }
         end

-        # mapping is new -> old glyph ids
-        def encode(mapping)
+        def encode(charmap)
           # get list of [new_gid, fd_index] pairs
           new_indices =
-            mapping.keys.sort.map do |new_gid|
-              [new_gid, self[mapping[new_gid]]]
-            end
+            charmap
+              .reject { |code, mapping| mapping[:new].zero? && !code.zero? }
+              .sort_by { |_code, mapping| mapping[:new] }
+              .map { |(_code, mapping)| [mapping[:new], self[mapping[:old]]] }

           ranges = rangify_gids(new_indices)
           total_range_size = ranges.size * RANGE_ENTRY_SIZE
@@ -108,10 +108,10 @@ def parse!

           case format_sym
           when :array_format
-            @n_glyphs = top_dict.charstrings_index.count
+            @n_glyphs = top_dict.charstrings_index.items_count
             data = io.read(n_glyphs)
             @length += data.bytesize
-            @count = data.bytesize
+            @items_count = data.bytesize
             @entries = data.bytes

           when :range_format
@@ -135,7 +135,7 @@ def parse!
             last_start_gid, last_fd_index = ranges.last
             @entries << [(last_start_gid...(n_glyphs + 1)), last_fd_index]

-            @count = entries.reduce(0) { |sum, entry| sum + entry.first.size }
+            @items_count = entries.reduce(0) { |sum, entry| sum + entry.first.size }
           end
         end

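
`FdSelector#encode` turns the `[new_gid, fd_index]` pairs into ranges before sizing them. `rangify_gids` itself is not part of this diff, so the sketch below is only a guess at the idea: consecutive glyphs that share a font dict index collapse into one `[start_gid, fd_index]` entry. The grouping code and the sample pairs are hypothetical; the 3-byte entry size mirrors `RANGE_ENTRY_SIZE` from the file.

```ruby
# Bytes per range entry, mirroring RANGE_ENTRY_SIZE in fd_selector.rb.
range_entry_size = 3

# Hypothetical [new_gid, fd_index] pairs, already sorted by new glyph ID.
new_indices = [[0, 0], [1, 0], [2, 1], [3, 1], [4, 0]]

# Group neighbouring glyphs with the same font dict index and keep the
# first pair of each group as the range start. Not TTFunk's rangify_gids.
ranges = new_indices
  .chunk_while { |a, b| a[1] == b[1] }
  .map(&:first)

total_range_size = ranges.size * range_entry_size

p ranges           # => [[0, 0], [2, 1], [4, 0]]
p total_range_size # => 9
```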
