Commit 71a74ad

Preserve encoding on truncate_bytes (rails#51313)
String.new with no arguments returns the empty string with ASCII-8BIT encoding. Then, depending on each grapheme cluster of the string and on the omission string, the resulting string might keep the ASCII-8BIT encoding.

With this change, we preserve the encoding of the original string instead.

Note that String.new accepts an `encoding` keyword argument, like

```
String.new(encoding: Encoding::UTF_8)
```

However, instead of using that, we rely on `force_encoding` to set the original encoding. This is so that String subclasses don't need to preserve this keyword argument. For example, SafeBuffer doesn't.

Thanks to @jeremy for catching this!
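
The encoding behavior described above can be reproduced in plain Ruby, without Active Support. The following is a minimal sketch, not code from this commit: `appended_encoding` is a hypothetical helper that appends pieces to a buffer the same way the truncation loop appends grapheme clusters and the omission string.

```ruby
# Hypothetical helper (not part of Rails): append each piece to the buffer
# and report the encoding the buffer ends up with.
def appended_encoding(buffer, *pieces)
  pieces.each { |piece| buffer << piece }
  buffer.encoding
end

ellipsis = "\u2026" # UTF-8 string holding a non-ASCII character

# String.new with no arguments starts out as binary (ASCII-8BIT).
p String.new.encoding == Encoding::BINARY

# Appending only ASCII characters leaves the buffer binary.
p appended_encoding(String.new, "abc") == Encoding::BINARY

# Appending a non-ASCII UTF-8 string switches the buffer to UTF-8.
p appended_encoding(String.new, "abc", ellipsis) == Encoding::UTF_8

# Forcing the encoding up front makes the result independent of the pieces.
p appended_encoding(String.new.force_encoding(Encoding::UTF_8), "abc") == Encoding::UTF_8
```

Each line is expected to print `true`. This would explain why the result could be UTF-8 with the default omission `"…"` but ASCII-8BIT with an ASCII-only or empty omission.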
1 parent d79da7b commit 71a74ad

2 files changed: +10 -1 lines changed


activesupport/lib/active_support/core_ext/string/filters.rb

Lines changed: 1 addition & 1 deletion
@@ -109,7 +109,7 @@ def truncate_bytes(truncate_to, omission: "…")
     when omission.bytesize == truncate_to
       omission.dup
     else
-      self.class.new.tap do |cut|
+      self.class.new.force_encoding(encoding).tap do |cut|
         cut_at = truncate_to - omission.bytesize

         each_grapheme_cluster do |grapheme|

activesupport/test/core_ext/string_ext_test.rb

Lines changed: 9 additions & 0 deletions
@@ -376,6 +376,15 @@ def test_truncates_bytes_preserves_grapheme_clusters
     assert_equal "", "👩‍❤️‍👩".truncate_bytes(13, omission: nil)
   end

+  def test_truncates_bytes_preserves_encoding
+    original = String.new("a" * 30, encoding: Encoding::UTF_8)
+
+    assert_equal Encoding::UTF_8, original.truncate_bytes(15).encoding
+    assert_equal Encoding::UTF_8, original.truncate_bytes(15, omission: nil).encoding
+    assert_equal Encoding::UTF_8, original.truncate_bytes(15, omission: " ").encoding
+    assert_equal Encoding::UTF_8, original.truncate_bytes(15, omission: "🖖").encoding
+  end
+
   def test_truncate_words
     assert_equal "Hello Big World!", "Hello Big World!".truncate_words(3)
     assert_equal "Hello Big...", "Hello Big World!".truncate_words(2)
