Commit 51f3fa6

Merge pull request rails#53655 from martinemde/martinemde/byteslice-erb-tokenize
Fix multibyte character tokenization bug in ERB::Util
2 parents c775f62 + 30010bb commit 51f3fa6

3 files changed

+24
-2
lines changed

activesupport/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+*   Fix a bug in `ERB::Util.tokenize` that causes incorrect tokenization when ERB tags are preceded by multibyte characters.
+
+    *Martin Emde*
+
 *   Add `ActiveSupport::Testing::NotificationAssertions` module to help with testing `ActiveSupport::Notifications`.
 
     *Nicholas La Roux*, *Yishu See*, *Sean Doyle*

activesupport/lib/active_support/core_ext/erb/util.rb

Lines changed: 2 additions & 2 deletions
@@ -174,7 +174,7 @@ def self.tokenize(source) # :nodoc:
 
       case source.matched
       when start_re
-        tokens << [:TEXT, source.string[pos, len]] if len > 0
+        tokens << [:TEXT, source.string.byteslice(pos, len)] if len > 0
         tokens << [:OPEN, source.matched]
         if source.scan(/(.*?)(?=#{finish_re}|\z)/m)
           tokens << [:CODE, source.matched] unless source.matched.empty?
@@ -183,7 +183,7 @@ def self.tokenize(source) # :nodoc:
           raise NotImplementedError
         end
       when finish_re
-        tokens << [:CODE, source.string[pos, len]] if len > 0
+        tokens << [:CODE, source.string.byteslice(pos, len)] if len > 0
         tokens << [:CLOSE, source.matched]
       else
         raise NotImplementedError, source.matched
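
The change above swaps `String#[]` for `String#byteslice`. The sketch below shows why that matters, assuming `pos` and `len` come from a `StringScanner`, whose `pos` is a byte offset. The helper `text_before_open_tag` and its simplified tag regex are hypothetical and written only for illustration; this is not the Rails implementation.

```ruby
require "strscan"

# Hypothetical helper (not part of Rails): returns the text that precedes the
# first ERB opening tag, sliced in two different ways for comparison.
def text_before_open_tag(template)
  scanner = StringScanner.new(template)
  start = scanner.pos                       # byte offset (0 for a fresh scanner)
  scanner.scan_until(/<%=?/) or return nil  # nil when there is no opening tag
  # StringScanner#pos counts bytes, so this length is measured in bytes too.
  len = scanner.pos - scanner.matched.bytesize - start

  {
    by_character: template[start, len],       # String#[] reads both values as character counts
    by_byte: template.byteslice(start, len)   # String#byteslice reads both values as byte counts
  }
end

result = text_before_open_tag("こんにちは<%= name %>")
puts result[:by_character] # too long: 15 characters instead of 15 bytes
puts result[:by_byte]      # こんにちは
```

For ASCII-only input the two slices agree, because one character is one byte. With the five-character, fifteen-byte greeting, the character-indexed slice runs past the text and into the tag itself.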

activesupport/test/core_ext/erb_util_test.rb

Lines changed: 18 additions & 0 deletions
@@ -135,6 +135,24 @@ def test_text_end
     ], actual_tokens
   end
 
+  def test_multibyte_characters_start
+    source = "こんにちは<%= name %>"
+    actual_tokens = tokenize source
+    assert_equal [[:TEXT, "こんにちは"],
+      [:OPEN, "<%="],
+      [:CODE, " name "],
+      [:CLOSE, "%>"],
+    ], actual_tokens
+  end
+
+  def test_multibyte_characters_end
+    source = " 'こんにちは' %>"
+    actual_tokens = tokenize source
+    assert_equal [[:CODE, " 'こんにちは' "],
+      [:CLOSE, "%>"],
+    ], actual_tokens
+  end
+
   def tokenize(source)
     ERB::Util.tokenize source
   end
