Commit d781d69

Fix prism error messages with multibyte truncation
When a line that contains multibyte characters is going to be displayed in an error message, we need to respect the encoding of the source and truncate only at a character boundary, as opposed to a raw byte boundary. Fixes [Bug #21528]
1 parent 4f4b4e3 commit d781d69

1 file changed (+20 -1 lines changed)


prism_compile.c

Lines changed: 20 additions & 1 deletion
@@ -10627,7 +10627,26 @@ pm_parse_errors_format_line(const pm_parser_t *parser, const pm_newline_list_t *
     // Here we determine if we should truncate the end of the line.
     bool truncate_end = false;
     if ((column_end != 0) && ((end - (start + column_end)) >= PM_ERROR_TRUNCATE)) {
-        end = start + column_end + PM_ERROR_TRUNCATE;
+        const uint8_t *end_candidate = start + column_end + PM_ERROR_TRUNCATE;
+
+        for (const uint8_t *ptr = start; ptr < end_candidate;) {
+            size_t char_width = parser->encoding->char_width(ptr, parser->end - ptr);
+
+            // If we failed to decode a character, then just bail out and
+            // truncate at the fixed width.
+            if (char_width == 0) break;
+
+            // If this next character would go past the end candidate,
+            // then we need to truncate before it.
+            if (ptr + char_width > end_candidate) {
+                end_candidate = ptr;
+                break;
+            }
+
+            ptr += char_width;
+        }
+
+        end = end_candidate;
         truncate_end = true;
     }
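
The boundary walk in this hunk can be tried in isolation. The following is a minimal standalone sketch, not code from the commit: `utf8_char_width` is a hypothetical, simplified UTF-8-only stand-in for `parser->encoding->char_width`, and `truncate_at_boundary` is an illustrative name for the loop above with `start`, the end of the source, and a byte limit passed in explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for parser->encoding->char_width (UTF-8 only).
 * Returns the byte width of the character at b, or 0 if it cannot be
 * decoded within the n available bytes. Simplified: it checks the lead
 * byte and continuation bytes, not overlong forms or surrogates. */
static size_t utf8_char_width(const uint8_t *b, ptrdiff_t n) {
    if (n <= 0) return 0;

    size_t width;
    if (b[0] < 0x80) width = 1;
    else if ((b[0] & 0xE0) == 0xC0) width = 2;
    else if ((b[0] & 0xF0) == 0xE0) width = 3;
    else if ((b[0] & 0xF8) == 0xF0) width = 4;
    else return 0;

    if ((size_t) n < width) return 0;
    for (size_t i = 1; i < width; i++) {
        if ((b[i] & 0xC0) != 0x80) return 0;
    }
    return width;
}

/* Illustrative version of the loop in the hunk: walk characters from
 * start and return the last character boundary at or before
 * start + limit. If a character fails to decode, fall back to the raw
 * byte limit, as the commit does. */
static const uint8_t *truncate_at_boundary(const uint8_t *start, const uint8_t *source_end, size_t limit) {
    assert(limit <= (size_t) (source_end - start));
    const uint8_t *candidate = start + limit;

    for (const uint8_t *ptr = start; ptr < candidate;) {
        size_t width = utf8_char_width(ptr, source_end - ptr);

        /* Undecodable byte: keep the fixed-width cut. */
        if (width == 0) break;

        /* The next character would straddle the cut: stop before it. */
        if (ptr + width > candidate) {
            candidate = ptr;
            break;
        }

        ptr += width;
    }

    return candidate;
}
```

For the seven bytes of `aé€b` (`61 C3 A9 E2 82 AC 62`), a limit of 2 would split `é`, so the sketch returns offset 1. A limit of 4 or 5 would split `€`, so it returns offset 3. Like the loop in the commit, the walk starts at the beginning of the displayed line, so its cost grows with the distance from `start` to the cut, not with the size of the whole source.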
