Commit b9a8790

utf8_to_uv_msgs: Clarify warning message
Previously, this output an error message that displayed two characters while calling them a single malformed character, and croak.t tested for that. That message is misleading, so the display is now limited to the bytes needed to determine that the sequence is overlong.
1 parent 2001bed commit b9a8790

2 files changed: +11 -3 lines changed

t/lib/croak/toke_l1

Lines changed: 1 addition & 1 deletion
@@ -18,5 +18,5 @@ Malformed UTF-8 character (fatal) at - line 2.
 use utf8;y'0�''
 EXPECT
 Malformed UTF-8 character: \xc1\x27 (unexpected non-continuation byte 0x27, immediately after start byte 0xc1; need 2 bytes, got 1) at - line 1.
-Malformed UTF-8 character: \xc1\x27 (any UTF-8 sequence that starts with "\xc1" is overlong which can and should be represented with a different, shorter sequence) at - line 1.
+Malformed UTF-8 character: \xc1 (any UTF-8 sequence that starts with "\xc1" is overlong which can and should be represented with a different, shorter sequence) at - line 1.
 Malformed UTF-8 character (fatal) at - line 1.

utf8.c

Lines changed: 10 additions & 2 deletions
@@ -1978,6 +1978,11 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,
  * end, based on how many bytes the start byte tells
  * us should be in it, but no further than s0 +
  * avail_len
+ * overlong_detect_length if no overlong malformation is present, this is
+ * 0; otherwise it is the number of bytes required to
+ * make that determination. It is used below to limit
+ * the number of bytes displayed in a warning so as to
+ * make the warning accurate and not misleading.
  */
 bool success = true;

@@ -2296,8 +2301,11 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,
 " \"%s\" is overlong which can and should be"
 " represented with a different, shorter sequence)",
 malformed_text,
-byte_dump_string_(s0, send - s0, 0),
-byte_dump_string_(s0, curlen, 0));
+byte_dump_string_(s0, curlen, 0),
+byte_dump_string_(s0,
+                  MIN(avail_len,
+                      overlong_detect_length),
+                  0));
 }
 else {
 U8 tmpbuf[UTF8_MAXBYTES+1];
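
The idea behind the change can be illustrated outside of Perl with a small, self-contained sketch. It is not the code in utf8.c: the helper names below are hypothetical, the overlong rules cover only standard UTF-8 (start bytes 0xC0 and 0xC1 are always overlong, while 0xE0 and 0xF0 need the second byte to decide), and `dump_bytes` is only a stand-in for a byte dump routine. It shows how capping the displayed length at the number of bytes needed to detect the overlong keeps an unrelated following byte such as 0x27 out of the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not Perl's): how many leading bytes are needed to
 * prove that a standard UTF-8 sequence is overlong, or 0 if the available
 * bytes do not show an overlong malformation. */
static size_t overlong_detect_length(const unsigned char *s, size_t avail_len)
{
    if (avail_len == 0) return 0;
    if (s[0] == 0xC0 || s[0] == 0xC1) return 1;  /* start byte alone suffices */
    if (avail_len < 2) return 0;                 /* second byte is required */
    if (s[0] == 0xE0 && s[1] >= 0x80 && s[1] <= 0x9F) return 2;
    if (s[0] == 0xF0 && s[1] >= 0x80 && s[1] <= 0x8F) return 2;
    return 0;
}

/* Hypothetical stand-in for a byte dump: writes "\xNN" for each byte. */
static const char *dump_bytes(const unsigned char *s, size_t len,
                              char *out, size_t out_size)
{
    size_t used = 0;
    assert(out_size > 0);
    out[0] = '\0';
    /* Each byte needs 4 characters plus the terminating NUL. */
    for (size_t i = 0; i < len && used + 5 <= out_size; i++) {
        used += (size_t) snprintf(out + used, out_size - used,
                                  "\\x%02x", (unsigned) s[i]);
    }
    return out;
}

/* Dump only the bytes that establish the overlong malformation, never
 * more than are available. */
static const char *overlong_text(const unsigned char *s, size_t avail_len,
                                 char *out, size_t out_size)
{
    size_t detect = overlong_detect_length(s, avail_len);
    size_t shown = detect < avail_len ? detect : avail_len;
    return dump_bytes(s, shown, out, out_size);
}
```

For the input from the test above, the bytes 0xC1 0x27, this sketch dumps only the first byte, since the start byte alone is enough to know the sequence is overlong.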
