Commit d4d40c2

check_utf8_print: Use utf8_to_uv_flags.
This replaces the old-style utf8n_to_uvchr()
1 parent c2ae512 commit d4d40c2

1 file changed (+15, -9)

utf8.c

Lines changed: 15 additions & 9 deletions
@@ -1246,12 +1246,14 @@ C<utf8n_to_uvchr> is more like an extension of C<utf8_to_uvchr_buf>, but
 with fewer quirks, and a different method of specifying the bytes in C<s> it is
 allowed to examine. It has a C<curlen> parameter instead of an C<e> parameter,
 so the furthest byte in C<s> it can look at is S<C<s + curlen - 1>>. Its
-return value is, like C<utf8_to_uvchr_buf>, ambiguous with respect to the NUL
-and REPLACEMENT characters, but the value of C<*retlen> can be relied on
-(except with the C<UTF8_CHECK_ONLY> flag described below) to know where the
-next possible character along C<s> starts, removing that quirk. Hence, you
-always should use C<*retlen> to determine where the next character in C<s>
-starts.
+failure return value is not dependent on if warnings are enabled or not. It is
+always 0 upon failure. But like C<utf8_to_uvchr_buf>, 0 could also be the
+return for a successful translation of an input C<NUL> character. Use the same
+method given above for disambiguating this. Unlike C<utf8_to_uvchr_buf>,
+C<*retlen> can be relied on (except with the C<UTF8_CHECK_ONLY> flag described
+below) to know where the next possible character along C<s> starts, removing
+that quirk. Hence, you always should use C<*retlen> to determine where the
+next character in C<s> starts.
 
 These functions have an additional parameter, C<flags>, besides the ones in
 C<utf8_to_uv> and C<utf8_to_uvchr_buf>, which can be used to broaden or
@@ -2080,7 +2082,7 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,
             switch (this_problem) {
               default:
                 Perl_croak(aTHX_ "panic: Unexpected case value in "
-                                 " utf8n_to_uvchr_msgs() %" U32uf,
+                                 " utf8_to_uv_msgs() %" U32uf,
                                  this_problem);
                 /* NOTREACHED */
                 break;
@@ -4687,8 +4689,10 @@ Perl_check_utf8_print(pTHX_ const U8* s, const STRLEN len)
             if ( ckWARN_d(WARN_NON_UNICODE)
                 || UNLIKELY(does_utf8_overflow(s, s + len) >= ALMOST_CERTAINLY_OVERFLOWS))
             {
+                UV dummy;
+
                 /* A side effect of this function will be to warn */
-                (void) utf8n_to_uvchr(s, e - s, NULL, UTF8_WARN_SUPER);
+                (void) utf8_to_uv_flags(s, e, &dummy, NULL, UTF8_WARN_SUPER);
                 ok = FALSE;
             }
         }
@@ -4707,8 +4711,10 @@ Perl_check_utf8_print(pTHX_ const U8* s, const STRLEN len)
             else if ( UNLIKELY(UTF8_IS_NONCHAR(s, e))
                      && (ckWARN_d(WARN_NONCHAR)))
             {
+                UV dummy;
+
                 /* A side effect of this function will be to warn */
-                (void) utf8n_to_uvchr(s, e - s, NULL, UTF8_WARN_NONCHAR);
+                (void) utf8_to_uv_flags(s, e, &dummy, NULL, UTF8_WARN_NONCHAR);
                 ok = FALSE;
             }
         }