Commit a3b31bf

utf8n_to_uv_msgs: Avoid unnecessary work
The outermost block here is executed when any of three types of problematic Unicode code points is encountered and the caller has requested special handling for at least one of those types. Before this commit, we set a flag and checked later whether the type encountered matched one the caller had specified. This commit moves that check to the point where the flag is set, and sets the flag only if necessary. This may avoid the later work entirely, which has set-up overhead, and it will make future commits simpler.
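The change described above can be illustrated with a minimal, self-contained sketch. It is not the code from utf8.c: the DEMO_* constants, their values, and the functions is_surrogate, is_super, is_nonchar, and classify are hypothetical stand-ins, and neither the isUNICODE_POSSIBLY_PROBLEMATIC pre-check nor Perl extended UTF-8 is modelled. It only shows the idea of recording a problem at classification time solely when the caller's flags ask for that category to be handled.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real flags live in Perl's headers and
 * have different names and values. */
#define DEMO_DISALLOW_SURROGATE 0x01u
#define DEMO_WARN_SURROGATE     0x02u
#define DEMO_DISALLOW_NONCHAR   0x04u
#define DEMO_WARN_NONCHAR       0x08u
#define DEMO_DISALLOW_SUPER     0x10u
#define DEMO_WARN_SUPER         0x20u

#define DEMO_GOT_SURROGATE 0x1u
#define DEMO_GOT_NONCHAR   0x2u
#define DEMO_GOT_SUPER     0x4u

static int is_surrogate(uint32_t cp) { return cp >= 0xD800 && cp <= 0xDFFF; }
static int is_super(uint32_t cp)     { return cp > 0x10FFFF; }
static int is_nonchar(uint32_t cp)
{
    /* U+FDD0..U+FDEF, plus the last two code points of each plane */
    return (cp >= 0xFDD0 && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}

/* Returns the problems that need later handling.  A bit is set only
 * when the caller's flags ask for that category to be disallowed or
 * warned about, so a zero result means no later work is needed. */
static unsigned classify(uint32_t cp, unsigned flags)
{
    unsigned possible_problems = 0;

    if (is_surrogate(cp)) {
        if (flags & (DEMO_DISALLOW_SURROGATE | DEMO_WARN_SURROGATE))
            possible_problems |= DEMO_GOT_SURROGATE;
    }
    else if (is_super(cp)) {
        if (flags & (DEMO_DISALLOW_SUPER | DEMO_WARN_SUPER))
            possible_problems |= DEMO_GOT_SUPER;
    }
    else if (is_nonchar(cp)) {
        if (flags & (DEMO_DISALLOW_NONCHAR | DEMO_WARN_NONCHAR))
            possible_problems |= DEMO_GOT_NONCHAR;
    }
    return possible_problems;
}
```

In this sketch, classify(0xD800, 0) records nothing, so a caller that accepts surrogates never reaches any later handling stage for them.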
1 parent e6954d8 commit a3b31bf

1 file changed: +22 -3 lines changed

utf8.c

Lines changed: 22 additions & 3 deletions
@@ -1858,13 +1858,19 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,
          * loop just above. */
         if (isUNICODE_POSSIBLY_PROBLEMATIC(uv)) {
             if (UNLIKELY(UNICODE_IS_SURROGATE(uv))) {
-                possible_problems |= UTF8_GOT_SURROGATE;
+                if (flags & (UTF8_DISALLOW_SURROGATE|UTF8_WARN_SURROGATE)) {
+                    possible_problems |= UTF8_GOT_SURROGATE;
+                }
             }
             else if (UNLIKELY(UNICODE_IS_SUPER(uv))) {
-                possible_problems |= UTF8_GOT_SUPER;
+                if (flags & (UTF8_DISALLOW_SUPER|UTF8_WARN_SUPER)) {
+                    possible_problems |= UTF8_GOT_SUPER;
+                }
             }
             else if (UNLIKELY(UNICODE_IS_NONCHAR(uv))) {
-                possible_problems |= UTF8_GOT_NONCHAR;
+                if (flags & (UTF8_DISALLOW_NONCHAR|UTF8_WARN_NONCHAR)) {
+                    possible_problems |= UTF8_GOT_NONCHAR;
+                }
             }
         }
     }
@@ -2039,6 +2045,10 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,

         case UTF8_GOT_SURROGATE:

+            /* Code earlier in this function has set things up so we don't
+             * get here unless at least one of the two top-level 'if's in
+             * this case are true */
+
             if (flags & UTF8_WARN_SURROGATE) {
                 *errors |= UTF8_GOT_SURROGATE;

@@ -2068,6 +2078,10 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,

         case UTF8_GOT_NONCHAR:

+            /* Code earlier in this function has set things up so we don't
+             * get here unless at least one of the two top-level 'if's in
+             * this case are true */
+
             if (flags & UTF8_WARN_NONCHAR) {
                 *errors |= UTF8_GOT_NONCHAR;

@@ -2202,6 +2216,11 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,

         case UTF8_GOT_SUPER:

+            /* We get here when the input is for an above Unicode code
+             * point, but it does not use Perl extended UTF-8, and the
+             * caller has indicated that these are to be disallowed and/or
+             * warned about */
+
             if (flags & UTF8_WARN_SUPER) {
                 *errors |= UTF8_GOT_SUPER;
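The comments added to the UTF8_GOT_SURROGATE and UTF8_GOT_NONCHAR cases state an invariant: the case is reached only if at least one of its two top-level 'if's is true. A rough sketch of how a handler might rely on that is shown here. Everything in it is an assumption for illustration: the DEMO_* constants, the handle_surrogate function, its warned out-parameter, and the shape of the disallow branch (the diff shows only the start of the warn branch).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the names or values Perl uses. */
#define DEMO_DISALLOW_SURROGATE 0x01u
#define DEMO_WARN_SURROGATE     0x02u
#define DEMO_GOT_SURROGATE      0x1u

/* Handles an already-recorded surrogate problem.  Returns true if the
 * input should be rejected; *warned reports whether a warning is due. */
static bool handle_surrogate(unsigned flags, unsigned *errors, bool *warned)
{
    bool disallowed = false;

    /* Invariant from the earlier filtering step: the problem was only
     * recorded because the caller set at least one of these flags. */
    assert(flags & (DEMO_DISALLOW_SURROGATE | DEMO_WARN_SURROGATE));

    *warned = false;
    if (flags & DEMO_WARN_SURROGATE) {
        *errors |= DEMO_GOT_SURROGATE;
        *warned = true;   /* real code would build a warning message */
    }
    if (flags & DEMO_DISALLOW_SURROGATE) {
        *errors |= DEMO_GOT_SURROGATE;
        disallowed = true;
    }
    return disallowed;
}
```

Because the invariant guarantees that one branch runs, the handler needs no fall-through path for a problem the caller does not care about.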
