Commit 8543a7a

Add valid_utf8_to_uv()
This is identical to valid_utf8_to_uvchr(). Both are internal functions for use when you are certain that the UTF-8 string to be translated is well formed, generally because you created it yourself earlier. The only reason for this new synonym is to lessen the cognitive load on programmers, who should be using the "_uv" suffix functions rather than the "_uvchr" suffix ones for these sorts of tasks. With this synonym available, one doesn't have to remember that two spellings exist.
1 parent 738383d commit 8543a7a

6 files changed: +36 -9 lines changed

embed.fnc

Lines changed: 4 additions & 1 deletion
@@ -3906,7 +3906,10 @@ Adp |bool |valid_identifier_pvn \
 |U32 flags
 Adp |bool |valid_identifier_sv \
 |NULLOK SV *sv
-CRTdip |UV |valid_utf8_to_uvchr \
+CRTdip |UV |valid_utf8_to_uv \
+|NN const U8 *s \
+|NULLOK STRLEN *retlen
+CRTdmp |UV |valid_utf8_to_uvchr \
 |NN const U8 *s \
 |NULLOK STRLEN *retlen
 Adp |int |vcmp |NN SV *lhv \

embed.h

Lines changed: 2 additions & 1 deletion
@@ -841,7 +841,8 @@
 # define valid_identifier_pve(a,b,c) Perl_valid_identifier_pve(aTHX_ a,b,c)
 # define valid_identifier_pvn(a,b,c) Perl_valid_identifier_pvn(aTHX_ a,b,c)
 # define valid_identifier_sv(a) Perl_valid_identifier_sv(aTHX_ a)
-# define valid_utf8_to_uvchr Perl_valid_utf8_to_uvchr
+# define valid_utf8_to_uv Perl_valid_utf8_to_uv
+# define Perl_valid_utf8_to_uvchr valid_utf8_to_uvchr
 # define vcmp(a,b) Perl_vcmp(aTHX_ a,b)
 # define vcroak(a,b) Perl_vcroak(aTHX_ a,b)
 # define vdeb(a,b) Perl_vdeb(aTHX_ a,b)

inline.h

Lines changed: 16 additions & 5 deletions
@@ -1306,25 +1306,36 @@ Perl_utf8_to_bytes_overwrite(pTHX_ U8 **s_ptr, STRLEN *lenp)
 }
 
 /*
-=for apidoc valid_utf8_to_uvchr
-Like C<L<perlapi/utf8_to_uv>>, but should only be called when it is
+=for apidoc valid_utf8_to_uv
+=for apidoc_item valid_utf8_to_uvchr
+
+These are synonymous.
+
+These are like C<L<perlapi/utf8_to_uv>>, but should only be called when it is
 known that the next character in the input UTF-8 string C<s> is well-formed
 (I<e.g.>, it passes C<L<perlapi/isUTF8_CHAR>>. Surrogates, non-character code
 points, and non-Unicode code points are allowed.
 
+The only use for these is that they should run slightly faster than
+C<utf8_to_uv> because no error checking is done.
+
+The C<_uv> form is slightly preferred so as to have a consistent spelling with
+the other C<_uv> forms that are definitely preferred over the older and
+problematic C<_uvchr> forms.
+
 =cut
 
 */
 
 PERL_STATIC_INLINE UV
-Perl_valid_utf8_to_uvchr(const U8 *s, STRLEN *retlen)
+Perl_valid_utf8_to_uv(const U8 *s, STRLEN *retlen)
 {
+    PERL_ARGS_ASSERT_VALID_UTF8_TO_UV;
+
     const UV expectlen = UTF8SKIP(s);
     const U8* send = s + expectlen;
     UV uv = *s;
 
-    PERL_ARGS_ASSERT_VALID_UTF8_TO_UVCHR;
-
     if (retlen) {
         *retlen = expectlen;
     }
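
For readers unfamiliar with what "no error checking" means here, the following is a minimal standalone sketch, not the Perl implementation. It assumes standard UTF-8 of one to four bytes on an ASCII platform, whereas Perl's own code also covers its extended UTF-8 and EBCDIC. The names sketch_utf8_skip and sketch_valid_utf8_to_uv are hypothetical, and plain C types stand in for U8, UV and STRLEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UTF8SKIP(): sequence length taken from the
 * lead byte.  Assumes the lead byte is well formed (1 to 4 byte forms). */
static size_t
sketch_utf8_skip(uint8_t lead)
{
    if (lead < 0x80)           return 1;   /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    return 4;                              /* 11110xxx */
}

/* Decode one code point from input the caller already knows is well
 * formed.  Nothing is validated: no bounds check, no continuation-byte
 * check, no overlong or surrogate check. */
static uint32_t
sketch_valid_utf8_to_uv(const uint8_t *s, size_t *retlen)
{
    assert(s != NULL);                     /* only the NN argument is checked */

    const size_t expectlen = sketch_utf8_skip(*s);
    const uint8_t *send = s + expectlen;
    uint32_t uv = *s;

    if (retlen) {
        *retlen = expectlen;
    }

    if (expectlen == 1) {
        return uv;                         /* ASCII: the byte is the code point */
    }

    /* Keep only the payload bits of the lead byte (5, 4 or 3 bits). */
    uv &= 0xFFu >> (expectlen + 1);

    /* Each continuation byte contributes its low 6 bits. */
    for (++s; s < send; ++s) {
        uv = (uv << 6) | (*s & 0x3Fu);
    }

    return uv;
}
```

Because nothing is validated, passing malformed input to a function like this yields a meaningless value and may read past the end of the buffer, which is why such a function is only appropriate for strings the caller produced or has already checked.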

pod/perldelta.pod

Lines changed: 7 additions & 0 deletions
@@ -350,6 +350,13 @@ well.
 
 XXX
 
+=item *
+
+A new function C<valid_utf8_to_uv> has been added. This is synonymous
+with C<valid_utf8_to_uvchr>; its reason for existence is to have
+consistent spelling with the names of the other functions that translate
+from UTF-8, so you don't have to remember a different spelling.
+
 =back
 
 =head1 Selected Bug Fixes

proto.h

Lines changed: 6 additions & 2 deletions
Generated file; diff not rendered by default.

utf8.h

Lines changed: 1 addition & 0 deletions
@@ -191,6 +191,7 @@ For details, see the description for L<perlapi/uv_to_utf8_flags>.
 #define c9strict_utf8_to_uv(s, e, cp_p, advance_p) \
     utf8_to_uv_flags( s, e, cp_p, advance_p, \
                       UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE)
+#define valid_utf8_to_uvchr(s, advance_p) valid_utf8_to_uv(s, advance_p)
 
 #define utf16_to_utf8(p, d, bytelen, newlen) \
     utf16_to_utf8_base(p, d, bytelen, newlen, 0, 1)
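
The synonym above is made with a function-like macro rather than a second function body. A minimal sketch of that pattern, using hypothetical names (sketch_decode, sketch_decode_chr) and a trivial stand-in body in place of the real decoder, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one real implementation.  It only
 * returns the first byte; the decoding logic is not the point here. */
static unsigned long
sketch_decode(const unsigned char *s, size_t *advance_p)
{
    assert(s != NULL);
    if (advance_p) {
        *advance_p = 1;
    }
    return *s;
}

/* The older spelling is kept as a function-like macro that forwards its
 * arguments, so both names compile to the same call. */
#define sketch_decode_chr(s, advance_p)  sketch_decode(s, advance_p)
```

One consequence of this design is that the older name has no address of its own: code that needs a function pointer has to use the real function's name, since a function-like macro is only expanded where it is followed by an argument list.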
