-
Notifications
You must be signed in to change notification settings - Fork 601
Explicitly reject length(NAME) with typemaps other than T_PV #23479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This one caused my a bunch of grief related to #23150 before I figured out what was going on. |
|
It needs test(s) for the new error message. Other than that, I approve. |
Currently, in dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS/Node.pm's Could we write tests for both conditions in this p.r.? |
bdf5a3f to
e214089
Compare
Added that |
|
On Sun, Jul 27, 2025 at 03:31:09AM -0700, James E Keenan wrote:
jkeenan left a comment (Perl/perl5#23479)
Currently, in dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS/Node.pm's `lookup_input_typemap()`, there already exists one error condition at the point where the p.r. proposes to insert a new `die`. That existing `die` itself appears to be unexercised in the test suite.
```
That's because that condition isn't (I think) ever reached. A similar
check+die is done earlier during parsing, and there *is* a test for that
error. I'll add a note to self to remove the redundant check at some
point.
…--
Modern art:
"That's easy, I could have done that!"
"Ah, but you didn't!"
|
e214089 to
e1d0d2e
Compare
Previously, the length operator on typemaps other than T_PV would lead to that length value not being initialized, leading to segfaults and worse. Worse yet, parsexs would silently emit this erroneous code. For now it will at least give a clear error, in the future we should perhaps consider eliminating this limitation altogether.
e1d0d2e to
cd1bf15
Compare
|
Apparently this broke HarfBuzz-Shaper. It has its own typemap working around the uninitialized issue. I guess this is another argument in favor of utf8/byte markers on string arguments. |
@Leont could you please open a new BBC ticket so that we can track this? Thanks. |
|
On Wed, Sep 10, 2025 at 10:04:37AM -0700, Leon Timmermans wrote:
Leont left a comment (Perl/perl5#23479)
Apparently this broke [HarfBuzz-Shaper](sciurius/perl-HarfBuzz-Shaper#10). It has its own [typemap](https://github.com/sciurius/perl-HarfBuzz-Shaper/blob/3b95bc6854fef50d6d9082ca3679022b00bbf998/typemap#L17) working around the uninitialized issue.
I guess this is another argument in favor of utf8/byte markers on string arguments.
The length(s) declaration prior to your T_PV change had two main effects:
1. Declare STRLEN_length_of_s, XSauto_length_of_s,
assign the value of STRLEN_length_of_s to XSauto_length_of_s
(thus bypassing any typcasing issues)
and pass XSauto_length_of_s as an argument to the wrapped function.
Note that STRLEN_length_of_s doesn't get assigned a value.
2. In addition, if the type of s mapped to T_PV, then:
a) override the typemap for s, and use
($type)SvPV($arg, STRLEN_length_of_$var);
instead (so STRLEN_length_of_$var gets assigned to)
b) complain if s has a default value
After your change, s mapping to other than T_PV became an error.
Regardless of whether it should have become an error, I think the code was
wrong: the 'default' test should die any time length(s) is used, not just
when its T_PV.
I think HarfBuzz-Shaper is doing a lot of unwarranted chumminess with the
XS compiler, given that both STRLEN_length_of_$var and
XSauto_length_of_$var are undocumented implementation details. (NB: in my
current perlxs.pod draft, I document XSauto_length_of_$var but not
STRLEN_length_of_$var.)
Perhaps we should refine the code to;
if (has_length) {
unless ($typemap =~ /STRLEN_length_of_\$var/) {
die unless $xstype eq 'T_PV';
$typemap = '($type)SvPV($arg, STRLEN_length_of_$var);'
}
die if $default;
}
Which boils down to to:
either supply the code which initialises STRLEN_length_of_s,
or we'll supply it for you - but we only know how to do that with
T_PV.
|
I'm not following you there.
Yeah that sounds sensible. |
|
On Thu, Sep 11, 2025 at 07:37:05AM -0700, Leon Timmermans wrote:
> Regardless of whether it should have become an error, I think the code
> was wrong: the 'default' test should die any time length(s) is used,
> not just when its T_PV.
I'm not following you there.
Sorry, I meant that even if your 'die unless T_PV' patch was backed out,
the existing code should be modified so that the 'die if $default' line
is executed *even* if not T_PV. So rather than
if (has_length and $xstype eq 'T_PV') {
die "can't have default value\n" if defined $default;
... override typemap ...
}
it should become:
if (has_length) {
die "can't have default value\n" if defined $default;
if ($xstype eq 'T_PV') {
... override typemap ...
}
}
|
|
On Thu, Sep 11, 2025 at 04:24:01AM -0700, David Mitchell wrote:
Which boils down to to:
either supply the code which initialises STRLEN_length_of_s,
or we'll supply it for you - but we only know how to do that with
T_PV.
Thinking further. We shouldn't rely on testing the XS type to be T_PV at
all: after all, the typemap template for T_PV may have been overridden.
Instead we should rely on what the template *looks like*. So the code
should (in principle) look like:
if (template contains STRLEN_length_of_$var) {
declare and initialise $var using template;
(we assume the template will have initialised STRLEN_length_of_$var)
}
else if (template looks like SvPV_nolen($arg)) {
replace it with SvPV($arg, STRLEN_length_of_$var);
declare and initialise $var using template;
}
else {
declare and initialise $var using template;
emit: STRLEN_length_of_$var = SvCUR($arg);
}
Or to put it another way:
In general, we initialise the STRLEN_length_of_s variable using SvCUR();
but if possible, we optimise that by setting it using a modified template
at the same time as getting the value of s (i.e. what the old T_PV test
used to do), or if the template already seems to set STRLEN_length_of_s
itself, don't bother.
|
|
|
On Sat, Sep 13, 2025 at 04:11:18PM -0700, Leon Timmermans wrote:
Leont left a comment (Perl/perl5#23479)
> In general, we initialise the STRLEN_length_of_s variable using SvCUR()
`SvCUR` is only defined if the value in `SvPOK`, so I guess it'd have to be `sv_len`, but otherwise that sounds fine to me.
Ok.
So how about we revert your 'croak on T_PV' change immediately to fix the
CPAN breakage, then I'll add this thread to my list of things to change in
XS for 5.44 ?
|
Are we tracking CPAN breakage from that change? My mind is drawing a blank here. If so, which ticket? Thanks. |
|
On Fri, Sep 12, 2025 at 02:04:31AM -0700, David Mitchell wrote:
Thinking further. We shouldn't rely on testing the XS type to be T_PV at
all: after all, the typemap template for T_PV may have been overridden.
Instead we should rely on what the template *looks like*. So the code
should (in principle) look like:
if (template contains STRLEN_length_of_$var) {
declare and initialise $var using template;
(we assume the template will have initialised STRLEN_length_of_$var)
}
else if (template looks like SvPV_nolen($arg)) {
replace it with SvPV($arg, STRLEN_length_of_$var);
declare and initialise $var using template;
}
else {
declare and initialise $var using template;
emit: STRLEN_length_of_$var = SvCUR($arg);
}
and Leon then pointed out that the SvCUR() should be sv_len().
But thinking further: the string pointer and length must be determined at
the *same* time; otherwise there are a few edge cases - overload springs
to mind - where the string buffer is set by a first call to "" overloading
and the length from a second call, which may not return the same buffer
nor length each time.
So I now propose:
if (template contains STRLEN_length_of_$var) {
declare and initialise $var using template;
(we assume the template will have initialised STRLEN_length_of_$var)
}
else if (template looks like SvPV_nolen($arg)) {
replace it with SvPV($arg, STRLEN_length_of_$var);
declare and initialise $var using template;
}
else {
croak("Can't determine length from template $foo");
}
That new croak is serving the same purpose as the recently reverted
"croak unless T_PV", but provides more flexibility.
|
Previously, the length operator on typemaps other than T_PV would lead to that length value not being initialized, leading to segfaults and worse. Worse yet, ExtUtils::ParseXS would silently emit this erroneous code.
For now it will at least give a clear error, in the future we should perhaps consider eliminating this limitation altogether.