Commit a103566

Merge pull request #57 from leonerd/undef-aware-equality
Undef aware equality
2 parents e1b5b07 + 6ade0cc commit a103566

2 files changed

+286
-0
lines changed

ppcs/ppcTODO-metaoperator-flags.md

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
1+
# Meta-operator Flags to Modify Behaviour of Operators
2+
3+
## Preamble
4+
5+
Author: Paul Evans <PEVANS>
6+
Sponsor:
7+
ID: TODO
8+
Status: Exploratory
9+
10+
## Abstract
11+
12+
Defines a new syntax for adding behaviour-modifying flags to operators, in order to extend the available set of behaviours.
13+
14+
## Motivation
15+
16+
This idea came out of discussions around the proposal to add new `equ` and `===` operators, which are aware of the special nature of the `undef` value. The idea is that rather than adding just two new special-purpose operators, a far more flexible and extensible idea may be to allow syntax for putting flags on the existing operators.
17+
18+
The central idea here is that existing operators can be modified by a set of single-letter suffix flags, which can be considered somewhat similar to regexp pattern flags.
19+
20+
## Rationale
21+
22+
(explain why the (following) proposed solution will solve it)
23+
24+
## Specification
25+
26+
Following the name of a regular value-returning infix operator, an optional set of flag letters can be supplied after a separating colon:
27+
28+
```perl
$x eq:u $y
```
31+
32+
In terms of syntax, these are flags that alter the runtime behaviour of the operator, and do not change the way it parses at compile-time. The particular behaviour being changed would depend on the operator and the flag applied, but in order to make it easy to learn and use, some consistency should be applied to the choice of flags.
33+
34+
The `:u` flag alters the behaviour of a scalar-comparing operator (`eq`, `ne`, `==`, `!=`) in the same way described in the `equ` proposal; namely, that these operators now consider that `undef` is equal to another `undef` but unequal to any defined value, even the empty string or number zero.
35+
36+
```perl
$x eq:u $y

# Equivalent to  (!defined $x and !defined $y) or
#                (defined $x and defined $y and $x eq $y)

$x ==:u $y

# Equivalent to  (!defined $x and !defined $y) or
#                (defined $x and defined $y and $x == $y)
```
47+
48+
The `ne` and `!=` operators can likewise be modified, producing the negation of the corresponding `eq:u` and `==:u` results.
49+
50+
```perl
$x ne:u $y

# Equivalent to  not( $x eq:u $y )
#   i.e. (defined $x xor defined $y) or
#        (defined $x and defined $y and $x ne $y)

$x !=:u $y

# Equivalent to  not( $x ==:u $y )
#   i.e. (defined $x xor defined $y) or
#        (defined $x and defined $y and $x != $y)
```
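Because the flag syntax cannot be run on any released Perl, the equivalences above can be checked with ordinary subroutines instead. The following is a minimal sketch only; the names `eq_u`, `ne_u`, `numeq_u` and `numne_u` are hypothetical stand-ins for `eq:u`, `ne:u`, `==:u` and `!=:u`, and are not part of this proposal.

```perl
use strict;
use warnings;

# Hypothetical stand-in for `$x eq:u $y`
sub eq_u {
    my ($x, $y) = @_;
    return 1 if !defined $x and !defined $y;   # undef equals undef
    return 0 if !defined $x or  !defined $y;   # undef never equals a defined value
    return $x eq $y ? 1 : 0;                   # both defined: plain string equality
}

# Hypothetical stand-in for `$x ==:u $y`
sub numeq_u {
    my ($x, $y) = @_;
    return 1 if !defined $x and !defined $y;
    return 0 if !defined $x or  !defined $y;
    return $x == $y ? 1 : 0;                   # both defined: plain numeric equality
}

# The negated forms are simply the logical inverse of the ones above
sub ne_u    { return eq_u(@_)    ? 0 : 1 }
sub numne_u { return numeq_u(@_) ? 0 : 1 }

print eq_u(undef, undef) ? "equal\n" : "different\n";   # two undefs compare as equal
print eq_u(undef, "")    ? "equal\n" : "different\n";   # undef differs from the empty string
print numeq_u(undef, 0)  ? "equal\n" : "different\n";   # undef differs from zero
```

None of these calls emit an "uninitialized value" warning, because the definedness checks return before the underlying `eq` or `==` is reached.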
63+
64+
Another flag, `:i`, modifies the string comparison operator making it case-insensitive, acting as if its operands were first passed through the `fc()` function:
65+
66+
```perl
$x eq:i $y

# Equivalent to  fc($x) eq fc($y)
```
71+
72+
There is no equivalent on a numerical comparison; attempting to apply the flag here would result in a compile-time error.
73+
74+
Finally, note that these two flags can be combined, and in that case the order does not matter.
75+
76+
```perl
$x eq:ui $y
$x eq:iu $y

# Equivalent to  (!defined $x and !defined $y) or
#                (defined $x and defined $y and fc($x) eq fc($y))
```
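The way the two flags combine can be emulated in today's Perl with helper functions. This is a sketch under stated assumptions: `eq_with_flags` and `numeq_with_flags` are hypothetical names, the flags are passed as a plain string, and the rejection of `:i` on a numerical comparison happens here at call time, whereas the proposal specifies a compile-time error.

```perl
use strict;
use warnings;
use feature 'fc';   # fc() is available from Perl 5.16 onwards

# Hypothetical helper emulating `$x eq:FLAGS $y` for the `u` and `i` flags
sub eq_with_flags {
    my ($flags, $x, $y) = @_;

    # Store the flag letters as a set, so that "ui" and "iu" behave identically
    my %flag    = map { $_ => 1 } split //, $flags;
    my @unknown = grep { !/^[ui]$/ } sort keys %flag;
    die "Unknown flag(s) for eq: @unknown\n" if @unknown;

    if ($flag{u}) {
        return 1 if !defined $x and !defined $y;
        return 0 if !defined $x or  !defined $y;
    }

    # Case-fold both operands before comparing when :i is present
    ($x, $y) = (fc($x), fc($y)) if $flag{i};
    return $x eq $y ? 1 : 0;
}

# Hypothetical helper emulating `$x ==:FLAGS $y`; only `u` is meaningful here
sub numeq_with_flags {
    my ($flags, $x, $y) = @_;
    die "The :i flag does not apply to numerical comparison\n" if $flags =~ /i/;
    die "Unknown flag(s) for ==: $flags\n" if $flags =~ /[^u]/;

    if ($flags =~ /u/) {
        return 1 if !defined $x and !defined $y;
        return 0 if !defined $x or  !defined $y;
    }
    return $x == $y ? 1 : 0;
}

print eq_with_flags("ui", "Hello", "HELLO") ? "match\n" : "no match\n";
print eq_with_flags("iu", undef, "")        ? "match\n" : "no match\n";
```

Holding the flags in a hash rather than inspecting them in sequence is what makes the order of the letters irrelevant, mirroring the rule stated above.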
83+
84+
## Backwards Compatibility
85+
86+
As these notations are currently syntax errors in existing versions of Perl, it is not required that they be feature-guarded.
87+
88+
In particular, there is no risk of a colon-prefixed flag being confused with a statement label, because a statement label must appear at the beginning of a statement, whereas an infix operator must appear after its left-hand operand:
89+
90+
```perl
use v5.xx;

eq:u $y;     # A call to the u($y) function, in a statement labeled as 'eq:'

$x eq:u $y;  # An expression using the eq operator modified with :u
```
97+
98+
Similarly, there is no ambiguity with the ternary conditional operator pair `? :`, because the colon for these flags only appears when the right-hand operand to an infix operator is expected:
99+
100+
```perl
use v5.xx;

my $result = $stringy ? $x eq:u $y : $x ==:u $y;
# Parser always knows these  ^------------^ are meta-operator flags
```
106+
107+
Finally, there is no ambiguity with attribute syntax, even when considering proposals that would extend the attribute syntax to more sites, again because of the non-overlap between applicable situations. In fact it could be argued that these flags are a similar and related syntax, appearing to apply some sort of adverb-like "attribute" to an operator. Where full attributes have identifiers as names and can take optional values, these flags are bundled into single-letter names without options.
109+
110+
## Security Implications
111+
112+
No new issues are anticipated.
113+
114+
## Examples
115+
116+
A few examples are given in the specification above.
117+
118+
## Prototype Implementation
119+
120+
It is currently not thought possible to prototype this exact syntax as a CPAN module, because it requires extending the parser in ways that neither the `PL_keyword_plugin` nor `PL_infix_plugin` extension mechanisms can achieve.
121+
122+
If it were considered necessary to prototype these on CPAN in some manner, it would be possible to define some new operator names, such as `eqx` for "extensible `eq`", which would then allow such suffix flag notation on them. While it would then be possible to experiment with possible behaviours of new flag letters, it would remove most of the advantage of the idea in that it uses (and applies to) existing operator names, rather than inventing new ones. It is therefore unlikely to be worth performing such an experiment.
123+
124+
## Future Scope
125+
126+
As this proposal introduces the idea of adding flags named by letters to existing Perl operators, there is clearly much more potential scope to define other modifier letters to these or other operators.
127+
128+
However, care should definitely be taken to limit these new definitions to genuinely useful combinations that could not easily be achieved by other means. With over 30 infix operators and 52 single-letter flags available, the number of possible ideas far exceeds the number of combinations that would actually be useful in real-world situations, and sufficiently motivating to justify adding more things for users to learn and recognise.
129+
130+
In particular, care should be taken that any new flag proposals do not attempt to reuse existing flag letters to have other meanings. For example, if the `:i` flag is added to other operators its meaning should be somehow compatible with the "ignore-case" of `eq`, rather than serve some other unrelated purpose.
131+
132+
## Rejected Ideas
133+
134+
### Using the `/` symbol as a flag separator
135+
136+
It may look similar to the regexp pattern flags and suggest a similar purpose, but it becomes a problem when attaching flags as a suffix to existing numerical operators. While not suggested by this document, consider the hypothetical case of adding a flag to the division operator, such as `z`, to alter how it behaves on division-by-zero. The `:` symbol allows this as `/:z` whereas the `/` symbol leads to the problematic `//z` operator. There are no existing infix operators that use the `:` symbol.
137+
138+
## Open Issues
139+
140+
### Is this use of the `:` symbol still too much?
141+
142+
Considered alone in this proposal, the syntax is unambiguous and makes sense with the existing Perl syntax. In particular, the similarity with attributes is noted. However, when considering such other possible ideas as the `in:OP` hyper-operator, or the `match/case` syntax to replace the problematic `given/when` and smartmatch, there seems to be an explosion in the possible meanings of the colon character:
143+
144+
```perl
if( $str in:eq:u @possible_matches ) { ... }
#          ^--^ these two colons mean conceptually different things
```
148+
149+
```perl
match( $value : eq:u ) {
#             ^---^ these two colons also mean conceptually different things
    case(undef) { say "It was undefined" }
    case("")    { say "It was empty string" }
    default     { say "It was a non-empty string" }
}
```
157+
158+
In each case here the Perl parser (or any other static analysis tooling, such as syntax highlighters) should have no trouble understanding the various meanings. However, it may cause some confusion to human readers as to what the different uses all mean. I have tried to be consistent with the use of whitespace before or after the colon symbol to hint at its various different meanings in the examples above. These are currently purely informational hints to human readers, and not considered significant by the Perl parser.
159+
160+
There aren't many other viable choices of symbol, at least not while remaining both within the character set provided by ASCII, and not being sensitive to the presence of whitespace. This may be the inevitable pressure of trying to add more features to such an operator-rich language as Perl, while attempting to stick to ASCII and whitespace-agnostic syntax. At some point we may have to accept that we cannot add new operator-based syntax without either accepting the use of non-ASCII Unicode operators, or using presence-of-whitespace hints to disambiguate different cases.
161+
162+
## Copyright
163+
164+
Copyright (C) 2024, Paul Evans.
165+
166+
This document and code and documentation within it may be used, redistributed and/or modified under the same terms as Perl itself.

ppcs/ppcTODO-undef-aware-equality.md

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
1+
# `undef`-aware Equality Operators
2+
3+
## Preamble
4+
5+
Author: Paul Evans <PEVANS>
6+
Sponsor:
7+
ID: TODO
8+
Status: Exploratory
9+
10+
## Abstract
11+
12+
Adds new infix comparison operators that are aware of the special nature of the `undef` value, comparing it as distinct from the number zero or the empty string.
13+
14+
## Motivation
15+
16+
Perl has two sets of comparison operators: one set that considers the stringy nature of values, and another that considers numerical values. When comparing against `undef`, the stringy operators treat `undef` as an empty string, and the numerical operators treat it as zero. In both cases, a warning is produced:
17+
18+
```
Use of uninitialized value in string eq at ...
```
21+
22+
Sometimes it is useful to consider that `undef` is in fact a different value, distinct from any defined string or number - even empty or zero. Currently in Perl, to check if a given value (which may be `undef`) is equal to a given string value, the test must make sure to check for definedness:
23+
24+
```perl
if( defined $x and $x eq $y ) { ... }
```
27+
28+
Furthermore, if the value `$y` can also be undefined, it may be desired that in that case it matches if `$x` is also undefined. In that situation, the two cases must be considered individually:
29+
30+
```perl
if( (!defined $x and !defined $y) or
    (defined $x and defined $y and $x eq $y) ) { ... }
```
34+
35+
Countless bugs across countless modules have been caused by failing to check in such a way and using code such as simply `$x eq $y` in these circumstances.
36+
37+
By providing new comparison operators that have different behaviour with undefined values, Perl can provide a more convenient choice of behaviours for authors to use.
38+
39+
## Rationale
40+
41+
(explain why the (following) proposed solution will solve it)
42+
43+
## Specification
44+
45+
A new operator named `equ` is added. It has the same syntax as the regular string equality operator `eq`, but semantics identical to the expression given above:
46+
47+
```perl
$x equ $y

# Equivalent to  (!defined $x and !defined $y) or
#                (defined $x and defined $y and $x eq $y)
```
53+
54+
In particular, given two `undef` values, this operator will yield true. Given one `undef` and one defined value, it yields false. Given two defined values, it will yield the same answer that `eq` would. In no circumstance will it provoke a warning about undefined values.
55+
56+
Likewise, a new operator named `===` provides the numerical counterpart:
57+
58+
```perl
$x === $y

# Equivalent to  (!defined $x and !defined $y) or
#                (defined $x and defined $y and $x == $y)
```
64+
65+
Note that while the `===` operator will not provoke warnings about undefined values, it could still warn about strings that do not look like numbers.
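The distinction between the two kinds of warning can be observed with a plain-function emulation. This is only a sketch: `num_equ` and `warnings_from` are hypothetical helper names, and `num_equ` is a function that follows the equivalence given above rather than the proposed operator itself.

```perl
use strict;
use warnings;

# Hypothetical function form of the proposed `$x === $y`
sub num_equ {
    my ($x, $y) = @_;
    return 1 if !defined $x and !defined $y;
    return 0 if !defined $x or  !defined $y;
    return $x == $y ? 1 : 0;
}

# Run a block of code and return any warnings it raised
sub warnings_from {
    my ($code) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $code->();
    return @warnings;
}

# undef is handled before `==` is reached, so nothing is warned about
my @undef_case  = warnings_from(sub { num_equ(undef, 0) });

# Both operands are defined, so `==` runs and complains about "abc"
my @string_case = warnings_from(sub { num_equ("abc", 0) });

printf "undef vs 0: %d warning(s)\n", scalar @undef_case;
printf "'abc' vs 0: %d warning(s)\n", scalar @string_case;
print @string_case;
```

The warning in the second case is the usual "isn't numeric" diagnostic from `==`, which an undef-aware operator has no reason to suppress.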
66+
67+
## Backwards Compatibility
68+
69+
As these infix operators are currently syntax errors in existing versions of Perl, it is not required that they be feature-guarded.
70+
71+
However, it may still be considered useful to add a feature guard in order to provide some compatibility with existing code which uses the `Syntax::Operator::Equ` CPAN module. Likely this would be named `equ`:
72+
73+
```perl
use feature 'equ';
```
76+
77+
Once the feature becomes stable it is likely this would be included in an appropriate `use VERSION` bundle, so users would not be expected to request it specifically in the long-term.
78+
79+
## Security Implications
80+
81+
No new issues are anticipated.
82+
83+
## Examples
84+
85+
## Prototype Implementation
86+
87+
This operator is already implemented using the pluggable infix operator support of Perl version 5.38, in CPAN module [`Syntax::Operator::Equ`](https://metacpan.org/pod/Syntax::Operator::Equ).
88+
89+
## Future Scope
90+
91+
Considering further the possible idea to provide a `match/case` syntax inspired by [`Syntax::Keyword::Match`](https://metacpan.org/pod/Syntax::Keyword::Match), this operator would be a useful addition in combination with that as well, allowing dispatch on a set of fixed values as well as `undef`:
92+
93+
```perl
match( $x : equ ) {
    case(undef) { say "The value in x is undefined" }
    case("")    { say "The value in x is empty string" }
    case("ABC") { say "The value in x is ABC" }
    ...
}
```
101+
102+
## Rejected Ideas
103+
104+
* It is not possible to unambiguously provide extensions of the ordering operators `cmp` and `<=>` in a similar way, because this requires a decision to be made as to the sorting order of undefined values, compared to any defined string or number. While it may be natural to consider that undef sorts before an empty string, there is no clear choice on where undef would sort compared to (signed) numbers. Would it be before any number, even negative infinity? This would place it far away from zero.
105+
106+
* Likewise, because it is not possible to provide an undef-aware version of `cmp` or `<=>`, the ordering comparison operators of `le`, `lt` and so on are also not provided.
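The ambiguity described in the first point can be made concrete. In the sketch below, the hypothetical `num_cmp_undef` helper forces the caller to state where `undef` sorts relative to numbers; both choices are equally defensible, which is why no single built-in behaviour is proposed.

```perl
use strict;
use warnings;

# Hypothetical comparator factory: the caller must choose where undef sorts,
# which is exactly the decision an undef-aware `<=>` could not make for them.
sub num_cmp_undef {
    my ($undef_first) = @_;
    return sub {
        my ($x, $y) = @_;
        return 0 if !defined $x and !defined $y;
        return $undef_first ? -1 :  1 if !defined $x;
        return $undef_first ?  1 : -1 if !defined $y;
        return $x <=> $y;
    };
}

my @values = (3, undef, -2, 0);

my $first = num_cmp_undef(1);   # undef sorts before every number
my $last  = num_cmp_undef(0);   # undef sorts after every number

my @undef_first = sort { $first->($a, $b) } @values;
my @undef_last  = sort { $last->($a, $b) }  @values;

print join(", ", map { $_ // "undef" } @undef_first), "\n";
print join(", ", map { $_ // "undef" } @undef_last),  "\n";
```

With the first comparator `undef` lands ahead of `-2`, far away from zero; with the second it lands after `3`. Neither placement is obviously the natural one.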
107+
108+
## Open Issues
109+
110+
* Should we also provide negated versions of these operators? While much rarer in practice, it may be useful to provide a "not equ", perhaps spelled `nequ` or `neu`; and likewise `!===` or `!==` for the numerical version. These do not suffer the sorting order problem outlined above for more general comparisons.
111+
112+
* As an entirely alternative proposal, should we instead find ways to apply behaviour-modifying flags to the existing operators? That is, rather than adding new `equ` and `===` operators, could we instead consider some syntax such as `eq:u` and `==:u` as a modifier flag, similar to the flags on regexp patterns, as a way to modify operators? This would be extensible in a more general way to more operators, while also allowing more flexible flags in future, such as a case-ignoring string comparison spelled `eq:i`. This alternative proposal may be the subject of a separate PPC document.
113+
114+
* How to pronounce the name of this new operator? I suggest "ee-koo", avoiding the "you" part of the sound.
115+
116+
## Copyright
117+
118+
Copyright (C) 2024, Paul Evans.
119+
120+
This document and code and documentation within it may be used, redistributed and/or modified under the same terms as Perl itself.
