-
-
Notifications
You must be signed in to change notification settings - Fork 33k
Description
Feature or enhancement
Proposal:
For pylint we recently dropped support for Python 3.9 so I'm currently evaluating if we could replace some if statements with match. As most of the checks are similar to AST matching, the class pattern makes a lot of sense. I've noticed though that these are noticeable slower than the corresponding if statements. Yes, technically match does a few more checks which are just omitted in the if clauses but after looking at _PyEval_MatchClass
I believe there is room for additional improvements.
Bare class pattern
First let's compare a simple isinstance
call with the equivalent class pattern (without any attributes).
if isinstance(a, A):
return True
return False
# ---
match a:
case A():
return True
return False
Full micro benchmark
from timeit import timeit
setup = """
class A:
def __init__(self, x, y):
self.x = x
self.y = y
a = A(1, 2)
def f1(a):
if isinstance(a, A):
return True
return False
def f2(a):
match a:
case A():
return True
return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
Before
If 0.29646458296338096
Match 0.6549855829798616
After
If 0.2926184160169214
Match 0.3023994160466827
At least on my machine the match statement is 2x slower. This could be avoided if we return early in case there are no arguments, thus skipping the creation of a list to track the attributes and set for the attribute names.
Lines 743 to 752 in 2d72493
// So far so good: | |
PyObject *seen = PySet_New(NULL); | |
if (seen == NULL) { | |
return NULL; | |
} | |
PyObject *attrs = PyList_New(0); | |
if (attrs == NULL) { | |
Py_DECREF(seen); | |
return NULL; | |
} |
One attribute
One of the most expensive parts of MatchClass
is the set to check for duplicate attribute names. If there is only one attribute, this could be skipped entirely. An additional perf improvement can be archived if the attributes are placed into a tuple directly instead of creating an intermediary list and converting it to a tuple before returning it. That works since the match would fail if the argument count doesn't match.
Full micro benchmark
from timeit import timeit
setup = """
class A:
def __init__(self, x, y):
self.x = x
self.y = y
a = A(1, 2)
def f1(a):
if isinstance(a, A) and a.x == 1:
return True
return False
def f2(a):
match a:
case A(x=1):
return True
return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
Before
If 0.35557704203529283
Match 1.1494590829825029
After
If 0.349508750019595
Match 0.6244700000388548
Only keyword arguments
The check for duplicate attribute names is most helpful for mixed positional and keyword class patterns. Users would need to lookup __match_args__
and make sure an attribute isn't accidentally specified twice. However, given the more than noticeable performance impact, I'd argue it could be skipped if there are only keywords given. The worst that would happen is that the match-case would always fail due to conflicting subpatterns. In that case however it would be fairly easy to debug as all keywords are specified directly. Furthermore linters will be able to detect these based on the AST alone.
Full micro benchmark
from timeit import timeit
setup = """
class A:
def __init__(self, x, y):
self.x = x
self.y = y
a = A(1, 2)
def f1(a):
if isinstance(a, A) and a.x == 1 and a.y == 2:
return True
return False
def f2(a):
match a:
case A(x=1, y=2):
return True
return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
Before
If 0.4226177910459228
Match 1.4355440840008669
After
If 0.41212291695410386
Match 0.7927287500351667
--
I'll be working on a PR for these.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response