Skip to content

Performance improvements for the MATCH_CLASS opcode #138912

@cdce8p

Description

@cdce8p

Feature or enhancement

Proposal:

For pylint we recently dropped support for Python 3.9 so I'm currently evaluating if we could replace some if statements with match. As most of the checks are similar to AST matching, the class pattern makes a lot of sense. I've noticed though that these are noticeable slower than the corresponding if statements. Yes, technically match does a few more checks which are just omitted in the if clauses but after looking at _PyEval_MatchClass I believe there is room for additional improvements.

Bare class pattern

First let's compare a simple isinstance call with the equivalent class pattern (without any attributes).

if isinstance(a, A):
    return True
return False

# ---
match a:
    case A():
        return True
return False
Full micro benchmark
from timeit import timeit
setup = """
class A:
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = A(1, 2)

def f1(a):
    if isinstance(a, A):
        return True
    return False

def f2(a):
    match a:
        case A():
            return True
    return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
Before
If      0.29646458296338096
Match   0.6549855829798616

After
If      0.2926184160169214
Match   0.3023994160466827

At least on my machine the match statement is 2x slower. This could be avoided if we return early in case there are no arguments, thus skipping the creation of a list to track the attributes and set for the attribute names.

cpython/Python/ceval.c

Lines 743 to 752 in 2d72493

// So far so good:
PyObject *seen = PySet_New(NULL);
if (seen == NULL) {
return NULL;
}
PyObject *attrs = PyList_New(0);
if (attrs == NULL) {
Py_DECREF(seen);
return NULL;
}

One attribute

One of the most expensive parts of MatchClass is the set to check for duplicate attribute names. If there is only one attribute, this could be skipped entirely. An additional perf improvement can be archived if the attributes are placed into a tuple directly instead of creating an intermediary list and converting it to a tuple before returning it. That works since the match would fail if the argument count doesn't match.

Full micro benchmark
from timeit import timeit
setup = """
class A:
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = A(1, 2)

def f1(a):
    if isinstance(a, A) and a.x == 1:
        return True
    return False

def f2(a):
    match a:
        case A(x=1):
            return True
    return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
Before
If      0.35557704203529283
Match   1.1494590829825029

After
If      0.349508750019595
Match   0.6244700000388548

Only keyword arguments

The check for duplicate attribute names is most helpful for mixed positional and keyword class patterns. Users would need to lookup __match_args__ and make sure an attribute isn't accidentally specified twice. However, given the more than noticeable performance impact, I'd argue it could be skipped if there are only keywords given. The worst that would happen is that the match-case would always fail due to conflicting subpatterns. In that case however it would be fairly easy to debug as all keywords are specified directly. Furthermore linters will be able to detect these based on the AST alone.

Full micro benchmark
from timeit import timeit
setup = """
class A:
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = A(1, 2)

def f1(a):
    if isinstance(a, A) and a.x == 1 and a.y == 2:
        return True
    return False

def f2(a):
    match a:
        case A(x=1, y=2):
            return True
    return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
Before
If      0.4226177910459228
Match   1.4355440840008669

After
If      0.41212291695410386
Match   0.7927287500351667

--
I'll be working on a PR for these.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    interpreter-core(Objects, Python, Grammar, and Parser dirs)performancePerformance or resource usagetype-featureA feature request or enhancement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions