Skip to content

Conversation

davidhewitt
Copy link
Member

Inspired by pydantic/pydantic#12240

It looks like handle_kwargs step of function argument extraction is currently slow with long argument lists, I think ~ O(n^2). In general it looks like our handling of many kwargs is slower than CPython, which is not good.

This PR attempts to improve the situation by generating optimized lookup-by-index code for each kwarg. First commit is a benchmark, I'll split that and merge separately.

Comment on lines +173 to +175
fn argument_lookup_by_name(name: &str) -> Option<usize> {
PARAMETER_NAMES.iter().position(|&n| n == name)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From godbolt it looks like the compiler is pretty smart about optimizing this, because PARAMETER_NAMES is known at compile time it'll do tricks like check the length of name and then search the individual bytes of name to generate a matching value, rather than repeated string comparisons.

@davidhewitt davidhewitt force-pushed the optimize-kwarg-extraction branch from cd433c1 to f13f825 Compare September 20, 2025 18:16
@davidhewitt
Copy link
Member Author

I'm unsure how I feel about this optimization. On the one hand, it was fun to experiment here and it seems desirable to have performant argument parsing. On the other hand, this is going to be some additional codegen for every #[pyfunction], additional complexity, and arguably having too many arguments to a function is often bad design anyway.

The benchmark doesn't show much diff at the top level because there's a fair bit of CPython machinery going on to perform function calling etc.

In the argument parsing code itself, this is ~30% faster on this benchmark. Which is nice, but also not anything revolutionary.

image

Would welcome others' opinions on going down this road.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant