Skip to content

Conversation

@markshannon
Copy link
Member

@markshannon markshannon commented Apr 8, 2024

This PR changes the specialization of binary operators for the current, fairly ad-hoc approach to a bit more principled approach of specializing by refcount and using table lookup to specialize by type.

Tier 1 performance is in the noise, maybe showing a slight slowdown.

I expect the advantages of this approach to show up when applied to BINARY_SUBSCR, COMPARE_OP and with the JIT.

It also opens the possibility of specializing binary operators for numpy arrays and the Decimal class.

unused/1 +
_GUARD_NOS_REFCNT1 +
_GUARD_TOS_IMMORTAL +
_BINARY_OP_TABLE_NN;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please document in the bytecodes.c file what these suffixes mean? Some of them are easy to see, but I don't know what's NN vs ND immediately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@markshannon markshannon marked this pull request as ready for review May 21, 2024 19:54
@rhettinger rhettinger removed their request for review May 21, 2024 21:12
@markshannon
Copy link
Member Author

Performance on both tier 1 and tier 2 is about 1% slower.
With deferred reference counting, there will be no need for specializing on refcount, so I'm deferring this until deferred reference counting is done.

@markshannon
Copy link
Member Author

Earlier results had an issue with the telco benchmark, probably due to the decimal module falling back to the pure Python version.

Benchmarking on tier 1 shows the slowdown to be in the noise.
The significant results are a speedup of 7% for spectral_norm and a slowdown of 12% for nbody.
This is presumably due to more complete specialization for spectral_norm and additional overhead for the already well specialized nbody code.
We expect the advantage of better specialization to be important in tier 2, but the additional tier 1 overhead to be largely irrelevant there.

@markshannon markshannon reopened this May 29, 2024
@brandtbucher brandtbucher self-requested a review June 6, 2024 16:48
@brettcannon brettcannon removed their request for review June 11, 2024 20:34
@markshannon
Copy link
Member Author

Abandoning this as #128722 is handling the same problem and we have a better ideas for reducing refcounting overhead faster-cpython/ideas#700

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants