-
Couldn't load subscription status.
- Fork 126
Use slots in common classes #1434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1434 +/- ##
==========================================
+ Coverage 91.21% 91.22% +0.01%
==========================================
Files 52 52
Lines 7237 7248 +11
Branches 1018 1021 +3
==========================================
+ Hits 6601 6612 +11
Misses 457 457
Partials 179 179 ☔ View full report in Codecov by Sentry. |
|
@gadomski This is ready for review. Some of the changes necessary to get slots working are a bit subtle (happy to explain anything that looks unnecessary), but the surface area of changes is relatively small. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for this ... I ran the benchmark suite and while most stuff was unchanged, there is a (pretty significant) speedup when walking a large-ish catalog:
$ asv compare main feature/use-slots --only-changed
| Change | Before [f23af862] <main> | After [880b6e94] <feature/use-slots> | Ratio | Benchmark (Parameter) |
|----------|----------------------------|----------------------------------------|---------|---------------------------------------------------------------------------------------|
| - | 78.2±0.7μs | 52.2±0.8μs | 0.67 | catalog.WalkCatalogBench.time_walk [new-terrain/virtualenv-py3.12] |
| - | 78.4±0.6μs | 50.5±1μs | 0.64 | catalog.WalkCatalogBench.time_walk [new-terrain/virtualenv-py3.12-orjson] |Couple of questions:
- Do you reckon that this is a breaking change? My understanding of
__slots__is not deep, but the inability to set non-slot attributes at runtime does feel impactful. - Is there a benchmark that we could cook up to test #1207? It'd be nice to give ourselves a target to aim for.
Here are the top offenders in terms of cumulative time for smallish (1000 item) catalog saves: And here are the top offenders with 100x more items: OK, but how do we write a neat little test that isn't completely ad hoc?... I'm not sure. Ad hoc might be the smart play here? So maybe we empirically determine how much we can shave off of this thing with quick/dirty changes on a testing branch and use the results from that plus whatever disgust it elicits in terms of readability and code smell to decide exactly where to aim Sorry if those were both non-answers but in fairness those are two very hard questions |
|
👍🏼 on breaking, I'll put this in the v2.0 milestone and hold off on merging until
Stop it, you're making me blush 😊 |
Related Issue(s):
__slots__for attributes #1229Description:
Because it isn't unexpected that even tens of thousands of
ItemandLinkclasses and a fortioriStacObjectclasses are regularly in memory, there should be some small but appreciable performance improvements from using__slots__rather than__dict__to hold class attributes. The downside of this change is that new attributes cannot be set at runtime that aren't expected in__slots__definitions. In general, the existence ofextra_fieldsonItemclasses makes that a non-issue for the cases we care about.Beyond simply adding
__slots__definitions, this PR also updates the serialization logic to properly use them and adds some__init__logic, where appropriate, to avoid non-initialization of expected__slots__values.#1207 is a related issue but it should be noted that this PR does not fully address the performance issues indicated there, as many duplicated href lookups due to eager mutation of objects remain an issue. IF spilling memory to disk is the cause of the observed exponential increase in runtimes during
.saveis to blame, this PR should provide a bit more breathing room.PR Checklist:
scripts/test)