Experimental optimisations, proposed by ChatGPT. #337
Conversation
🦙 MegaLinter status: ✅ SUCCESS. See the detailed report in the MegaLinter reports.
Codecov Report ❌ Patch coverage is …

```diff
@@            Coverage Diff             @@
##             main     #337      +/-   ##
==========================================
- Coverage   95.78%   95.67%   -0.12%
==========================================
  Files          27       27
  Lines        1638     1618      -20
==========================================
- Hits         1569     1548      -21
- Misses         69       70       +1
```
I've been through this. Most of the optimisations were fine; some were a little odd. I also took the opportunity to add some extra ones of my own. Most notably, I made this class a proper subclass of …
The suggestions all look reasonable to me. This is exactly what I was looking into some time ago when I tried to speed up identification for large DAGs (see related issue #259). The use of generators makes sense here, and removing the NetworkX bottlenecks will also likely speed things up. But without a detailed before/after profiling analysis, we won't know exactly how much this has optimised the identification of large DAGs.
My admittedly limited test runs indicate not much! 🙃 But at least the code is cleaner now.
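One way to get the before/after numbers discussed above would be a `cProfile` run over a synthetic DAG. This is a minimal sketch, not the framework's benchmark setup: `nx.moral_graph` is only a stand-in for the real `CausalDag` identification entry point, and the graph size and seed are arbitrary.

```python
import cProfile
import pstats

import networkx as nx

# Build a reasonably large random DAG by orienting edges from lower- to
# higher-numbered nodes.
random_graph = nx.gnp_random_graph(500, 0.02, directed=True, seed=1)
dag = nx.DiGraph((u, v) for u, v in random_graph.edges() if u < v)

with cProfile.Profile() as profiler:
    nx.moral_graph(dag)  # substitute the identification call under test
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```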
Prompt: "Can you speed this up?", pasting in `CausalDag.enumerate_minimal_adjustment_sets`.
Summary of changes (sketched in the example after this list):

- `moral_graph[t]` instead of `nx.neighbors(...)`: accesses the adjacency dict directly, which is faster in memory.
- `update(...)` instead of a union with a comprehension: avoids building intermediate sets.
- Eliminated intermediate list conversions: avoids unnecessary copies.
- Variable renaming (`pbd_graph`, `ancestor_graph`, etc.): improves clarity and prevents redundant calls.
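A minimal sketch of the first two changes, assuming a NetworkX moral graph; `moral_graph` and `treatments` are illustrative stand-ins, not the actual variables of `CausalDag.enumerate_minimal_adjustment_sets`:

```python
import networkx as nx

moral_graph = nx.Graph([("t", "x"), ("t", "z"), ("z", "y")])
treatments = ["t", "z"]

# Before: nx.neighbors plus a union over per-treatment comprehensions builds
# an intermediate set for every treatment.
before = set().union(*[set(nx.neighbors(moral_graph, t)) for t in treatments])

# After: moral_graph[t] reads the adjacency dict directly, and update()
# grows a single set in place with no intermediates.
after = set()
for t in treatments:
    after.update(moral_graph[t])

assert before == after
```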
Same applied to `list_all_min_sep`. Summary of changes (see the sketch after this list):

- `graph[node]` instead of `nx.neighbors(graph, node)`: faster adjacency access in memory.
- `treatment_component = ...` followed by `break`: avoids unnecessary iteration after the component is found.
- `sample(sorted(...), 1)[0]`: returns a value rather than a set, avoiding later unpacking.
- Removed repeated `set(...)` wrapping: reduces GC pressure and memory allocations.
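An illustrative sketch of the `list_all_min_sep`-style changes; the graph, the treatment node, and `treatment_component` here are made-up examples, not the framework's real data:

```python
import networkx as nx
from random import sample

graph = nx.Graph([("t", "a"), ("a", "b"), ("c", "d")])
treatment = "t"

# graph[node] reads the adjacency view directly instead of going through
# nx.neighbors(graph, node).
neighbours = set(graph[treatment])

# Break as soon as the component containing the treatment is found, rather
# than iterating over every remaining component.
treatment_component = None
for component in nx.connected_components(graph):
    if treatment in component:
        treatment_component = component
        break

# sample(sorted(...), 1)[0] returns the sampled node itself (sorting first,
# since random.sample no longer accepts sets), so no one-element set has to
# be unpacked afterwards.
node = sample(sorted(treatment_component), 1)[0]
print(neighbours, node)
```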
Same applied to `constructive_backdoor_criterion` (see the sketch after this list):

- Avoids repeated `set(self.nodes)`: `self.nodes` is probably already a set-like iterable.
- Combines descendant updates efficiently: reduces the overhead of `set.union` with unpacking.
- Avoids constructing the logger message unless needed: significant savings if the logging level is above INFO.
- Clearer and faster condition check with `&` (set intersection): cleaner than `issubset(difference(...))`.
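A hedged sketch of the condition-check and lazy-logging changes; the node names and variables (`proper_causal_path`, `adjustment_set`) are placeholders, not the framework's real attributes:

```python
import logging

logger = logging.getLogger(__name__)

all_nodes = {"t", "x", "m", "y", "z"}
proper_causal_path = {"m", "y"}  # nodes the adjustment set must avoid
adjustment_set = {"z"}

# Before: build a difference set, then test subset membership.
ok_before = adjustment_set.issubset(all_nodes.difference(proper_causal_path))

# After: an empty intersection expresses the same condition with no
# intermediate set (equivalent whenever adjustment_set <= all_nodes).
ok_after = not (adjustment_set & proper_causal_path)
assert ok_before == ok_after

# Lazy logging: with %-style arguments the message is only formatted when
# INFO is enabled, so nothing is built if the level is WARNING or above.
logger.info("Adjustment set %s satisfies the criterion: %s", adjustment_set, ok_after)
```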