⚡️ Speed up function find_last_node by 3,416%
#262
Closed
📄 3,416% (34.16x) speedup for find_last_node in src/algorithms/graph.py
⏱️ Runtime: 7.61 milliseconds → 217 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 34x speedup by replacing the original O(n×m) nested iteration with an O(n+m) algorithm using a set for constant-time lookups.
Key Optimization
Original approach: For each node, the code iterates through all edges to check if that node is a source. With n nodes and m edges, this results in O(n×m) comparisons.
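Optimized approach: Collect the source IDs of all edges into a set in a single pass over the edges (O(m)), then scan the nodes once and check each node's ID against that set in constant time (O(n)), for O(n+m) overall.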
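A minimal sketch of the two approaches, for illustration only. The PR text does not show the code, so the data shapes here are assumptions: nodes are taken to be dicts with an "id" key and edges dicts with a "source" key, and the function names find_last_node_nested and find_last_node_with_set are hypothetical.

```python
def find_last_node_nested(nodes, edges):
    """Nested scan: for every node, walk all edges. O(n*m) comparisons."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_with_set(nodes, edges):
    """Set-based scan: O(m) to build the set, O(n) to check the nodes."""
    # Assumed shape: every edge is a dict with a hashable "source" value.
    sources = {e["source"] for e in edges}
    # The first node whose ID never appears as an edge source.
    return next((n for n in nodes if n["id"] not in sources), None)


# Hypothetical three-node chain: a -> b -> c
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]

last_nested = find_last_node_nested(nodes, edges)
last_with_set = find_last_node_with_set(nodes, edges)
```

Both functions return the same node for this input; only the amount of work differs as the graph grows.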
This is dramatically faster when dealing with large graphs, as evidenced by the test results:
test_large_scale_single_sink_near_end: 4.63ms → 49.1μs (93x faster)
test_large_complete_graph_ordering: 1.54ms → 45.9μs (33x faster)
test_large_fan_in_structure: 453μs → 15.9μs (28x faster)
Edge Cases Handled
The optimization preserves correctness for several edge cases:
Unhashable IDs: When node/edge IDs are unhashable (e.g., lists), the set creation fails. The code catches TypeError and falls back to the original algorithm.
Single-use iterators: If edges is an iterator that can only be consumed once (where iter(edges) is edges), the code uses a manual iteration approach that respects the one-time consumption pattern.
Non-iterable edges: Catches TypeError when edges cannot be iterated and falls back gracefully.
When This Matters
The optimization provides the most benefit when:
For small graphs (< 10 nodes/edges), the overhead of set creation could in principle offset the benefit, but the test results show modest gains even in these cases.
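The unhashable-ID fallback described under Edge Cases Handled might look roughly like the sketch below. This is not the PR's actual code: the function name find_last_node_safe is hypothetical, nodes and edges are assumed to be dicts with "id" and "source" keys, and both inputs are assumed to be re-iterable sequences such as lists, so the single-use iterator and non-iterable cases are not covered here.

```python
def find_last_node_safe(nodes, edges):
    """Try the set-based fast path; fall back to the nested scan on TypeError."""
    try:
        # Raises TypeError if any source ID is unhashable (e.g. a list).
        sources = {e["source"] for e in edges}
        # Also raises TypeError if a node ID is unhashable.
        return next((n for n in nodes if n["id"] not in sources), None)
    except TypeError:
        # Slow path: equality comparisons only, so hashing is never needed.
        return next(
            (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
            None,
        )


# Hypothetical graph whose IDs are lists, which cannot be put in a set.
list_id_nodes = [{"id": [1]}, {"id": [2]}]
list_id_edges = [{"source": [1], "target": [2]}]
last_unhashable = find_last_node_safe(list_id_nodes, list_id_edges)

# Ordinary hashable IDs take the fast path.
last_hashable = find_last_node_safe(
    [{"id": "a"}, {"id": "b"}], [{"source": "a", "target": "b"}]
)
```

Catching TypeError around the whole fast path keeps the sketch short, at the cost of also routing unrelated TypeErrors to the slow path, where they would be raised again.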
✅ Correctness verification report:
To edit these changes, git checkout codeflash/optimize-find_last_node-mkp4orla and push.