⚡️ Speed up function find_last_node by 9,058%
#263
Closed
📄 9,058% (90.58x) speedup for find_last_node in src/algorithms/graph.py
⏱️ Runtime: 12.3 milliseconds → 134 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 9,058% speedup (from 12.3ms to 134μs) by replacing a quadratic O(N×M) algorithm with a linear O(N+M) one, where N is the number of nodes and M is the number of edges.
Key optimization:
The original code uses a nested loop structure: for each node, it iterates through all edges to check if that node appears as a source. This results in O(N×M) comparisons.
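The original source is not shown in this description, so the following is only a minimal sketch of the nested-loop shape it describes. The function name find_last_node_nested and the "target" key are assumptions for illustration; only the "source" and "id" keys appear in the text.

```python
from typing import Any, Iterable, Mapping, Optional


def find_last_node_nested(
    nodes: Iterable[Mapping[str, Any]],
    edges: Iterable[Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Hypothetical baseline: first node that never appears as an edge source."""
    for node in nodes:
        # Every node triggers a scan over the edges: up to M comparisons
        # per node, so up to N * M comparisons overall.
        if all(edge["source"] != node["id"] for edge in edges):
            return node
    return None


# Illustrative data only; the "target" key is an assumption.
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]
print(find_last_node_nested(nodes, edges))  # {'id': 'c'}
```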
The optimized version builds a set of source IDs from edges in a single pass (sources = {e["source"] for e in edges}), then performs O(1) membership checks (n["id"] not in sources) for each node. This reduces complexity to O(N+M).
Why this is faster:
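As a rough illustration, the set-based pass could look like the sketch below. The function name find_last_node_set, the "target" key, and the chain-shaped sample graph are assumptions, not the PR's actual code or test data. On a chain of N nodes the edges are walked once to build the set, and each node then costs one hash lookup instead of a fresh scan of the edge list.

```python
from typing import Any, Iterable, Mapping, Optional


def find_last_node_set(
    nodes: Iterable[Mapping[str, Any]],
    edges: Iterable[Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Hypothetical optimized form: one pass over edges, one lookup per node."""
    # Single pass over the M edges to collect every source ID.
    sources = {e["source"] for e in edges}
    # At most N average-case O(1) membership checks.
    return next((n for n in nodes if n["id"] not in sources), None)


# Illustrative chain 0 -> 1 -> ... -> 999; only the last node has no outgoing edge.
size = 1000
nodes = [{"id": i} for i in range(size)]
edges = [{"source": i, "target": i + 1} for i in range(size - 1)]
print(find_last_node_set(nodes, edges))  # {'id': 999}
```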
test_large_scale_chain_flow shows a 16,642% speedup (4.43ms → 26.5μs) and test_large_complete_graph_with_sink shows a 10,155% speedup (1.62ms → 15.8μs)
Edge case handling:
The optimization includes safeguards:
Detects when edges is a consumed iterator (iter(edges) is edges) and falls back to the original logic to preserve correctness
Catches TypeError and falls back to the original nested approach
Performance impact:
The optimization is particularly valuable when find_last_node is called repeatedly on non-trivial graphs, as the linear algorithm scales far better than the quadratic baseline.
✅ Correctness verification report:
🌀 Click to see Generated Regression Tests
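The generated tests themselves are not reproduced here. As a hedged sketch of the kind of check that would exercise the safeguards listed under edge case handling, the code below pairs a guarded set-based function with the nested fallback and tests it with a one-shot iterator and an unhashable source value. Both function names and all sample data are hypothetical.

```python
def find_last_node_nested(nodes, edges):
    """Hypothetical baseline: scan the edges for every node."""
    for node in nodes:
        if all(edge["source"] != node["id"] for edge in edges):
            return node
    return None


def find_last_node_guarded(nodes, edges):
    """Hypothetical set-based version with the two fallbacks described above."""
    # An iterator returns itself from iter() and can be walked only once,
    # so hand it to the original logic untouched to keep behavior identical.
    if iter(edges) is edges:
        return find_last_node_nested(nodes, edges)
    try:
        sources = {e["source"] for e in edges}
    except TypeError:
        # An unhashable source value (for example a list) cannot go in a set.
        return find_last_node_nested(nodes, edges)
    return next((n for n in nodes if n["id"] not in sources), None)


nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a"}, {"source": "b"}]

# A plain list takes the set-based path and agrees with the baseline.
assert find_last_node_guarded(nodes, edges) == find_last_node_nested(nodes, edges)

# A one-shot iterator takes the fallback path.
assert find_last_node_guarded(nodes, iter(edges)) == {"id": "c"}

# Unhashable IDs raise TypeError while building the set, then fall back.
list_nodes = [{"id": ["a"]}, {"id": ["b"]}]
list_edges = [{"source": ["a"]}]
assert find_last_node_guarded(list_nodes, list_edges) == {"id": ["b"]}
```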
To edit these changes, git checkout codeflash/optimize-find_last_node-mkq2j701 and push.