Commit 97ac07d
Optimize find_last_node
The optimized code achieves a **6873% speedup** by replacing an O(n×m) nested loop with an O(m+n) set-based lookup, where n is the number of nodes and m is the number of edges.
## Key Optimization
**Original approach:** For each node, the code iterates through ALL edges to check if that node is a source:
```python
all(e["source"] != n["id"] for e in edges)
```
This requires up to n×m comparisons in the worst case.
**Optimized approach:** Build a set of all source IDs once, then use O(1) membership tests:
```python
sources = {e["source"] for e in edges}
# Then: n["id"] not in sources
```
This reduces complexity from O(n×m) to O(m+n).
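As a minimal sketch of the two approaches side by side: the function bodies below are assumed from the snippets above (the full source of `find_last_node` is not shown in this message), and the names `find_last_node_original` and `find_last_node_set` are hypothetical.

```python
def find_last_node_original(nodes, edges):
    """Assumed shape of the original: scan the edges again for every node."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_set(nodes, edges):
    """Set-based variant: one pass over edges, then one lookup per node."""
    sources = {e["source"] for e in edges}  # built once, O(m)
    # Set membership is O(1) on average, so the scan over nodes is O(n).
    return next((n for n in nodes if n["id"] not in sources), None)


# Tiny illustrative graph: a -> b -> c, so "c" is the only non-source node.
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]

print(find_last_node_original(nodes, edges))  # {'id': 'c'}
print(find_last_node_set(nodes, edges))       # {'id': 'c'}
```

Both variants return the first node that never appears as an edge source, or `None` if every node is a source.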
## Performance Impact by Scale
The speedup grows dramatically with input size:
- **Small graphs (3-4 nodes):** 20-65% faster
- **Medium graphs (50-100 nodes):** 800-2,250% faster
- **Large graphs (500+ nodes):** 8,500-13,000% faster
This is because the original code's O(n×m) cost is effectively quadratic when the edge count grows with the node count, so it becomes increasingly expensive on larger graphs.
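To make the growth concrete, the sketch below counts basic operations rather than measuring time. It assumes a simple chain graph (node `i` points to node `i + 1`), and all helper names are hypothetical. Operation counts only illustrate the trend; they are not the benchmark behind the percentages above.

```python
def chain_graph(n):
    """Build a chain 0 -> 1 -> ... -> n-1; the last node has no outgoing edge."""
    nodes = [{"id": i} for i in range(n)]
    edges = [{"source": i, "target": i + 1} for i in range(n - 1)]
    return nodes, edges


def count_nested_comparisons(nodes, edges):
    """Count edge comparisons made by the nested all()/next() approach."""
    comparisons = 0
    for node in nodes:
        is_source = False
        for edge in edges:
            comparisons += 1
            if edge["source"] == node["id"]:
                is_source = True
                break  # all() short-circuits on the first match
        if not is_source:
            break  # next() stops at the first non-source node
    return comparisons


def count_set_operations(nodes, edges):
    """Count set insertions plus membership tests for the set-based approach."""
    sources = {e["source"] for e in edges}
    operations = len(edges)  # one insertion per edge
    for node in nodes:
        operations += 1  # one membership test per node visited
        if node["id"] not in sources:
            break
    return operations


for n in (4, 100, 1000):
    nodes, edges = chain_graph(n)
    print(n, count_nested_comparisons(nodes, edges), count_set_operations(nodes, edges))
```

On the 4-node chain the two counts are close (9 versus 7), while on the 1000-node chain the nested approach needs 500,499 comparisons against 1,999 set operations, which matches the pattern of small gains on tiny graphs and large gains on big ones.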
## Edge Cases Preserved
The optimization maintains original behavior through careful handling:
1. **Empty edges:** When `edges = []`, the set `sources` is empty. The code returns the first node without accessing `n["id"]`, matching the original's lazy evaluation via `all()` on an empty sequence.
2. **Unhashable sources:** A try-except catches `TypeError` if edge sources aren't hashable (rare but possible), falling back to the original logic.
3. **Missing keys:** Both versions raise `KeyError` when nodes lack 'id' keys or edges lack 'source' keys, but only when those keys are actually accessed.
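The three cases above could be handled along the following lines. This is a sketch reconstructed from the description, not the committed code: the name `find_last_node_sketch` and the exact control flow are assumptions.

```python
def find_last_node_sketch(nodes, edges):
    """Hypothetical reconstruction of the edge-case handling described above."""
    try:
        sources = {e["source"] for e in edges}
    except TypeError:
        # Unhashable sources (e.g. lists): fall back to the nested scan.
        return next(
            (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
            None,
        )
    if not sources:
        # No edges: every node qualifies, so return the first one
        # without ever reading n["id"].
        return next(iter(nodes), None)
    return next((n for n in nodes if n["id"] not in sources), None)


# Empty edges: the first node is returned even though it has no "id" key.
print(find_last_node_sketch([{"name": "no id"}], []))  # {'name': 'no id'}

# Unhashable sources: the set build raises TypeError, so the fallback runs.
print(find_last_node_sketch([{"id": [1]}, {"id": [2]}], [{"source": [1]}]))  # {'id': [2]}

# Missing "source" key: KeyError is not caught and propagates to the caller.
try:
    find_last_node_sketch([{"id": 1}], [{"target": 2}])
except KeyError as exc:
    print("KeyError:", exc)
```

One caveat of this sketch: the set is built eagerly, so an edge without a `source` key raises `KeyError` even when `nodes` is empty, a case where the nested version would never touch the edges.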
## When This Optimization Matters Most
Based on test results and typical graph algorithm usage, this optimization is particularly valuable when:
- The function is called repeatedly in a loop or hot path
- Processing graphs with >50 nodes/edges
- Working with data flow diagrams, DAGs, or workflow systems where finding terminal nodes is common

1 parent c2b37f9 commit 97ac07d
1 file changed, +11 -1 lines changed (original line 50 replaced by new lines 50-60)