From d184718e05b3ea1e14ba9e2eab0c0dfd820dd514 Mon Sep 17 00:00:00 2001 From: "codeflash-ai[bot]" <148906541+codeflash-ai[bot]@users.noreply.github.com> Date: Tue, 23 Dec 2025 18:00:55 +0000 Subject: [PATCH] Optimize find_last_node The optimized code achieves a **160x speedup** by eliminating a nested loop that created **O(n*m) complexity** where n = number of nodes and m = number of edges. ## Key Optimization **Original approach**: For each node, iterate through ALL edges to check if that node is a source - For each of the n nodes, checks all m edges: `all(e["source"] != n["id"] for e in edges)` - Time complexity: **O(n * m)** **Optimized approach**: Pre-compute all source node IDs once into a set, then do O(1) lookups - Build a set of all source IDs: `sources = {e["source"] for e in edges}` - O(m) - Check each node against the set: `n["id"] not in sources` - O(1) per node - Time complexity: **O(n + m)** ## Why This Is Faster 1. **Set lookup is O(1)** vs iterating through all edges which is O(m) 2. **Single pass through edges** instead of scanning them repeatedly for each node 3. **Hash-based membership testing** (`in` operator on sets) is dramatically faster than list iteration ## Performance Impact by Test Case The optimization shines particularly well with: - **Large graphs with many edges** (linear chains: 33,000% faster, complete graphs: 8,857% faster) - **Graphs where the last node appears late** in the nodes list (forces the original code to check many nodes) - Test cases show consistent 50-100% speedups on small graphs, but **exponential gains** (thousands of percent) on graphs with 500+ nodes/edges Even on tiny graphs (2-3 nodes), the optimization provides 25-100% speedups, demonstrating the overhead of nested iteration even at small scales. --- src/algorithms/graph.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/algorithms/graph.py b/src/algorithms/graph.py index 777ea3b..f23d356 100644 --- a/src/algorithms/graph.py +++ b/src/algorithms/graph.py @@ -47,7 +47,8 @@ def find_shortest_path(self, start: str, end: str) -> list[str]: def find_last_node(nodes, edges): """This function receives a flow and returns the last node.""" - return next((n for n in nodes if all(e["source"] != n["id"] for e in edges)), None) + sources = {e["source"] for e in edges} + return next((n for n in nodes if n["id"] not in sources), None) def find_leaf_nodes(nodes: list[dict], edges: list[dict]) -> list[dict]: