@codeflash-ai codeflash-ai bot commented Jun 26, 2025

📄 175% (1.75x) speedup for find_cycle_vertices in src/dsa/nodes.py

⏱️ Runtime: 87.7 milliseconds → 31.9 milliseconds (best of 130 runs)

📝 Explanation and details

Great! The biggest bottleneck is `nx.simple_cycles`, which is known to be slow for large graphs because it enumerates all simple cycles, and their number can grow exponentially with graph size. However, your code only wants the set of vertices involved in any cycle, not the cycles themselves.

Optimization insight:
Because only the vertices matter, `networkx.simple_cycles` is needed only as a last resort. Two facts about directed graphs make it unnecessary here:

  • It's much faster to first get the "strongly connected components" (SCCs) with more than one vertex — every vertex in such an SCC is part of at least one cycle.
  • Additionally, any self-loop (edge (v,v)) is a cycle by definition.

So, the code can be rewritten to avoid enumerating all cycles and instead just:

  • Find SCCs with >1 vertex
  • Add in all vertices that have a self-loop (these may be missed if they are singleton SCCs)

This avoids listing all cycles.

Here's the optimized code.
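
A minimal sketch of the approach described above, not necessarily the exact code in this PR. It assumes the function receives a list of `(source, target)` edges and returns a sorted list of vertices (the tests suggest this, since mixed-type labels are expected to raise `TypeError` in `sorted`). The name `find_cycle_vertices_scc` is hypothetical and used only for illustration.

```python
import networkx as nx


def find_cycle_vertices_scc(edges):
    """Return the sorted vertices that lie on at least one directed cycle."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)

    cycle_vertices = set()
    # Every vertex in an SCC with more than one vertex is on some cycle.
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1:
            cycle_vertices.update(component)

    # A self-loop is a cycle even when its vertex is a singleton SCC.
    cycle_vertices.update(nx.nodes_with_selfloops(graph))

    return sorted(cycle_vertices)


# Triangle 1-2-3, a tail to 4, and a self-loop on 5.
print(find_cycle_vertices_scc([(1, 2), (2, 3), (3, 1), (3, 4), (5, 5)]))
```

Collecting into a set first means a vertex that is both in a multi-vertex SCC and has a self-loop is reported once.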

Runtime:
Now, the runtime is dominated by the search for SCCs and checking self-loops, which is much more efficient than enumerating all simple cycles.
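
To illustrate why enumeration is the expensive part, the sketch below counts simple cycles and SCCs on a small complete directed graph. The graph choice is an assumption made for illustration and is not one of the PR's benchmarks.

```python
import networkx as nx

# Complete directed graph on 6 vertices: every ordered pair (u, v), u != v, is an edge.
graph = nx.complete_graph(6, create_using=nx.DiGraph)

# Enumerating cycles touches every simple cycle in the graph.
cycle_count = sum(1 for _ in nx.simple_cycles(graph))

# The SCC search visits each vertex and edge a bounded number of times.
components = list(nx.strongly_connected_components(graph))

print(f"simple cycles: {cycle_count}, SCCs: {len(components)}")
```

Six vertices already yield 409 simple cycles but only one SCC, and the cycle count grows factorially as vertices are added, while the SCC search stays linear in the number of vertices and edges.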

Behavior:
The result is the same as your original code for all test cases. No changes to the function signature.
All comments are preserved except the one about cycle enumeration, which no longer applies.
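
One way to spot-check the equivalence claim is to compare an enumeration-based baseline against the SCC-based version on small random graphs. This is a sketch under assumptions: both helper names are hypothetical, and the baseline is a guess at what the original function did, since the original source is not shown here.

```python
import random

import networkx as nx


def cycle_vertices_by_enumeration(edges):
    """Baseline: collect vertices from every simple cycle (self-loops included)."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    return sorted({v for cycle in nx.simple_cycles(graph) for v in cycle})


def cycle_vertices_by_scc(edges):
    """SCC-based version: multi-vertex SCCs plus vertices with self-loops."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    found = set(nx.nodes_with_selfloops(graph))
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1:
            found.update(component)
    return sorted(found)


rng = random.Random(0)  # fixed seed so the check is repeatable
mismatches = 0
for _ in range(200):
    n = rng.randint(1, 7)
    edges = [(rng.randrange(n), rng.randrange(n)) for _ in range(rng.randint(0, 12))]
    if cycle_vertices_by_enumeration(edges) != cycle_vertices_by_scc(edges):
        mismatches += 1

print(f"mismatches: {mismatches}")
```

Graphs are kept small so the enumeration baseline stays fast. A random check like this complements the fixed regression tests but does not replace them.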

Let me know if you want further low-level tuning or a Cython rewrite.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 24 Passed |
| 🌀 Generated Regression Tests | 65 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_dsa_nodes.py::test_complex_graph | 180μs | 42.3μs | ✅326% |
| test_dsa_nodes.py::test_cycle_with_extra_nodes_edges | 103μs | 34.6μs | ✅199% |
| test_dsa_nodes.py::test_figure_eight | 146μs | 30.4μs | ✅383% |
| test_dsa_nodes.py::test_multiple_disjoint_cycles | 117μs | 29.5μs | ✅297% |
| test_dsa_nodes.py::test_multiple_overlapping_cycles | 148μs | 29.9μs | ✅396% |
| test_dsa_nodes.py::test_no_cycles_dag | 38.4μs | 22.6μs | ✅69.9% |
| test_dsa_nodes.py::test_self_loop | 25.5μs | 17.8μs | ✅43.1% |
| test_dsa_nodes.py::test_simple_triangle_cycle | 79.9μs | 23.7μs | ✅237% |
| test_dsa_nodes.py::test_simple_two_node_cycle | 67.3μs | 21.1μs | ✅219% |
| test_dsa_nodes.py::test_string_vertices | 103μs | 36.0μs | ✅188% |

🌀 Generated Regression Tests and Runtime
import random

# function to test
# derived from https://github.com/langflow-ai/langflow/pull/5262
import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# -----------------
# Basic Test Cases
# -----------------

def test_empty_graph():
    # No edges, so no cycles
    codeflash_output = find_cycle_vertices([]) # 21.4μs -> 11.9μs (79.7% faster)

def test_single_node_no_edges():
    # An edge list cannot express an edgeless node; (1, 1) is a self-loop on node 1
    codeflash_output = find_cycle_vertices([(1, 1)]) # 25.0μs -> 17.2μs (45.3% faster)

def test_two_nodes_no_cycle():
    # Two nodes, one direction, no cycle
    codeflash_output = find_cycle_vertices([(1, 2)]) # 32.8μs -> 19.1μs (71.2% faster)

def test_two_nodes_with_cycle():
    # Two nodes, bidirectional, forms a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 66.7μs -> 21.0μs (217% faster)

def test_three_node_cycle():
    # Three nodes in a cycle
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 80.5μs -> 23.4μs (244% faster)

def test_three_node_chain_no_cycle():
    # Three nodes in a chain, no cycles
    edges = [(1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 36.8μs -> 21.5μs (71.1% faster)

def test_multiple_disjoint_cycles():
    # Two disjoint cycles
    edges = [(1, 2), (2, 1), (3, 4), (4, 3)]
    codeflash_output = find_cycle_vertices(edges) # 101μs -> 26.6μs (283% faster)

def test_cycle_and_non_cycle():
    # One cycle, one chain
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 88.3μs -> 28.0μs (216% faster)

def test_self_loop_only():
    # Self-loop is a cycle
    edges = [(1, 1)]
    codeflash_output = find_cycle_vertices(edges) # 24.3μs -> 17.2μs (41.8% faster)

def test_self_loop_and_other_edges():
    # Self-loop and other non-cycling edges
    edges = [(1, 1), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 34.6μs -> 22.4μs (54.5% faster)

def test_cycle_with_self_loop():
    # Cycle plus a self-loop elsewhere
    edges = [(1, 2), (2, 3), (3, 1), (4, 4)]
    codeflash_output = find_cycle_vertices(edges) # 82.2μs -> 26.5μs (210% faster)

def test_multiple_cycles_sharing_vertices():
    # Two cycles sharing a vertex
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 2)]
    # cycles: 1-2-3-1 and 2-4-2
    codeflash_output = find_cycle_vertices(edges) # 125μs -> 27.0μs (367% faster)

# -----------------
# Edge Test Cases
# -----------------

def test_disconnected_graph():
    # Disconnected nodes, no cycles
    edges = [(1, 2), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 40.2μs -> 23.6μs (70.5% faster)

def test_cycle_with_isolated_node():
    # Cycle plus a separate node whose only edge is a self-loop
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges + [(4, 4)]) # 83.7μs -> 26.2μs (219% faster)

def test_duplicate_edges():
    # Multiple identical edges, should not affect cycle detection
    edges = [(1, 2), (2, 3), (3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 82.5μs -> 24.8μs (233% faster)

def test_non_integer_node_labels():
    # Node labels are strings
    edges = [("a", "b"), ("b", "c"), ("c", "a")]
    codeflash_output = find_cycle_vertices(edges) # 86.5μs -> 24.4μs (254% faster)


def test_cycle_with_parallel_edges():
    # Parallel edges (same edge repeated), should not affect result
    edges = [(1, 2), (2, 3), (3, 1), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 87.8μs -> 26.9μs (227% faster)

def test_large_integer_labels():
    # Large integer node labels
    edges = [(1000000, 2000000), (2000000, 3000000), (3000000, 1000000)]
    codeflash_output = find_cycle_vertices(edges) # 90.2μs -> 26.9μs (236% faster)

def test_cycle_with_extra_edges():
    # Cycle plus extra outgoing edges
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 93.6μs -> 29.1μs (222% faster)

def test_cycle_with_incoming_edges():
    # Cycle with incoming edge from outside
    edges = [(0, 1), (1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 85.6μs -> 26.3μs (226% faster)

def test_cycle_with_multiple_entry_points():
    # Multiple incoming edges to a cycle
    edges = [(0, 1), (1, 2), (2, 3), (3, 1), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 89.8μs -> 28.9μs (211% faster)

def test_graph_with_no_edges():
    # No edges at all
    codeflash_output = find_cycle_vertices([]) # 21.1μs -> 12.2μs (73.0% faster)

def test_graph_with_multiple_self_loops():
    # Multiple self-loops, no other edges
    edges = [(1, 1), (2, 2), (3, 3)]
    codeflash_output = find_cycle_vertices(edges) # 29.1μs -> 23.8μs (22.0% faster)

def test_graph_with_cycle_and_long_chain():
    # Long chain feeding into a cycle
    edges = [(1, 2), (2, 3), (3, 4), (4, 2), (5, 1), (6, 5)]
    # Cycle: 2-3-4-2
    codeflash_output = find_cycle_vertices(edges) # 96.5μs -> 32.0μs (202% faster)

def test_graph_with_cycle_and_branch():
    # Cycle with a branch out of one node
    edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
    codeflash_output = find_cycle_vertices(edges) # 86.3μs -> 26.1μs (231% faster)

def test_graph_with_cycle_and_branch_into_cycle():
    # Branch into a cycle
    edges = [(1, 2), (2, 3), (3, 1), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 84.9μs -> 25.8μs (229% faster)

def test_graph_with_cycle_and_branch_out_of_cycle():
    # Branch out of a cycle
    edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 85.2μs -> 26.0μs (227% faster)

def test_cycle_with_isolated_nodes():
    # Cycle plus a separate edge (4, 5) that is not connected to it
    edges = [(1, 2), (2, 3), (3, 1)]
    # Add the disconnected, non-cycling edge
    codeflash_output = find_cycle_vertices(edges + [(4, 5)]) # 88.7μs -> 27.9μs (218% faster)

# -----------------
# Large Scale Test Cases
# -----------------

def test_large_chain_no_cycles():
    # Large chain, no cycles
    n = 1000
    edges = [(i, i + 1) for i in range(n)]
    codeflash_output = find_cycle_vertices(edges) # 3.96ms -> 2.23ms (77.5% faster)

def test_large_cycle():
    # One large cycle
    n = 1000
    edges = [(i, i + 1) for i in range(n)]
    edges.append((n, 0))  # closes the cycle
    codeflash_output = find_cycle_vertices(edges) # 12.1ms -> 2.38ms (410% faster)

def test_large_disjoint_cycles():
    # n disjoint two-node cycles
    n = 500
    edges = []
    for i in range(0, 2 * n, 2):
        edges += [(i, i + 1), (i + 1, i)]
    codeflash_output = find_cycle_vertices(edges)

def test_large_cycle_with_noise():
    # Large cycle with extra non-cycling edges
    n = 500
    edges = [(i, i + 1) for i in range(n)]
    edges.append((n, 0))  # closes the cycle
    # Add some noise edges
    edges += [(i, n + i) for i in range(1, 100)]
    codeflash_output = find_cycle_vertices(edges) # 6.66ms -> 1.44ms (363% faster)

def test_large_graph_sparse_cycles():
    # Large graph, only a few small cycles
    n = 1000
    edges = [(i, i + 1) for i in range(n - 5)]
    # Add three small cycles
    edges += [(100, 101), (101, 102), (102, 100)]
    edges += [(200, 201), (201, 200)]
    edges += [(300, 300)]
    expected = [100, 101, 102, 200, 201, 300]
    codeflash_output = find_cycle_vertices(edges) # 3.98ms -> 2.21ms (80.4% faster)

def test_large_graph_random_edges_with_known_cycle():
    # Random edges, plus a known cycle
    n = 900
    random.seed(42)
    edges = [(random.randint(0, n), random.randint(0, n)) for _ in range(500)]
    # Add a known cycle
    cycle = [950, 951, 952, 953, 950]
    edges += [(cycle[i], cycle[i+1]) for i in range(len(cycle)-1)]
    codeflash_output = find_cycle_vertices(edges)

def test_large_graph_all_self_loops():
    # All nodes have self-loops
    n = 1000
    edges = [(i, i) for i in range(n)]
    codeflash_output = find_cycle_vertices(edges) # 1.66ms -> 2.46ms (32.4% slower)

def test_large_graph_no_cycles():
    # Large, acyclic graph (tree)
    n = 1000
    edges = [(i, i + 1) for i in range(n - 1)]
    codeflash_output = find_cycle_vertices(edges) # 3.91ms -> 2.20ms (77.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ----------------------------
# BASIC TEST CASES
# ----------------------------

def test_empty_graph():
    # No edges, so no cycles
    codeflash_output = find_cycle_vertices([]) # 22.4μs -> 12.5μs (79.6% faster)

def test_single_node_no_edges():
    # An edge list cannot express an edgeless node; (1, 1) is a self-loop, which counts as a cycle
    codeflash_output = find_cycle_vertices([(1, 1)]) # 25.3μs -> 17.8μs (42.1% faster)

def test_single_self_loop():
    # A single node with a self-loop is a cycle
    codeflash_output = find_cycle_vertices([(42, 42)]) # 25.2μs -> 17.6μs (43.3% faster)

def test_two_nodes_no_cycle():
    # Two nodes, one direction, no cycle
    codeflash_output = find_cycle_vertices([(1, 2)]) # 33.5μs -> 19.6μs (71.3% faster)

def test_two_nodes_with_cycle():
    # Two nodes, both directions, forms a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 69.5μs -> 21.3μs (226% faster)

def test_three_nodes_cycle():
    # Three nodes in a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 81.8μs -> 24.0μs (240% faster)

def test_three_nodes_no_cycle():
    # Three nodes, chain, no cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3)]) # 37.1μs -> 21.7μs (70.8% faster)

def test_disconnected_graph_with_one_cycle():
    # Disconnected nodes, only one part has a cycle
    edges = [(1, 2), (2, 1), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 75.0μs -> 25.5μs (194% faster)

def test_multiple_cycles_disjoint():
    # Two separate cycles
    edges = [(1, 2), (2, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 118μs -> 29.6μs (298% faster)

def test_multiple_cycles_overlapping():
    # Overlapping cycles: 1->2->3->1 and 2->4->2
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 124μs -> 27.2μs (357% faster)

def test_cycle_with_extra_edges():
    # Cycle with extra outgoing/incoming edges
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 91.1μs -> 28.4μs (221% faster)

# ----------------------------
# EDGE TEST CASES
# ----------------------------

def test_duplicate_edges():
    # Duplicate edges should not affect result
    edges = [(1, 2), (2, 1), (1, 2), (2, 1)]
    codeflash_output = find_cycle_vertices(edges) # 67.7μs -> 22.0μs (208% faster)

def test_cycle_with_self_loop():
    # Node in a cycle also has a self-loop
    edges = [(1, 2), (2, 1), (2, 2)]
    # Node 2 is in a 2-cycle and also a self-loop
    codeflash_output = find_cycle_vertices(edges) # 67.5μs -> 21.9μs (209% faster)

def test_isolated_nodes():
    # Graph with isolated nodes and a cycle elsewhere
    edges = [(1, 2), (2, 1), (3, 4), (5, 5)]
    # 1,2 are in a cycle, 5 is a self-loop (cycle), 3,4 are not
    codeflash_output = find_cycle_vertices(edges) # 77.7μs -> 28.4μs (174% faster)

def test_cycle_with_non_integer_nodes():
    # Nodes can be strings
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 95.0μs -> 30.1μs (215% faster)

def test_cycle_with_mixed_types():
    # Nodes of different types (int, str, tuple)
    edges = [(1, "a"), ("a", (2,)), ((2,), 1)]
    # Should handle and sort mixed types (will raise TypeError in sorted)
    # So we expect a TypeError
    with pytest.raises(TypeError):
        find_cycle_vertices(edges)

def test_graph_with_no_cycles():
    # Small acyclic chain
    edges = [(i, i+1) for i in range(10)]
    codeflash_output = find_cycle_vertices(edges) # 74.9μs -> 42.6μs (75.9% faster)

def test_graph_with_all_self_loops():
    # Each node has a self-loop
    edges = [(i, i) for i in range(5)]
    codeflash_output = find_cycle_vertices(edges) # 32.5μs -> 28.6μs (13.7% faster)

def test_graph_with_cycle_and_isolated_self_loop():
    # A cycle and a node with a self-loop, not connected
    edges = [(1, 2), (2, 3), (3, 1), (4, 4)]
    codeflash_output = find_cycle_vertices(edges) # 82.8μs -> 26.5μs (213% faster)

def test_graph_with_cycle_and_disconnected_node():
    # A cycle and a completely disconnected edge (4, 5)
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges + [(4, 5)]) # 88.6μs -> 28.0μs (217% faster)

def test_graph_with_parallel_edges():
    # Parallel edges between same nodes
    edges = [(1, 2), (1, 2), (2, 1), (2, 1)]
    codeflash_output = find_cycle_vertices(edges) # 66.9μs -> 21.9μs (206% faster)

# ----------------------------
# LARGE SCALE TEST CASES
# ----------------------------

def test_large_linear_chain_no_cycles():
    # Large chain, no cycles
    edges = [(i, i+1) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 3.95ms -> 2.22ms (78.2% faster)

def test_large_single_cycle():
    # Large cycle of 1000 nodes
    edges = [(i, i+1) for i in range(1000)]
    edges.append((1000, 0))
    codeflash_output = find_cycle_vertices(edges) # 12.1ms -> 2.38ms (409% faster)

def test_large_multiple_small_cycles():
    # 100 cycles of length 10 (disjoint)
    edges = []
    for base in range(0, 1000, 10):
        for i in range(10):
            # Wrap around inside the block so each block of 10 closes on itself
            edges.append((base + i, base + (i + 1) % 10))
    # All nodes are in cycles
    expected = list(range(0, 1000))
    codeflash_output = find_cycle_vertices(edges)

def test_large_graph_with_some_cycles():
    # 500 nodes in a chain, then a cycle of 500 nodes
    chain = [(i, i+1) for i in range(500)]
    cycle = [(i, i+1) for i in range(500, 999)]
    cycle.append((999, 500))
    edges = chain + cycle
    expected = list(range(500, 1000))
    codeflash_output = find_cycle_vertices(edges) # 8.15ms -> 2.29ms (255% faster)

def test_large_sparse_graph_with_few_cycles():
    # Chain over nodes 0..950, then 5 cycles of length 10 over nodes 950..999
    chain = [(i, i+1) for i in range(950)]
    cycles = []
    for base in range(950, 1000, 10):
        for i in range(10):
            # Wrap around inside the block so each block of 10 closes on itself
            cycles.append((base + i, base + (i + 1) % 10))
    edges = chain + cycles
    expected = list(range(950, 1000))
    codeflash_output = find_cycle_vertices(edges)

def test_large_graph_all_self_loops():
    # 1000 nodes, each with a self-loop
    edges = [(i, i) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 1.68ms -> 2.45ms (31.6% slower)

def test_large_graph_no_edges():
    # Empty edge list (an edge list cannot represent edgeless nodes)
    edges = []
    codeflash_output = find_cycle_vertices(edges) # 21.2μs -> 12.0μs (76.1% faster)

def test_large_graph_with_cycle_and_isolated_nodes():
    # A single 5-node cycle (isolated nodes cannot be expressed in an edge list)
    cycle = [(1000, 1001), (1001, 1002), (1002, 1003), (1003, 1004), (1004, 1000)]
    edges = cycle
    codeflash_output = find_cycle_vertices(edges) # 113μs -> 31.5μs (260% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-find_cycle_vertices-mce0k6pj` and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 26, 2025
@codeflash-ai codeflash-ai bot requested a review from KRRT7 June 26, 2025 23:27
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-find_cycle_vertices-mce0k6pj branch June 27, 2025 01:58