@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 10% (0.10x) speedup for gcd_recursive in src/math/computation.py

⏱️ Runtime: 69.3 microseconds → 63.1 microseconds (best of 206 runs)

📝 Explanation and details

The optimization converts a recursive implementation to an iterative one by replacing the recursive call with a while loop. The key change is eliminating the function call overhead that occurs with each recursive step.

What was changed:

  • Replaced `if b == 0: return a` and `return gcd_recursive(b, a % b)` with `while b != 0: a, b = b, a % b`
  • Moved the final `return a` outside the loop
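
The change listed above can be sketched as a before/after pair. This is a reconstruction from the description only, not the actual contents of `src/math/computation.py`: the names `gcd_recursive_before` and `gcd_iterative_after` are hypothetical, and any validation or sign handling in the real function is not shown.

```python
def gcd_recursive_before(a: int, b: int) -> int:
    # Assumed original shape: one Python-level call per Euclid step.
    if b == 0:
        return a
    return gcd_recursive_before(b, a % b)


def gcd_iterative_after(a: int, b: int) -> int:
    # Same reduction, expressed as a loop inside a single call.
    while b != 0:
        a, b = b, a % b
    return a


# Both versions apply the same (a, b) -> (b, a % b) steps in the same order,
# so they should return identical results for identical inputs.
for pair in [(12, 8), (13, 17), (0, 5), (7, 0), (832040, 514229)]:
    assert gcd_recursive_before(*pair) == gcd_iterative_after(*pair)
```

Because the loop body is the recursive step verbatim, the two functions differ only in how the steps are driven, which is why the output comparison in the generated tests is a meaningful equivalence check.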

Why it's faster:

  • Eliminates function call overhead: Each recursive call creates a new stack frame, which involves parameter passing, stack allocation, and return handling. The iterative version processes all steps in a single function call.
  • Reduces memory pressure: Recursion builds up stack frames that consume memory, while iteration uses constant space with just two variables.
  • Possibly better CPU cache utilization: staying within one function call may keep code and data more cache-friendly, though this effect is not measured separately in the timings below.
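
One way to check the call-overhead argument locally is a small `timeit` comparison. The sketch below is not the Codeflash benchmark harness; the function names, the `best_time` helper, and the repeat counts are all assumptions chosen for illustration, and absolute numbers will vary by machine and Python version.

```python
import timeit


def gcd_rec(a, b):
    # Assumed recursive form: one call frame per reduction step.
    if b == 0:
        return a
    return gcd_rec(b, a % b)


def gcd_iter(a, b):
    # Iterative form: all reduction steps run inside one call frame.
    while b != 0:
        a, b = b, a % b
    return a


def best_time(func, a, b, repeat=5, number=5_000):
    # Best-of-`repeat` wall time, in seconds, for `number` calls.
    # The lambda adds the same small overhead to both measurements.
    timer = timeit.Timer(lambda: func(a, b))
    return min(timer.repeat(repeat=repeat, number=number))


# Consecutive Fibonacci numbers force many reduction steps.
rec_s = best_time(gcd_rec, 832040, 514229)
iter_s = best_time(gcd_iter, 832040, 514229)
print(f"recursive: {rec_s:.4f}s  iterative: {iter_s:.4f}s")
```

Taking the minimum over several repeats mirrors the "best of N runs" convention used in the report and reduces the influence of scheduler noise.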

Performance characteristics from tests:

  • Most effective on cases requiring multiple algorithm steps (e.g., Fibonacci numbers showing 16.2% speedup)
  • Improvements of roughly 5-25% on most test cases, with individual measurements ranging from 19.9% slower to 43.0% faster
  • Best gains on coprime numbers and cases with many reduction steps
  • Minimal impact on trivial cases (zero inputs, identical numbers) since they require fewer iterations
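
The pattern above follows from how many reduction steps each input needs, since every step was one recursive call in the original. The helper below, `euclid_steps`, is a hypothetical name introduced only to count those steps; it assumes the function has the plain `(a, b) -> (b, a % b)` shape described earlier.

```python
def euclid_steps(a: int, b: int) -> int:
    # Count how many times the (a, b) -> (b, a % b) reduction is applied.
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps


for a, b in [(7, 0), (0, 5), (15, 15), (12, 8), (832040, 514229)]:
    print(f"gcd({a}, {b}): {euclid_steps(a, b)} reduction step(s)")
```

Under this assumption, `gcd(7, 0)` needs zero steps, so the recursive version never recurses and there is no call overhead to remove, while the Fibonacci pair `(832040, 514229)` needs 28 steps and benefits on each one.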

The roughly 10% overall speedup (69.3 μs to 63.1 μs) comes from eliminating Python's per-step function call overhead while preserving the exact same mathematical algorithm.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 96 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 🔮 Hypothesis Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from src.math.computation import gcd_recursive

# unit tests

# 1. Basic Test Cases

def test_gcd_basic_coprime():
    # Coprime numbers should return 1
    codeflash_output = gcd_recursive(13, 17) # 917ns -> 833ns (10.1% faster)
    codeflash_output = gcd_recursive(29, 31) # 583ns -> 459ns (27.0% faster)

def test_gcd_basic_common_divisor():
    # Numbers with a common divisor
    codeflash_output = gcd_recursive(12, 8) # 667ns -> 666ns (0.150% faster)
    codeflash_output = gcd_recursive(100, 80) # 375ns -> 333ns (12.6% faster)

def test_gcd_basic_same_number():
    # GCD of a number with itself is the number
    codeflash_output = gcd_recursive(15, 15) # 542ns -> 500ns (8.40% faster)
    codeflash_output = gcd_recursive(0, 0) # 291ns -> 250ns (16.4% faster)

def test_gcd_basic_one_zero():
    # GCD with zero should return the absolute value of the other number
    codeflash_output = gcd_recursive(0, 5) # 666ns -> 583ns (14.2% faster)
    codeflash_output = gcd_recursive(7, 0) # 250ns -> 250ns (0.000% faster)

def test_gcd_basic_one_one():
    # GCD of 1 and any number is 1
    codeflash_output = gcd_recursive(1, 999) # 792ns -> 750ns (5.60% faster)
    codeflash_output = gcd_recursive(999, 1) # 417ns -> 375ns (11.2% faster)

# 2. Edge Test Cases

def test_gcd_negative_numbers():
    # GCD should be positive even if inputs are negative
    codeflash_output = gcd_recursive(-12, 8) # 791ns -> 666ns (18.8% faster)
    codeflash_output = gcd_recursive(12, -8) # 625ns -> 500ns (25.0% faster)
    codeflash_output = gcd_recursive(-12, -8) # 500ns -> 417ns (19.9% faster)

def test_gcd_zero_and_zero():
    # GCD(0, 0) is mathematically undefined, but function returns 0
    codeflash_output = gcd_recursive(0, 0) # 334ns -> 375ns (10.9% slower)

def test_gcd_large_prime_and_one():
    # GCD of a large prime and 1 is 1
    codeflash_output = gcd_recursive(982451653, 1) # 583ns -> 583ns (0.000% faster)

def test_gcd_one_and_zero():
    # GCD(1, 0) and GCD(0, 1) should be 1
    codeflash_output = gcd_recursive(1, 0) # 333ns -> 334ns (0.299% slower)
    codeflash_output = gcd_recursive(0, 1) # 542ns -> 500ns (8.40% faster)

def test_gcd_negative_and_zero():
    # GCD with negative and zero
    codeflash_output = gcd_recursive(-7, 0) # 333ns -> 375ns (11.2% slower)
    codeflash_output = gcd_recursive(0, -7) # 625ns -> 500ns (25.0% faster)

def test_gcd_large_negative_numbers():
    # Large negative numbers
    codeflash_output = gcd_recursive(-1000000, -500000) # 708ns -> 583ns (21.4% faster)

def test_gcd_min_int_values():
    # Edge case for minimum integer values
    min_int = -2**31
    codeflash_output = gcd_recursive(min_int, min_int) # 1.21μs -> 1.25μs (3.28% slower)
    codeflash_output = gcd_recursive(min_int, 0) # 291ns -> 291ns (0.000% faster)
    codeflash_output = gcd_recursive(0, min_int) # 459ns -> 375ns (22.4% faster)

def test_gcd_max_int_values():
    # Edge case for maximum integer values
    max_int = 2**31 - 1
    codeflash_output = gcd_recursive(max_int, max_int) # 917ns -> 834ns (9.95% faster)
    codeflash_output = gcd_recursive(max_int, 0) # 250ns -> 250ns (0.000% faster)
    codeflash_output = gcd_recursive(0, max_int) # 417ns -> 292ns (42.8% faster)

def test_gcd_order_invariance():
    # GCD should be invariant to argument order
    codeflash_output = gcd_recursive(48, 18) # 875ns -> 833ns (5.04% faster)
    codeflash_output = gcd_recursive(-48, 18) # 541ns -> 416ns (30.0% faster)

def test_gcd_with_zero_and_negative():
    # GCD with zero and negative
    codeflash_output = gcd_recursive(0, -10) # 667ns -> 583ns (14.4% faster)
    codeflash_output = gcd_recursive(-10, 0) # 250ns -> 250ns (0.000% faster)

# 3. Large Scale Test Cases

def test_gcd_large_numbers():
    # Very large numbers
    a = 12345678901234567890
    b = 98765432109876543210
    # Their GCD is 900000000090
    codeflash_output = gcd_recursive(a, b) # 1.58μs -> 1.46μs (8.50% faster)

def test_gcd_large_coprime():
    # Large coprime numbers
    a = 1000000007  # prime
    b = 1000000009  # prime
    codeflash_output = gcd_recursive(a, b) # 1.12μs -> 1.00μs (12.5% faster)

def test_gcd_large_power_of_two():
    # GCD of large powers of two
    a = 2**100
    b = 2**80
    codeflash_output = gcd_recursive(a, b) # 958ns -> 916ns (4.59% faster)

def test_gcd_large_power_of_two_and_one():
    # GCD of large power of two and odd number
    a = 2**500
    b = 3
    codeflash_output = gcd_recursive(a, b) # 917ns -> 875ns (4.80% faster)

def test_gcd_large_numbers_with_common_factor():
    # Large numbers with a known common factor
    a = 999999999 * 123456789
    b = 999999999 * 987654321
    codeflash_output = gcd_recursive(a, b) # 1.46μs -> 1.42μs (3.04% faster)

def test_gcd_large_negative_and_positive():
    # Large negative and positive numbers
    a = -2**60
    b = 2**55
    codeflash_output = gcd_recursive(a, b) # 917ns -> 875ns (4.80% faster)

def test_gcd_large_random_numbers():
    # Random large numbers with known GCD
    a = 100000000000000003 * 37
    b = 100000000000000003 * 111
    codeflash_output = gcd_recursive(a, b) # 1.12μs -> 916ns (22.8% faster)

def test_gcd_large_numbers_with_one_zero():
    # Large number and zero
    a = 10**100
    codeflash_output = gcd_recursive(a, 0) # 334ns -> 417ns (19.9% slower)
    codeflash_output = gcd_recursive(0, a) # 583ns -> 500ns (16.6% faster)

# 4. Additional Edge Cases

def test_gcd_small_and_large_number():
    # Small and very large number
    codeflash_output = gcd_recursive(1, 10**18) # 917ns -> 792ns (15.8% faster)
    codeflash_output = gcd_recursive(10**18, 1) # 458ns -> 375ns (22.1% faster)

def test_gcd_negative_and_positive_coprime():
    # Negative and positive coprime numbers
    codeflash_output = gcd_recursive(-7, 11) # 1.04μs -> 833ns (25.0% faster)

def test_gcd_negative_and_positive_common_factor():
    # Negative and positive with common factor
    codeflash_output = gcd_recursive(-24, 36) # 792ns -> 667ns (18.7% faster)

def test_gcd_zero_and_zero_again():
    # Confirm GCD(0, 0) is 0
    codeflash_output = gcd_recursive(0, 0) # 375ns -> 416ns (9.86% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from src.math.computation import gcd_recursive

# unit tests

# ------------------------
# Basic Test Cases
# ------------------------

def test_gcd_basic_coprime():
    # Coprime numbers (should return 1)
    codeflash_output = gcd_recursive(13, 17) # 917ns -> 792ns (15.8% faster)
    codeflash_output = gcd_recursive(101, 10) # 417ns -> 333ns (25.2% faster)

def test_gcd_basic_common_divisor():
    # Numbers with a common divisor
    codeflash_output = gcd_recursive(12, 8) # 667ns -> 584ns (14.2% faster)
    codeflash_output = gcd_recursive(100, 25) # 250ns -> 292ns (14.4% slower)
    codeflash_output = gcd_recursive(36, 48) # 459ns -> 375ns (22.4% faster)

def test_gcd_basic_equal_numbers():
    # Both numbers are equal (should return the number itself)
    codeflash_output = gcd_recursive(7, 7) # 541ns -> 500ns (8.20% faster)
    codeflash_output = gcd_recursive(12345, 12345) # 541ns -> 458ns (18.1% faster)

def test_gcd_basic_one_zero():
    # One argument is zero (should return the absolute value of the other)
    codeflash_output = gcd_recursive(0, 5) # 666ns -> 542ns (22.9% faster)
    codeflash_output = gcd_recursive(9, 0) # 250ns -> 209ns (19.6% faster)

def test_gcd_basic_one_one():
    # Both arguments are one (should return 1)
    codeflash_output = gcd_recursive(1, 1) # 541ns -> 500ns (8.20% faster)

def test_gcd_basic_negative_numbers():
    # Negative inputs (GCD should always be non-negative)
    codeflash_output = gcd_recursive(-12, 8) # 750ns -> 667ns (12.4% faster)
    codeflash_output = gcd_recursive(12, -8) # 666ns -> 584ns (14.0% faster)
    codeflash_output = gcd_recursive(-12, -8) # 500ns -> 417ns (19.9% faster)

def test_gcd_basic_one_negative_one_zero():
    # One negative, one zero
    codeflash_output = gcd_recursive(-5, 0) # 333ns -> 375ns (11.2% slower)
    codeflash_output = gcd_recursive(0, -5) # 500ns -> 458ns (9.17% faster)

# ------------------------
# Edge Test Cases
# ------------------------

def test_gcd_edge_zero_zero():
    # Both arguments are zero (mathematically undefined, but function should return 0)
    codeflash_output = gcd_recursive(0, 0) # 375ns -> 375ns (0.000% faster)

def test_gcd_edge_large_prime_and_one():
    # Large prime number and 1 (should return 1)
    codeflash_output = gcd_recursive(982451653, 1) # 584ns -> 583ns (0.172% faster)

def test_gcd_edge_negative_and_positive_coprime():
    # Negative and positive coprime numbers
    codeflash_output = gcd_recursive(-17, 13) # 875ns -> 833ns (5.04% faster)
    codeflash_output = gcd_recursive(17, -13) # 917ns -> 708ns (29.5% faster)

def test_gcd_edge_negative_and_positive_common_divisor():
    # Negative and positive numbers with common divisor
    codeflash_output = gcd_recursive(-24, 18) # 792ns -> 750ns (5.60% faster)
    codeflash_output = gcd_recursive(24, -18) # 875ns -> 708ns (23.6% faster)

def test_gcd_edge_large_negative_numbers():
    # Large negative numbers
    codeflash_output = gcd_recursive(-123456789, -987654321) # 1.00μs -> 916ns (9.17% faster)

def test_gcd_edge_one_is_multiple_of_other():
    # One number is a multiple of the other
    codeflash_output = gcd_recursive(100, 10) # 584ns -> 541ns (7.95% faster)
    codeflash_output = gcd_recursive(10, 100) # 417ns -> 417ns (0.000% faster)

def test_gcd_edge_one_is_zero_other_negative():
    # One is zero, other is negative
    codeflash_output = gcd_recursive(0, -42) # 708ns -> 666ns (6.31% faster)
    codeflash_output = gcd_recursive(-42, 0) # 250ns -> 250ns (0.000% faster)

def test_gcd_edge_max_int_values():
    # Test with maximum 32-bit integer values
    max_int = 2**31 - 1
    codeflash_output = gcd_recursive(max_int, max_int) # 1.04μs -> 1.00μs (4.20% faster)
    codeflash_output = gcd_recursive(max_int, 0) # 291ns -> 250ns (16.4% faster)
    codeflash_output = gcd_recursive(0, max_int) # 500ns -> 375ns (33.3% faster)

def test_gcd_edge_min_int_values():
    # Test with minimum 32-bit integer values
    min_int = -2**31
    codeflash_output = gcd_recursive(min_int, min_int) # 917ns -> 833ns (10.1% faster)
    codeflash_output = gcd_recursive(min_int, 0) # 250ns -> 250ns (0.000% faster)
    codeflash_output = gcd_recursive(0, min_int) # 416ns -> 291ns (43.0% faster)

# ------------------------
# Large Scale Test Cases
# ------------------------

def test_gcd_large_numbers():
    # Very large numbers (should be efficient and correct)
    a = 12345678901234567890
    b = 98765432109876543210
    # GCD is 900000000090
    codeflash_output = gcd_recursive(a, b) # 1.42μs -> 1.42μs (0.000% faster)

def test_gcd_large_numbers_coprime():
    # Very large coprime numbers (should return 1)
    a = 9999999967  # prime
    b = 9999999961  # prime
    codeflash_output = gcd_recursive(a, b) # 1.21μs -> 1.21μs (0.083% slower)

def test_gcd_large_numbers_multiple():
    # Large numbers where one is a multiple of the other
    a = 10**18
    b = 10**9
    codeflash_output = gcd_recursive(a, b) # 666ns -> 583ns (14.2% faster)

def test_gcd_many_recursive_steps():
    # Numbers that require many recursive steps (but <1000)
    # Consecutive Fibonacci numbers are coprime, so their GCD is 1
    fib_30 = 832040
    fib_29 = 514229
    codeflash_output = gcd_recursive(fib_30, fib_29) # 3.58μs -> 3.08μs (16.2% faster)

def test_gcd_large_negative_and_positive():
    # Large negative and positive numbers
    a = -10**18
    b = 10**9
    codeflash_output = gcd_recursive(a, b) # 625ns -> 625ns (0.000% faster)

def test_gcd_large_prime_and_large_composite():
    # Large prime and large composite with known GCD
    a = 982451653  # prime
    b = 57885161 * 982451653  # composite, divisible by a
    codeflash_output = gcd_recursive(a, b) # 791ns -> 792ns (0.126% slower)

# ------------------------
# Additional Robustness Cases
# ------------------------

@pytest.mark.parametrize("a,b,expected", [
    (0, 0, 0),           # both zero
    (0, 1, 1),           # one zero
    (1, 0, 1),           # one zero
    (1, 1, 1),           # both one
    (2, 4, 2),           # even numbers
    (3, 9, 3),           # multiples
    (7, 13, 1),          # coprime
    (-7, 13, 1),         # negative coprime
    (7, -13, 1),         # negative coprime
    (-7, -13, 1),        # both negative coprime
    (24, -18, 6),        # negative common divisor
    (-24, 18, 6),        # negative common divisor
    (123456, 789012, 12),# large numbers
    (999983, 999979, 1), # large coprime primes
])
def test_gcd_parametrized(a, b, expected):
    # Parametrized test for diverse scenarios
    codeflash_output = gcd_recursive(a, b) # 12.9μs -> 11.9μs (8.44% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from src.math.computation import gcd_recursive

def test_gcd_recursive():
    gcd_recursive(0, 0)

To edit these changes, run `git checkout codeflash/optimize-gcd_recursive-mh1oskej` and push.

@codeflash-ai codeflash-ai bot requested a review from KRRT7 October 22, 2025 07:43
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 22, 2025