@codeflash-ai codeflash-ai bot commented Jul 4, 2025

⚡️ This pull request contains optimizations for PR #354

If you approve this dependent PR, these changes will be merged into the original PR branch chore/get-pr-number-from-gh-action-event-file.

This PR will be automatically closed if the original PR is merged.


📄 24% (0.24x) speedup for is_pr_draft in codeflash/code_utils/env_utils.py

⏱️ Runtime: 43.3 microseconds → 35.0 microseconds (best of 117 runs)

📝 Explanation and details

Here is an optimized version of your program. The main hotspots in your code are:

  • Disk I/O from reading and parsing the event file (unavoidable, but it can be slightly optimized).
  • Using Path(event_path).open() is slower than using open(event_path, ...).
  • @lru_cache adds function-call and hashing overhead on every call because it wraps your function. Since your maxsize is 1 and the data is constant within a GitHub Actions run, a simple module-level cache variable with a sentinel value avoids that overhead.
  • The chained .get calls on nested dictionaries can be condensed slightly for speed.
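To check the Path(event_path).open() versus open() claim on a given machine, a small timing harness along these lines could be used. This is a sketch only: the names read_with_path, read_with_open and compare_readers are hypothetical and do not come from the PR, and the absolute numbers will vary with the filesystem and Python version.

```python
from __future__ import annotations

import json
import os
import tempfile
import timeit
from pathlib import Path


def read_with_path(event_path: str) -> dict:
    # Builds a Path object first, then opens the file through it.
    with Path(event_path).open(encoding="utf-8") as f:
        return json.load(f)


def read_with_open(event_path: str) -> dict:
    # Calls the built-in open() directly on the string path.
    with open(event_path, encoding="utf-8") as f:
        return json.load(f)


def compare_readers(event_path: str, number: int = 2000) -> dict[str, float]:
    """Return the total seconds spent on `number` reads with each approach."""
    return {
        "Path.open": timeit.timeit(lambda: read_with_path(event_path), number=number),
        "open": timeit.timeit(lambda: read_with_open(event_path), number=number),
    }


if __name__ == "__main__":
    # Hypothetical payload standing in for a real GitHub event file.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
        json.dump({"pull_request": {"draft": True}}, tmp)
    try:
        print(compare_readers(tmp.name))
    finally:
        os.unlink(tmp.name)
```

Both readers return the same parsed payload, so any difference in the reported totals comes from how the file is opened rather than from the JSON parsing.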

Below is a rewritten version maintaining all external behavior (same function names and signatures, same return values).

Summary of optimizations:

  • Replaced @lru_cache with a lightweight module-level cache for get_cached_gh_event_data. Since the event file does not change during a single GitHub Actions run, this is safe and removes the wrapper's function-call and lookup overhead.
  • Used plain open() instead of the slower Path(event_path).open().
  • Reduced the nested .get(..., {}) lookups to a single step.
  • Kept exception handling to prevent failure if the file is missing/corrupt.
  • No external behavior was changed: all function names/signatures/return values are identical.
  • Preserved all important comments as requested.
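As an illustration of the points in this summary, the following is a minimal sketch of a sentinel-based module-level cache. It is not the code from this PR: only the names get_cached_gh_event_data, is_pr_draft and GITHUB_EVENT_PATH appear in the text above, while _UNSET, _event_data, reset_event_cache and the handling of a null pull_request value are assumptions made for the example.

```python
from __future__ import annotations

import json
import os
from typing import Any

# Sentinel meaning "not loaded yet"; it is distinct from a loaded-but-empty {}.
_UNSET: Any = object()
_event_data: Any = _UNSET  # hypothetical module-level cache slot


def get_cached_gh_event_data() -> dict[str, Any]:
    """Load the GitHub event payload once per process and reuse it afterwards."""
    global _event_data
    if _event_data is not _UNSET:
        return _event_data
    data: dict[str, Any] = {}
    event_path = os.getenv("GITHUB_EVENT_PATH")
    if event_path:
        try:
            with open(event_path, encoding="utf-8") as f:
                loaded = json.load(f)
            if isinstance(loaded, dict):
                data = loaded
        except (OSError, ValueError):
            # Missing or corrupt file: fall back to an empty payload.
            pass
    _event_data = data
    return data


def reset_event_cache() -> None:
    """Forget the cached payload (hypothetical helper, handy in tests)."""
    global _event_data
    _event_data = _UNSET


def is_pr_draft() -> bool:
    """Return True when the cached event describes a draft pull request."""
    # Assumption: a null "pull_request" value is treated like a missing one.
    pull_request = get_cached_gh_event_data().get("pull_request") or {}
    return bool(pull_request.get("draft", False))
```

With a cache of this shape, the file is read at most once per process, so tests that change GITHUB_EVENT_PATH between calls would need to call something like reset_event_cache() to observe the new file.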

If you want even more performance and you know that in your context the event file always exists and is well-formed, you can strip out the try/except block. But the above version stays robust and is still faster (about 24% in the benchmark reported above).

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 3 Passed |
| 🌀 Generated Regression Tests | 34 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 85.7% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |

🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import json
import os
import tempfile

# imports
import pytest  # used for our unit tests
from codeflash.code_utils.env_utils import is_pr_draft


def write_temp_event_file(event_data: dict) -> str:
    """Helper to write event data to a temp file and return its path."""
    # delete=False keeps the file on disk after the handle is closed; callers unlink it.
    with tempfile.NamedTemporaryFile(mode="w", delete=False) as tmp:
        json.dump(event_data, tmp)
    return tmp.name

##############################
# 1. Basic Test Cases
##############################

def test_pr_draft_true(monkeypatch):
    """Test: PR is a draft (draft: true)"""
    event = {"pull_request": {"draft": True}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.27μs -> 1.09μs (16.5% faster)
    os.unlink(path)

def test_pr_draft_false(monkeypatch):
    """Test: PR is not a draft (draft: false)"""
    event = {"pull_request": {"draft": False}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.27μs -> 1.06μs (19.8% faster)
    os.unlink(path)

def test_pr_draft_missing(monkeypatch):
    """Test: PR exists but 'draft' field is missing (should default to False)"""
    event = {"pull_request": {"foo": "bar"}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.28μs -> 1.05μs (21.9% faster)
    os.unlink(path)

def test_no_pull_request(monkeypatch):
    """Test: Event has no 'pull_request' field (should default to False)"""
    event = {"issue": {"number": 1}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.24μs -> 1.05μs (17.9% faster)
    os.unlink(path)

def test_empty_event(monkeypatch):
    """Test: Event file is empty dict (should default to False)"""
    event = {}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.25μs -> 1.04μs (20.2% faster)
    os.unlink(path)

##############################
# 2. Edge Test Cases
##############################

def test_github_event_path_not_set(monkeypatch):
    """Test: GITHUB_EVENT_PATH is not set (should default to False)"""
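    # Make sure the variable really is unset, even when the suite runs inside GitHub Actions
    monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)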
    codeflash_output = is_pr_draft() # 1.04μs -> 842ns (23.8% faster)



def test_draft_field_is_none(monkeypatch):
    """Test: 'draft' field is None (should treat as False)"""
    event = {"pull_request": {"draft": None}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.35μs -> 1.11μs (21.5% faster)
    os.unlink(path)

def test_draft_field_is_string_true(monkeypatch):
    """Test: 'draft' field is string 'true' (should treat as True in bool context)"""
    event = {"pull_request": {"draft": "true"}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    # bool('true') is True, so this should return True
    codeflash_output = is_pr_draft() # 1.24μs -> 1.06μs (16.9% faster)
    os.unlink(path)

def test_draft_field_is_empty_string(monkeypatch):
    """Test: 'draft' field is empty string (should treat as False in bool context)"""
    event = {"pull_request": {"draft": ""}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    # bool('') is False
    codeflash_output = is_pr_draft() # 1.27μs -> 1.04μs (22.2% faster)
    os.unlink(path)

def test_draft_field_is_int_one(monkeypatch):
    """Test: 'draft' field is integer 1 (should treat as True in bool context)"""
    event = {"pull_request": {"draft": 1}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.20μs -> 1.05μs (14.4% faster)
    os.unlink(path)

def test_draft_field_is_int_zero(monkeypatch):
    """Test: 'draft' field is integer 0 (should treat as False in bool context)"""
    event = {"pull_request": {"draft": 0}}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.28μs -> 1.01μs (26.7% faster)
    os.unlink(path)



def test_large_event_file_with_many_keys(monkeypatch):
    """Test: Large event file with many unrelated keys, but valid pull_request.draft"""
    # Create a large dict with 999 unrelated keys
    event = {f"key{i}": i for i in range(999)}
    event["pull_request"] = {"draft": True}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.42μs -> 1.13μs (25.5% faster)
    os.unlink(path)

def test_large_pull_request_dict(monkeypatch):
    """Test: Large pull_request dict with many unrelated keys, but 'draft': False"""
    pull_request = {f"field{i}": i for i in range(999)}
    pull_request["draft"] = False
    event = {"pull_request": pull_request}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.34μs -> 1.09μs (23.1% faster)
    os.unlink(path)

def test_many_pull_request_like_keys(monkeypatch):
    """Test: Event file with many keys named like pull_request1, pull_request2, etc., only 'pull_request' matters"""
    event = {f"pull_request{i}": {"draft": True} for i in range(999)}
    event["pull_request"] = {"draft": False}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    # Only 'pull_request' key should be checked
    codeflash_output = is_pr_draft() # 1.37μs -> 1.06μs (29.3% faster)
    os.unlink(path)

def test_large_event_file_no_pull_request(monkeypatch):
    """Test: Large event file with no 'pull_request' key (should default to False)"""
    event = {f"key{i}": i for i in range(999)}
    path = write_temp_event_file(event)
    monkeypatch.setenv("GITHUB_EVENT_PATH", path)
    codeflash_output = is_pr_draft() # 1.33μs -> 1.06μs (25.5% faster)
    os.unlink(path)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from __future__ import annotations

import json
import os
import tempfile

# imports
import pytest  # used for our unit tests
from codeflash.code_utils.env_utils import is_pr_draft


# Helper to write a temp event file and set env var
def write_event_file(data, monkeypatch):
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        monkeypatch.setenv("GITHUB_EVENT_PATH", path)
        return path
    except Exception:
        os.unlink(path)
        raise

# --- Basic Test Cases ---

def test_pr_draft_true(monkeypatch):
    # PR is draft: True
    event = {"pull_request": {"draft": True}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.23μs -> 1.00μs (23.0% faster)
    os.unlink(path)

def test_pr_draft_false(monkeypatch):
    # PR is draft: False
    event = {"pull_request": {"draft": False}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.21μs -> 1.00μs (21.0% faster)
    os.unlink(path)

def test_pr_draft_missing(monkeypatch):
    # PR exists, but draft key is missing (should default to False)
    event = {"pull_request": {"title": "Some PR"}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.23μs -> 972ns (26.9% faster)
    os.unlink(path)

def test_not_a_pr_event(monkeypatch):
    # No pull_request key in event (should default to False)
    event = {"something_else": 123}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.23μs -> 1.00μs (23.0% faster)
    os.unlink(path)

# --- Edge Test Cases ---

def test_no_github_event_path_env(monkeypatch):
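    # GITHUB_EVENT_PATH is not set (should default to False);
    # unset it explicitly in case the suite runs inside GitHub Actions
    monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)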
    codeflash_output = is_pr_draft() # 1.02μs -> 862ns (18.6% faster)


def test_pull_request_is_none(monkeypatch):
    # pull_request key exists but is None (should treat as empty dict)
    event = {"pull_request": None}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.39μs -> 1.11μs (25.3% faster)
    os.unlink(path)

def test_draft_value_is_none(monkeypatch):
    # draft key is None (should treat as False)
    event = {"pull_request": {"draft": None}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.31μs -> 992ns (32.4% faster)
    os.unlink(path)

def test_draft_value_is_string_true(monkeypatch):
    # draft key is string "true" (should treat as True because bool("true") == True)
    event = {"pull_request": {"draft": "true"}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.23μs -> 1.00μs (23.0% faster)
    os.unlink(path)

def test_draft_value_is_string_false(monkeypatch):
    # draft key is string "false" (should treat as True because bool("false") == True)
    event = {"pull_request": {"draft": "false"}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.27μs -> 982ns (29.5% faster)
    os.unlink(path)

def test_draft_value_is_empty_string(monkeypatch):
    # draft key is empty string (should treat as False)
    event = {"pull_request": {"draft": ""}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.26μs -> 1.01μs (24.8% faster)
    os.unlink(path)

def test_draft_value_is_zero(monkeypatch):
    # draft key is 0 (should treat as False)
    event = {"pull_request": {"draft": 0}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.28μs -> 982ns (30.7% faster)
    os.unlink(path)

def test_draft_value_is_nonzero(monkeypatch):
    # draft key is 1 (should treat as True)
    event = {"pull_request": {"draft": 1}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.23μs -> 962ns (28.1% faster)
    os.unlink(path)

def test_pull_request_is_empty_dict(monkeypatch):
    # pull_request is an empty dict (should default to False)
    event = {"pull_request": {}}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.26μs -> 991ns (27.3% faster)
    os.unlink(path)

def test_event_file_is_empty_json(monkeypatch):
    # Event file is an empty JSON object (should default to False)
    event = {}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.26μs -> 1.01μs (24.8% faster)
    os.unlink(path)




def test_large_event_file_with_many_keys(monkeypatch):
    # Event file with many unrelated keys, but correct pull_request.draft present
    event = {f"key_{i}": i for i in range(500)}
    event["pull_request"] = {"draft": True}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.43μs -> 1.13μs (26.6% faster)
    os.unlink(path)

def test_large_pull_request_object(monkeypatch):
    # pull_request object with many keys, draft is False
    pr_obj = {f"field_{i}": i for i in range(500)}
    pr_obj["draft"] = False
    event = {"pull_request": pr_obj}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.29μs -> 1.04μs (24.1% faster)
    os.unlink(path)

def test_many_events_in_file_but_only_one_used(monkeypatch):
    # Event file simulating a list of events, but only root-level pull_request should be used
    # This is a malformed event, but should test that only top-level is used
    event = {
        "pull_request": {"draft": True},
        "events": [{"pull_request": {"draft": False}} for _ in range(100)]
    }
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.31μs -> 1.06μs (23.5% faster)
    os.unlink(path)

def test_large_file_with_irrelevant_data(monkeypatch):
    # Large event file with lots of irrelevant data, and no pull_request key
    event = {f"noise_{i}": "x" * 100 for i in range(900)}
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.33μs -> 1.06μs (25.5% faster)
    os.unlink(path)

def test_large_file_with_pull_request_none(monkeypatch):
    # Large event file with pull_request=None
    event = {f"noise_{i}": "y" * 50 for i in range(900)}
    event["pull_request"] = None
    path = write_event_file(event, monkeypatch)
    codeflash_output = is_pr_draft() # 1.28μs -> 1.06μs (20.7% faster)
    os.unlink(path)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
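
The comparison that the codeflash_output comment refers to can be pictured roughly as follows. This is a hedged sketch, not Codeflash's actual mechanism: draft_original, draft_candidate, outcome and find_mismatches are hypothetical names, and the two draft functions are toy stand-ins rather than the code from this PR. The idea is to run both implementations on the same inputs and compare what each returns or raises.

```python
from __future__ import annotations

from typing import Any, Callable, Iterable


def draft_original(event: dict[str, Any]) -> bool:
    # Toy "before" version: chained lookups with an empty-dict default.
    return bool(event.get("pull_request", {}).get("draft", False))


def draft_candidate(event: dict[str, Any]) -> bool:
    # Toy "after" version: single lookup followed by an explicit None check.
    pull_request = event.get("pull_request")
    if pull_request is None:
        return False
    return bool(pull_request.get("draft", False))


def outcome(func: Callable[[Any], Any], arg: Any) -> tuple[str, Any]:
    """Capture either the return value or the exception type of one call."""
    try:
        return ("ok", func(arg))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def find_mismatches(
    original: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> list[Any]:
    """Return the inputs on which the two implementations behave differently."""
    return [x for x in inputs if outcome(original, x) != outcome(candidate, x)]
```

For these toy functions, an event such as {"pull_request": None} is reported as a mismatch, because the chained version raises AttributeError while the single-lookup version returns False. Comparing exceptions as well as return values is what surfaces that kind of difference.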

To edit these changes, check out the branch codeflash/optimize-pr354-2025-07-04T13.21.23 with git checkout and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 4, 2025
@codeflash-ai codeflash-ai bot closed this Jul 4, 2025

codeflash-ai bot commented Jul 4, 2025

This PR has been automatically closed because the original PR #354 by mohammedahmed18 was closed.

@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr354-2025-07-04T13.21.23 branch July 4, 2025 16:40