
Conversation

@KRRT7 (Contributor) commented May 19, 2025

PR Type

Enhancement


Description

• Add pre-commit config and CI linting setup
• Standardize formatting, imports, and line wrapping
• Introduce type hints and refine function signatures
• Simplify conditionals and remove redundancies


Changes walkthrough 📝

Relevant files

Formatting (3 files)
• tabulate.py (+114/-246): Clean up formatting and import ordering
• models.py (+36/-20): Reformat dataclass fields and add trailing commas
• git_utils.py (+2/-2): Add noqa for lint exceptions

Enhancement (12 files)
• pickle_patcher.py (+74/-64): Add type hints and simplify exception handling
• line_profile_utils.py (+59/-61): Introduce TYPE_CHECKING and type annotations
• code_context_extractor.py (+38/-32): Remove unused imports, add TYPE_CHECKING
• code_extractor.py (+35/-31): Add return type hints and simplify code
• cmd_init.py (+17/-16): Add noqa directives and fix typing casts
• plugin.py (+29/-28): Add type hints to pytest hook signatures
• functions_to_optimize.py (+20/-17): Simplify assertion and iterate diff mappings
• parse_line_profile_test_output.py (+55/-52): Add annotations and unify string quoting
• replay_test.py (+36/-29): Clean up parameters and dict append formatting
• instrument_codeflash_trace.py (+32/-40): Add type hints and simplify decorator logic
• function_optimizer.py (+9/-10): Add noqa and refactor helper loops
• aiservice.py (+7/-9): Reorder imports and add noqa for D417

Configuration changes (2 files)
• .pre-commit-config.yaml (+7/-0): Create pre-commit configuration file
• pre-commit.yaml (+19/-0): Add GitHub pre-commit workflow

Additional files (43 files)
cfapi.py +5/-1     
codeflash_trace.py +46/-13 
trace_benchmarks.py +4/-4     
utils.py +21/-27 
cli.py +9/-3     
cli_common.py +2/-2     
console.py +5/-24   
logging_config.py +2/-2     
checkpoint.py +13/-9   
code_replacer.py +4/-4     
config_parser.py +15/-15 
env_utils.py +2/-0     
github_utils.py +6/-2     
shell_utils.py +12/-8   
time_utils.py +5/-5     
unused_definition_remover.py +15/-9   
discover_unit_tests.py +4/-3     
pytest_new_process_discovery.py +4/-3     
PrComment.py +6/-6     
main.py +3/-1     
ExperimentMetadata.py +2/-0     
function_context.py +13/-9   
optimizer.py +12/-5   
pickle_placeholder.py +18/-17 
create_pr.py +2/-2     
explanation.py +10/-10 
posthog_cf.py +4/-4     
sentry.py +1/-1     
tracer.py +8/-8     
profile_stats.py +2/-2     
replay_test.py +12/-5   
tracing_utils.py +2/-0     
update_license_version.py +3/-3     
codeflash_capture.py +8/-5     
comparator.py +1/-1     
concolic_testing.py +8/-5     
equivalence.py +6/-3     
instrument_codeflash_capture.py +17/-4   
parse_test_output.py +3/-3     
pytest_plugin.py +4/-6     
test_runner.py +4/-4     
verifier.py +1/-2     
pyproject.toml +12/-2   

Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
    github-actions bot added the workflow-modified (This PR modifies GitHub Actions workflows) and Review effort 2/5 labels on May 19, 2025

    github-actions bot commented May 19, 2025

    PR Reviewer Guide 🔍

    (Review updated until commit f513763)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
    🧪 No relevant tests
    🔒 Security concerns

    Deserialization vulnerability:
    The PR replaces the safe pickle API with direct calls to dill.loads in PicklePatcher.loads, allowing arbitrary code execution if untrusted data is unpickled. Consider validating or sandboxing the input before unpickling.
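
    One possible mitigation, sketched below rather than taken from the PR, is to authenticate serialized payloads with an HMAC so that only data produced by a trusted writer ever reaches dill.loads; the key source and helper names are illustrative assumptions.

    # Sketch only: HMAC-authenticate payloads before unpickling with dill.
    import hashlib
    import hmac
    import os

    import dill

    _KEY = os.environ.get("CODEFLASH_PICKLE_KEY", "dev-only-key").encode()

    def signed_dumps(obj) -> bytes:
        payload = dill.dumps(obj)
        return hmac.new(_KEY, payload, hashlib.sha256).digest() + payload

    def verified_loads(blob: bytes):
        digest, payload = blob[:32], blob[32:]
        expected = hmac.new(_KEY, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(digest, expected):
            raise ValueError("Rejecting pickle payload with an invalid signature")
        return dill.loads(payload)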

    ⚡ Recommended focus areas for review

    Type Inference

    The new numeric detection and classification logic (e.g. _float_with_thousands_separators, _isnumber_with_thousands_separator, _type) may misclassify or incorrectly parse strings with thousands separators, decimals, or scientific notation. Edge cases should be covered by tests to ensure correct behavior.

    _float_with_thousands_separators = re.compile(r"^(([+-]?[0-9]{1,3})(?:,([0-9]{3}))*)?(?(1)\.[0-9]*|\.[0-9]+)?$")
    
    
    def _isnumber_with_thousands_separator(string):
        try:
            string = string.decode()
        except (UnicodeDecodeError, AttributeError):
            pass
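
    A few edge-case checks along these lines would pin down the intended behavior (a sketch; it assumes the helpers above are importable from codeflash.code_utils.tabulate and that the function reports whether the pattern matches):

    from codeflash.code_utils.tabulate import _isnumber_with_thousands_separator

    assert _isnumber_with_thousands_separator("1,234")        # grouped integer
    assert _isnumber_with_thousands_separator("-1,234.56")    # signed, grouped decimal
    assert _isnumber_with_thousands_separator(".56")          # bare decimal fraction
    assert not _isnumber_with_thousands_separator("1,23")     # wrong group width
    assert not _isnumber_with_thousands_separator("12,3456")  # wrong group width
    assert not _isnumber_with_thousands_separator("1e6")      # scientific notation is not covered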
    Performance Concern

    The recursive pickling in _recursive_pickle frequently calls dill.dumps and placeholder creation for deep or large objects, which could incur significant overhead. Benchmark or optimize this path for large nested structures.

    @staticmethod
    def _recursive_pickle(  # noqa: PLR0911
        obj: object,
        max_depth: int,
        path: list[str] | None = None,
        protocol: int | None = None,
        **kwargs,  # noqa: ANN003
    ) -> bytes:
        """Recursively try to pickle an object, replacing unpicklable parts with placeholders.

    github-actions bot commented May 19, 2025

    PR Code Suggestions ✨

    Latest suggestions up to f513763
    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Possible issue
    Return wrapped function result

    Ensure the decorator preserves and returns the wrapped function's result. Update
    wrapper to call and return wrapped(*args, **kwargs) instead of always returning
    None.

    codeflash/verification/codeflash_capture.py [90]

    -def wrapper(*args, **kwargs) -> None:  # noqa: ANN002, ANN003
    +def wrapper(*args, **kwargs):
    +    result = wrapped(*args, **kwargs)
    +    # existing capture logic...
    +    return result
    Suggestion importance[1-10]: 10

    Why: Always returning None in the decorator drops the wrapped function's return value, breaking its contract and likely causing runtime errors.

    Impact: High
    Benchmark inside wrapper only

    Integrate the benchmark call inside wrapped_func so the function executes only once
    and its return value is captured correctly. Remove the unconditional pre-call to
    avoid double execution under the decorator.

    codeflash/benchmarking/plugin/plugin.py [242-253]

     def __call__(self, func, *args, **kwargs):  # type: ignore  # noqa: ANN001, ANN002, ANN003, ANN204, PGH003
         """Handle both direct function calls and decorator usage."""
         if args or kwargs:
             # Used as benchmark(func, *args, **kwargs)
             return self._run_benchmark(func, *args, **kwargs)
         # Used as @benchmark decorator
    -    def wrapped_func(*args, **kwargs):  # noqa: ANN002, ANN003, ANN202
    -        return func(*args, **kwargs)
    -    self._run_benchmark(func)
    +    def wrapped_func(*args, **kwargs):
    +        return self._run_benchmark(func, *args, **kwargs)
         return wrapped_func
    Suggestion importance[1-10]: 9

    Why: The suggestion fixes a critical double execution bug by moving the benchmark call into the wrapper to ensure correct single invocation.

    Impact: High
    Define placeholder on value pickle failure

    Ensure that value_result is always defined even when pickling fails. Add an else
    branch to handle the case where value_success is False and assign a placeholder to
    avoid an UnboundLocalError.

    codeflash/picklepatch/pickle_patcher.py [221-228]

     value_path = [*path, f"[{repr(key)[:20]}]"]
     value_success, value_bytes = PicklePatcher._pickle(value, value_path, protocol, **kwargs)
     
     if value_success:
         try:
             value_result = dill.loads(value_bytes)
         except Exception as inner_e:
             value_result = PicklePatcher._create_placeholder(value, str(inner_e), value_path)
    +else:
    +    value_result = PicklePatcher._create_placeholder(value, error_msg, value_path)
     result[key_result] = value_result
    Suggestion importance[1-10]: 8

    Why: Without an else branch when value_success is False, value_result can be undefined and raise an UnboundLocalError; adding a placeholder fallback fixes this critical bug.

    Impact: Medium
    Append placeholder on item pickle failure

    Guarantee that every sequence element is added to result, even when both pickle and
    recursive pickle fail. Add an else to append a placeholder when success remains
    False.

    codeflash/picklepatch/pickle_patcher.py [264-279]

     for i, item in enumerate(obj_seq):
         item_path = [*path, f"[{i}]"]
         success, item_bytes = PicklePatcher._pickle(item, item_path, protocol, **kwargs)
         if not success:
             success, item_bytes = PicklePatcher._recursive_pickle(item, max_depth - 1, item_path, protocol, **kwargs)
         if success:
             try:
                 result.append(dill.loads(item_bytes))
             except Exception as inner_e:
                 placeholder = PicklePatcher._create_placeholder(item, str(inner_e), item_path)
                 result.append(placeholder)
    +    else:
    +        placeholder = PicklePatcher._create_placeholder(item, error_msg, item_path)
    +        result.append(placeholder)
    Suggestion importance[1-10]: 7

    Why: In _handle_sequence, failing both pickle attempts drops the item silently; appending a placeholder in the else ensures every element is represented and avoids data loss.

    Impact: Medium
    Guard against uninitialized PostHog

    Prevent a NoneType error when PostHog is not initialized. Check that _posthog is set
    before calling capture, or log/debug when it's uninitialized.

    codeflash/telemetry/posthog_cf.py [43-44]

    -if user_id:
    +if user_id and _posthog:
         _posthog.capture(distinct_id=user_id, event=event, properties=properties)  # type: ignore  # noqa: PGH003
    +else:
    +    logger.debug("PostHog is not initialized; event not sent.")
    Suggestion importance[1-10]: 7

    Why: Adding a check for _posthog prevents a potential NoneType exception when capture is called before initialization.

    Impact: Medium
    Security
    Load API key from environment

    Avoid hardcoding sensitive API keys. Load the PostHog project API key from an
    environment variable instead of embedding it in the source.

    codeflash/telemetry/posthog_cf.py [24]

    -_posthog = Posthog(project_api_key="phc_aUO790jHd7z1SXwsYCz8dRApxueplZlZWeDSpKc5hol", host="https://us.posthog.com")  # type: ignore  # noqa: PGH003
    +_posthog = Posthog(
    +    project_api_key=os.getenv("POSTHOG_API_KEY"),
    +    host="https://us.posthog.com"
    +)  # type: ignore  # noqa: PGH003
    Suggestion importance[1-10]: 9

    Why: Hardcoding a PostHog API key is a serious security risk; loading it from an environment variable prevents credential leakage.

    Impact: High
    General
    Use correct cast type

    Use the actual StringIO type in the cast call instead of a string literal to satisfy
    type checkers and avoid confusion.

    codeflash/result/explanation.py [86]

    -benchmark_info = cast("StringIO", console.file).getvalue() + "\n"  # Cast for mypy
    +benchmark_info = cast(StringIO, console.file).getvalue() + "\n"  # Cast for mypy
    Suggestion importance[1-10]: 5

    Why: Using the real StringIO type in cast improves clarity and satisfies type checkers but has minor impact on functionality.

    Impact: Low

    Previous suggestions

    Suggestions up to commit 0a47165
    Category | Suggestion | Impact
    General
    Handle cleanup errors gracefully

    Wrap the cleanup call in a try/except to prevent any cleanup failure from
    terminating the optimizer. Log a warning if cleanup_paths raises to aid debugging
    without interrupting normal flow.

    codeflash/optimization/optimizer.py [269-270]

    -if self.test_cfg.concolic_test_root_dir:
    -    cleanup_paths([self.test_cfg.concolic_test_root_dir])
    +try:
    +    if self.test_cfg.concolic_test_root_dir:
    +        cleanup_paths([self.test_cfg.concolic_test_root_dir])
    +except Exception as cleanup_error:
    +    logger.warning(f"Failed to clean concolic directory: {cleanup_error}")
    Suggestion importance[1-10]: 6

    Why: The cleanup call could raise an exception and abort the optimizer; wrapping it in try/except ensures failures are logged and do not interrupt the main flow.

    Impact: Low
    Provide explicit exit code

    Use an explicit non-zero exit code by raising SystemExit(1) so downstream tooling
    can detect that the process ended due to an interrupt. This replaces the bare
    SystemExit for clearer semantics.

    codeflash/optimization/optimizer.py [277-279]

     except KeyboardInterrupt:
         logger.warning("Keyboard interrupt received. Exiting, please wait…")
    -    raise SystemExit from None
    +    raise SystemExit(1)
    Suggestion importance[1-10]: 5

    Why: Using SystemExit(1) clearly indicates an error exit to downstream tools, improving clarity without altering functionality.

    Impact: Low

    codeflash-ai bot added a commit that referenced this pull request May 19, 2025
    … by 111% in PR #217 (`proper-cleanup`)
    
    Here's a rewritten, optimized version of your program, focusing on what the line profile indicates are bottlenecks.
    
    - **Reuse cursor**: Opening a new cursor repeatedly is slow. Maintain a persistent cursor.
    - **Batching commits**: Commit after many inserts if possible. However, since you clear the buffer after each write, one commit per call is necessary.
    - **Pragma optimizations**: Set SQLite pragmas (`synchronous = OFF`, `journal_mode = MEMORY`) for faster inserts if durability isn't paramount.
    - **Avoid excessive object recreation**: Only connect if needed, and clear but *do not reallocate* the benchmark list.
    - **Reduce exception handling cost**: Trap and re-raise only actual DB exceptions.
    
    **Note:** For highest speed, `executemany` and single-transaction-batch inserts are already optimal for SQLite. For even higher speed, use a multi-row `INSERT INTO ... VALUES (...), (...), ...`, but this requires constructing the SQL dynamically.
    
    Here’s the optimized version.
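
    A minimal sketch of that shape, not the bot's actual diff (the table name, column count, and the db_path attribute are assumptions for illustration):

    from __future__ import annotations

    import sqlite3

    class CodeFlashBenchmarkPlugin:
        def __init__(self, db_path: str) -> None:
            self.db_path = db_path                      # assumed attribute
            self.benchmark_timings: list[tuple] = []    # assumed row buffer
            self._connection: sqlite3.Connection | None = None
            self._cursor: sqlite3.Cursor | None = None

        def _ensure_connection(self) -> None:
            # Connect once, then reuse the same connection and cursor.
            if self._connection is None:
                self._connection = sqlite3.connect(self.db_path)
                self._cursor = self._connection.cursor()
                # Faster writes at the cost of durability; drop if durability matters.
                self._cursor.execute("PRAGMA synchronous = OFF")
                self._cursor.execute("PRAGMA journal_mode = MEMORY")

        def write_benchmark_timings(self) -> None:
            if not self.benchmark_timings:
                return
            self._ensure_connection()
            self._cursor.executemany(
                "INSERT INTO benchmark_timings VALUES (?, ?, ?, ?)",  # placeholder schema
                self.benchmark_timings,
            )
            self._connection.commit()            # one commit per flush
            self.benchmark_timings.clear()       # clear in place, no reallocation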
    
    
    
    **Key points:**
    - `self._ensure_connection()` ensures both persistent connection and cursor.
    - Pragmas are set only once for connection.
    - Use `self.benchmark_timings.clear()` to avoid list reallocation.
    - The cursor is reused for the lifetime of the object.
    
    **If your stability requirements are stricter** (durability required), remove or tune the PRAGMA statements. If you want even higher throughput and can collect many queries per transaction, consider accepting a "bulk flush" mode to reduce commit frequency, but this requires API change.
    
    This code preserves your public API and all comments, while running considerably faster especially on large inserts.
    codeflash-ai bot (Contributor) commented May 19, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 111% (1.11x) speedup for CodeFlashBenchmarkPlugin.write_benchmark_timings in codeflash/benchmarking/plugin/plugin.py

    ⏱️ Runtime: 26.9 milliseconds → 12.8 milliseconds (best of 121 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch proper-cleanup).

    codeflash-ai bot added a commit that referenced this pull request May 19, 2025
    …proper-cleanup`)
    
    Here is a faster rewrite of your function. The main optimizations:
    
    - Replace `{}` set literal with tuple `()` for membership check, as a tuple is faster for small constant sets and avoids dynamic hashing.
    - Use string multiplication only as needed.
    - Use guarded string concatenation to minimize interpretation overhead.
    - Match `align` values in order of likelihood (generally "left" or "right" is more common than "center" or "decimal"; adjust if your usage differs).
    - Consolidate conditions to reduce branching where possible.
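
    A minimal sketch of such a rewrite, not the bot's actual diff (the signature follows tabulate's helper):

    def _pipe_segment_with_colons(align: str, colwidth: int) -> str:
        """Return one horizontal-rule segment, with colons marking the alignment."""
        w = colwidth
        # Direct comparisons, ordered by assumed likelihood, instead of a set literal.
        if align == "left":
            return ":" + "-" * (w - 1)
        if align == "right" or align == "decimal":
            return "-" * (w - 1) + ":"
        if align == "center":
            return ":" + "-" * (w - 2) + ":"
        return "-" * w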
    
    
    
    
    - This version uses direct comparison for the common cases and avoids the overhead of set/tuple lookup.
    - The order of conditions can be adjusted depending on which alignment is most frequent in your workload for optimal branch prediction.
    codeflash-ai bot (Contributor) commented May 19, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 11% (0.11x) speedup for _pipe_segment_with_colons in codeflash/code_utils/tabulate.py

    ⏱️ Runtime: 118 microseconds → 106 microseconds (best of 325 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch proper-cleanup).

    codeflash-ai bot added a commit that referenced this pull request May 19, 2025
    …per-cleanup`)
    
    Here are the main performance issues and solutions for your program.
    
    ### Profile Insights
    
    - The function **`_pipe_segment_with_colons`** is hit many times, and most time is spent creating new strings with expressions like `'-' * n` and concatenation.
    - In **`_pipe_line_with_colons`**, almost all runtime is spent in the list comprehension calling `_pipe_segment_with_colons`.
    - There are repeated lookups/checks for the alignment, which can be made faster by using a dictionary for dispatch.
    - The repeated string multiplication and concatenation in `_pipe_segment_with_colons` can be accelerated for common values (like when width is small or common) via caching.
    
    ### Optimizations
    
    1. **Function dispatch via dictionary** to avoid sequential `if`-`elif`.
    2. **Cache small, frequently repeated templates** in `_pipe_segment_with_colons` using `functools.lru_cache` (for acceleration when the same alignment and width is requested over and over).
    3. **Pre-localize frequently used builtins** (like `str.join`, `str.__mul__`).
    4. **Minor improvement**: Reduce `str` concatenations.
    
    Here's the optimized code.
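
    A minimal sketch of the caching idea, not the bot's actual diff (signatures follow tabulate's helpers):

    from functools import lru_cache

    @lru_cache(maxsize=1024)
    def _pipe_segment_with_colons(align: str, colwidth: int) -> str:
        w = colwidth
        if align == "left":
            return ":" + "-" * (w - 1)
        if align in ("right", "decimal"):
            return "-" * (w - 1) + ":"
        if align == "center":
            return ":" + "-" * (w - 2) + ":"
        return "-" * w

    def _pipe_line_with_colons(colwidths, colaligns):
        # Each (align, width) pair is memoized, so repeated rows reuse cached segments.
        if not colaligns:
            colaligns = [""] * len(colwidths)
        segments = [_pipe_segment_with_colons(a, w) for a, w in zip(colaligns, colwidths)]
        return "|" + "|".join(segments) + "|"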
    
    
    
    ---
    
    ### Why this version is faster
    
    1. **lru_cache** on `_pipe_segment_with_colons` to memoize results (Python will keep the last few most requested line segments in RAM). This is effective since your profile shows thousands of hits with the same arguments.
    2. **Reduced branching** inside inner loop via `elif` for clarity.
    3. **Localizing built-in function** lookups improves performance (as calling a local variable is faster than attribute/property lookup on objects).
    
    These changes together should provide **measurably improved runtime**—especially for repeated, table-wide invocations! If you expect very large tables or uncommon `(align, colwidth)` combinations, you can tune the cache size in `@lru_cache(maxsize=N)`. For typical markdown/pipe-aligned tables, this value is more than enough.
    
    ---
    **You may further accelerate with Cython or by using dedicated C-based formatters, but not within pure Python constraints.**
    codeflash-ai bot (Contributor) commented May 19, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 46% (0.46x) speedup for _pipe_line_with_colons in codeflash/code_utils/tabulate.py

    ⏱️ Runtime: 1.15 milliseconds → 789 microseconds (best of 265 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch proper-cleanup).

    codeflash-ai bot added a commit that referenced this pull request May 19, 2025
    Here’s an optimized version of your function.
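
    A minimal sketch of the described change, not the bot's actual diff:

    def _isbool(string) -> bool:
        """Detect bools and their literal string spellings."""
        if type(string) is bool:  # identity check kept for speed, as noted below
            return True
        # "True"/"False" only occur as str here, so bytes are not checked, and direct
        # comparisons avoid building a set on every call.
        return isinstance(string, str) and (string == "True" or string == "False")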
    
    
    
    **Optimization notes:**
    - `type(x) is y` is retained for speed.
    - `set` instantiation is avoided in favor of direct comparison for `"True"` and `"False"` when dealing with strings.
    - Check for `str` directly rather than combining with `bytes`, as `"True"` and `"False"` are not valid byte representations.
    - If checking `bytes` is strictly required (not functionally needed per the literal values in the set), let me know.
    codeflash-ai bot (Contributor) commented May 19, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 28% (0.28x) speedup for _isbool in codeflash/code_utils/tabulate.py

    ⏱️ Runtime: 41.2 microseconds → 32.2 microseconds (best of 98 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch proper-cleanup).

    codeflash-ai bot added a commit that referenced this pull request May 19, 2025
    Here is an optimized version of `_strip_ansi` focusing on runtime speed. The main bottleneck is the regular expression replacement in `re.sub`, specifically returning the (possibly empty) group 4 (link text in hyperlinks) or, if it doesn’t match, an empty string for ANSI codes.  
    This can be significantly sped up by avoiding the costly `r"\4"` (which always triggers group resolution machinery in the regex engine), and instead using a faster replacer callback.  
    Since every match will be either an ANSI escape code (where group 4 is `None`) or a hyperlink (where group 4 contains the visible link text), we can handle both cases in one simple function.
    
    Optimized code:
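
    A minimal sketch of the replacer-callback idea, not the bot's actual diff; _ansi_codes and _ansi_codes_bytes are assumed to be the module's precompiled patterns whose group 4 captures the visible text of terminal hyperlinks:

    def _strip_ansi(s):
        if isinstance(s, str):
            # Plain ANSI escapes have no group 4 -> drop the whole match;
            # hyperlinks keep only their visible link text.
            return _ansi_codes.sub(lambda m: m.group(4) or "", s)
        return _ansi_codes_bytes.sub(lambda m: m.group(4) or b"", s)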
    
    
    
    **Performance notes:**  
    - This avoids all regex group substitution machinery for the common ANSI case.
    - No change to visible/functional behavior.
    - No changes to external function names or signatures.
    - String and bytes cases are handled separately, so no unnecessary type checks inside tight loops.
    
    **Comment:**  
    No original comments were changed or removed.  
    No changes made to the public interface or expected output.  
    All logic concerning group 4 and escape sequence removal is preserved.
    codeflash-ai bot (Contributor) commented May 19, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 116% (1.16x) speedup for _strip_ansi in codeflash/code_utils/tabulate.py

    ⏱️ Runtime: 4.98 milliseconds → 2.30 milliseconds (best of 164 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch proper-cleanup).

    @KRRT7 changed the title from "cleanup concolic dirs properly, add recommit" to "cleanup concolic dirs properly, add precommit" on May 19, 2025
    @KRRT7 force-pushed the proper-cleanup branch from bc71618 to 191317b on May 19, 2025 18:28
    codeflash-ai bot added a commit that referenced this pull request May 19, 2025
    … by 148% in PR #217 (`proper-cleanup`)
    
    Below is an optimized version of your program with respect to the provided line profiler results.  
    The major bottleneck is `self._connection.commit()` and, to a lesser extent, `cur.executemany(...)`.  
    We can **greatly accelerate SQLite bulk inserts** and commits by:
    
    - Disabling SQLite's default autocommit mode and wrapping the bulk inserts in a single explicit transaction.
    - Using `with self._connection`, which ensures a transaction/commit automatically or using `begin`/`commit` explicitly.
    - Setting SQLite's `PRAGMA synchronous = OFF` and `PRAGMA journal_mode = MEMORY` if durability is not absolutely required, since this will make writes much faster (you may enable this once per connection only).
    
    **Note:**  
    – These changes keep the function return value and signature identical.  
    – Connection and PRAGMA are set only once per connection.  
    – All existing comments are preserved, and new comments only explain modifications.
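
    A minimal sketch of that explicit-transaction variant, not the bot's actual diff (table name and column count are placeholders; the PRAGMAs are shown inline but would be set once per connection in practice):

    def write_benchmark_timings(self) -> None:
        if not self.benchmark_timings:
            return
        with self._connection:  # opens a transaction and commits it on success
            cur = self._connection.cursor()
            try:
                cur.execute("PRAGMA synchronous = OFF")    # remove if durability is required
                cur.execute("PRAGMA journal_mode = MEMORY")
                cur.executemany(
                    "INSERT INTO benchmark_timings VALUES (?, ?, ?, ?)",  # placeholder schema
                    self.benchmark_timings,
                )
            finally:
                cur.close()
        self.benchmark_timings.clear()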
    
    
    
    
    ### Why this is faster
    - **Explicit transaction**: `self._connection.execute('BEGIN')` + one `commit()` is far faster than relying on SQLite's default behavior.
    - **PRAGMA tweaks**: `synchronous = OFF` and `journal_mode = MEMORY` massively reduce disk sync/write overhead for benchmark data.
    - **Batching**: Still using `executemany` for the most efficient bulk insert.
    - **Single cursor, closed immediately**.
    
    *If your use case absolutely requires durability against power loss, remove the two PRAGMA settings (or use `WAL` and `NORMAL` modes). This code retains the exact logic and return values, but will be considerably faster in typical benchmarking scenarios.*
    codeflash-ai bot (Contributor) commented May 19, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 148% (1.48x) speedup for CodeFlashBenchmarkPlugin.write_benchmark_timings in codeflash/benchmarking/plugin/plugin.py

    ⏱️ Runtime: 25.9 milliseconds → 10.5 milliseconds (best of 91 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch proper-cleanup).

    @KRRT7 force-pushed the proper-cleanup branch 2 times, most recently from 43d1056 to f513763 on May 20, 2025 00:55
    @KRRT7 marked this pull request as ready for review on May 20, 2025 01:24
    github-actions bot: Persistent review updated to latest commit f513763

    @KRRT7 force-pushed the proper-cleanup branch 2 times, most recently from 5a2b80b to 8128559 on May 20, 2025 05:25
    @KRRT7 force-pushed the proper-cleanup branch from 48716a1 to 0ba52ea on May 21, 2025 01:40
    @KRRT7 requested a review from misrasaurabh1 on May 21, 2025 01:46
    @misrasaurabh1 merged commit e19f3a2 into main on May 21, 2025 (17 checks passed)
    @KRRT7 deleted the proper-cleanup branch on May 21, 2025 05:35

    Labels

    Review effort 5/5, workflow-modified (This PR modifies GitHub Actions workflows)
