@misrasaurabh1 misrasaurabh1 commented Aug 28, 2025

PR Type

Enhancement


Description

  • Introduce pytest_split for test-file splitting

  • Detect pytest runs and split tests by CPU count

  • Spawn parallel tracing subprocesses per split

  • Aggregate multiple replay test paths and replay


Diagram Walkthrough

flowchart LR
  A["tracer main"] -- "detect pytest run" --> B["pytest_split"]
  B -- "multiple splits" --> C["spawn subprocesses"]
  C -- "wait & load pickles" --> D["collect replay paths"]
  D -- "invoke codeflash replay" --> E["optimizer.run"]
  A -- "non-pytest or single" --> F["single-process trace"]

File Walkthrough

Relevant files
Enhancement
tracer.py
Add parallel pytest tracing logic                                               

codeflash/tracer.py

  • Import pytest_split helper function
  • Refactor args_dict setup before subprocess calls
  • Detect pytest invocation and split test paths
  • Launch multiple or single subprocesses based on splits
  • Collect and combine multiple replay_test_paths
  • Update sys.argv to include all replay test files
+91/-39 
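
The "launch multiple subprocesses" step can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `run_splits` is a hypothetical helper, and the child commands are stand-ins for the real tracer invocation, which would carry one test split each.

```python
import subprocess
import sys


def run_splits(commands):
    """Start one child process per command, then wait for all of them.

    Starting every process before waiting on any of them is what lets the
    splits run in parallel. Returns the exit codes in command order.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


# Stand-in commands (assumption): each child just exits with a fixed code.
commands = [
    [sys.executable, "-c", f"import sys; sys.exit({code})"]
    for code in (0, 0, 3)
]
exit_codes = run_splits(commands)
```

A caller would likely inspect the exit codes before loading any per-split results, since a failed split may not have written its output.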
pytest_parallelization.py
Implement pytest test splitting logic                                       

codeflash/tracing/pytest_parallelization.py

  • Add pytest_split to parse pytest args and paths
  • Recursively gather test_*.py files
  • Calculate splits based on CPU count and file count
  • Distribute test files into balanced groups
+81/-0   
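
One way to gather test files and distribute them into balanced groups is sketched below. The names `gather_test_files` and `split_evenly` are hypothetical and the round-robin strategy is an assumption; the PR's `pytest_split` may balance groups differently.

```python
import os
from pathlib import Path


def gather_test_files(paths):
    """Collect test_*.py files from a mix of files and directories."""
    found = []
    for raw in paths:
        p = Path(raw)
        if p.is_dir():
            # Recurse into the directory, sorted for a stable order.
            found.extend(sorted(p.rglob("test_*.py")))
        elif p.is_file():
            found.append(p)
    return found


def split_evenly(items, max_groups=None):
    """Deal items round-robin into at most max_groups non-empty groups.

    max_groups defaults to the CPU count. Group sizes differ by at most one.
    """
    if not items:
        return []
    if max_groups is None:
        max_groups = os.cpu_count() or 1
    n = max(1, min(max_groups, len(items)))
    return [items[i::n] for i in range(n)]
```

Capping the group count at the number of files avoids spawning workers that would have nothing to run.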

Signed-off-by: Saurabh Misra <[email protected]>

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Missing Flag Update

The flag added_paths is never set to True after extending updated_sys_argv, causing repeated insertion of the same test splits for each matching element.

added_paths = False
updated_sys_argv = []
for elem in sys.argv:
    if elem in test_paths_set:
        if not added_paths:
            updated_sys_argv.extend(test_split)
Nil Split Handling

There is no check for pytest_split returning (None, None) before calling len(pytest_splits), which could raise a TypeError and crash the process.

if parsed_args.module and unknown_args[0] == "pytest":
    pytest_splits, test_paths = pytest_split(unknown_args[1:])
    print(pytest_splits)

if len(pytest_splits) > 1:
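
A guard along these lines would avoid calling `len()` on `None`. It is only a sketch: `choose_mode` is a hypothetical helper, and the `(splits, paths)` tuple shape is taken from the snippet above.

```python
def choose_mode(split_result):
    """Decide between parallel and single-process tracing.

    split_result is the (splits, paths) tuple returned by the splitter,
    which may be (None, None) when no test files could be resolved.
    """
    splits, _paths = split_result
    if splits is None or len(splits) <= 1:
        return "single"
    return "parallel"
```

Falling back to the single-process path keeps the tracer usable when splitting is not possible.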
Invalid Parser Usage

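Calling pytest.Parser() directly relies on unsupported use of the API. pytest exports Parser (since 7.0) for type annotations only and does not support constructing it directly, and on older versions the attribute is missing, which raises an AttributeError. Consider a supported way of separating test paths from pytest options.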

import pytest

parser = pytest.Parser()

pytest_args = parser.parse_known_args(arguments)
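
One alternative that avoids constructing pytest's parser is a filesystem heuristic, sketched below. It is not pytest's own argument parsing: `partition_pytest_args` is a hypothetical helper, and it assumes that any non-flag argument naming an existing path (optionally followed by a `::node` suffix) is a test target.

```python
from pathlib import Path


def partition_pytest_args(arguments):
    """Split pytest arguments into (test paths, everything else).

    Heuristic: an argument is a test path if it does not start with "-"
    and the part before any "::" exists on disk.
    """
    paths, others = [], []
    for arg in arguments:
        candidate = arg.split("::", 1)[0]
        if not arg.startswith("-") and Path(candidate).exists():
            paths.append(arg)
        else:
            others.append(arg)
    return paths, others
```

The heuristic can misclassify an option value that happens to match an existing path, so it would need care around options such as `--rootdir`.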


PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Prevent duplicate test arguments

After extending updated_sys_argv with a split, set added_paths = True so you don’t
repeatedly inject the same tests on subsequent matches. This prevents duplicated
test arguments in each subprocess call.

codeflash/tracer.py [119-126]

 added_paths = False
 updated_sys_argv = []
 for elem in sys.argv:
     if elem in test_paths_set:
         if not added_paths:
             updated_sys_argv.extend(test_split)
+            added_paths = True
     else:
         updated_sys_argv.append(elem)
Suggestion importance[1-10]: 8

Why: Setting added_paths = True prevents repeatedly injecting test_split into updated_sys_argv, fixing duplicate test arguments in subprocess calls.
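
A standalone version of the loop with the flag set is sketched here. `inject_split` is a hypothetical wrapper; the nested check is kept so that later matching paths are skipped rather than falling through to the `else` branch and being appended again.

```python
def inject_split(argv, test_paths, test_split):
    """Replace the original test paths in argv with one split, exactly once."""
    updated = []
    added_paths = False
    for elem in argv:
        if elem in test_paths:
            if not added_paths:
                # Insert the split where the first original path appeared.
                updated.extend(test_split)
                added_paths = True
            # Later original paths are dropped.
        else:
            updated.append(elem)
    return updated
```

For example, with two original test directories and a split containing one file, the result contains that file once and neither directory.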

Medium
General
Rebuild pytest args from unknown_args

Instead of iterating over the full sys.argv, rebuild the pytest invocation from
unknown_args to ensure only the intended test paths and flags are included. This
simplifies argument handling and avoids mixing tracer flags with pytest flags.

codeflash/tracer.py [121-126]

-for elem in sys.argv:
-    if elem in test_paths_set:
-        if not added_paths:
-            updated_sys_argv.extend(test_split)
-    else:
-        updated_sys_argv.append(elem)
+# Rebuild pytest command with split tests
+base_pytest = unknown_args[0]
+other_args = [arg for arg in unknown_args[1:] if arg not in test_paths_set]
+updated_sys_argv = [base_pytest, *other_args, *test_split]
Suggestion importance[1-10]: 4

Why: Rebuilding from unknown_args could simplify argument handling, but it risks dropping tracer-specific flags and may not cover all edge cases.
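
If the rebuild approach were adopted, the tracer's own flags would have to be carried over explicitly. The sketch below assumes a hypothetical `tracer_prefix` list holding those flags; nothing in the PR defines such a value.

```python
def rebuild_argv(tracer_prefix, unknown_args, test_paths, test_split):
    """Rebuild the command line for one split from the unparsed pytest args.

    tracer_prefix (assumption): the tracer's own arguments, which would be
    lost if the command were rebuilt from unknown_args alone.
    """
    runner = unknown_args[0]
    passthrough = [arg for arg in unknown_args[1:] if arg not in test_paths]
    return [*tracer_prefix, runner, *passthrough, *test_split]
```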

Low


2 participants