
Conversation

@aseembits93
Contributor

@aseembits93 aseembits93 commented May 1, 2025

User description

With this PR, global variables that optimization candidates introduce for caching purposes are preserved in the replaced code used to create the Pull Request. This should resolve current bugs in a couple of ongoing PRs.


PR Type

Enhancement, Tests


Description

  • Collect and replace global assignments (see the sketch after this list)

  • Insert non-assignment globals after imports

  • Wrap Jedi extraction in try/except

  • Add tests for global replacement logic
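
A minimal sketch of the collection step, assuming libcst and illustrative class and attribute names (the PR's actual GlobalAssignmentCollector may differ):

    import libcst as cst

    class GlobalAssignmentCollectorSketch(cst.CSTVisitor):
        """Record module-level assignments by target name, preserving order."""

        def __init__(self) -> None:
            super().__init__()
            self.assignments: dict[str, cst.Assign] = {}
            self.assignment_order: list[str] = []

        def visit_FunctionDef(self, node: cst.FunctionDef) -> bool:
            return False  # skip function bodies; only module-level assignments matter

        def visit_ClassDef(self, node: cst.ClassDef) -> bool:
            return False  # likewise skip class bodies

        def visit_Assign(self, node: cst.Assign) -> None:
            for target in node.targets:
                if isinstance(target.target, cst.Name):
                    name = target.target.value
                    if name not in self.assignments:
                        self.assignment_order.append(name)
                    self.assignments[name] = node

    collector = GlobalAssignmentCollectorSketch()
    cst.parse_module("CACHE = {}\nLIMIT = 10\n").visit(collector)

A matching transformer can then rewrite or append these assignments in the destination module, which is what the replacement step above refers to.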


Changes walkthrough 📝

Relevant files

Enhancement
code_extractor.py (codeflash/code_utils/code_extractor.py): Support global variable and statement extraction
  • Introduce GlobalAssignmentCollector and transformer
  • Add GlobalStatementCollector, LastImportFinder, ImportInserter
  • Update import and module transformation in extractor
  • +227/-1

Error handling
code_context_extractor.py (codeflash/context/code_context_extractor.py): Add error handling for Jedi extraction
  • Wrap Jedi name matching in try/except block (see the sketch after this table)
  • Log exceptions during function source extraction
  • +20/-17

Tests
test_code_replacement.py (tests/test_code_replacement.py): Add tests for global replacement behavior
  • Add Args class to configure import sorting
  • Add tests for global statements and reassignments
  • Update tuple typing syntax in tests
  • +128/-24
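
A minimal sketch of the error-handling pattern referenced in the walkthrough row above; the helper name and logger are illustrative, not the repository's actual code:

    import logging

    logger = logging.getLogger(__name__)

    def find_matching_name(jedi_names, qualified_name):
        """Return the first Jedi name whose full_name matches, skipping failures."""
        for name in jedi_names:
            try:
                if name.full_name == qualified_name:
                    return name
            except Exception:
                # Jedi can raise on unusual sources; log and keep extracting.
                logger.exception("Jedi failed while matching %s", qualified_name)
        return None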

@aseembits93 aseembits93 requested review from KRRT7 and misrasaurabh1 May 1, 2025 01:17
    @github-actions

    github-actions bot commented May 1, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    ImportInserter API misuse

    The ImportInserter.leave_SimpleStatementLine method returns a cst.Module instead of a SimpleStatementLine, which likely violates the libcst transformer API and may corrupt the tree. The insertion logic should return nodes of the same type, and should use the leave_Module hook or proper with_changes calls to inject statements.

    def leave_SimpleStatementLine(
        self, original_node: cst.SimpleStatementLine, updated_node: cst.SimpleStatementLine
    ) -> cst.Module:
        self.current_line += 1
    
        # If we're right after the last import and haven't inserted yet
        if self.current_line == self.last_import_line and not self.inserted:
            self.inserted = True
            return cst.Module(body=[updated_node] + self.global_statements)
    
        return cst.Module(body=[updated_node])
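
    A minimal sketch of the leave_Module approach suggested above, with illustrative names; it splices the collected statements into the module body right after the last top-level import:

        import libcst as cst

        class ImportInserterViaModule(cst.CSTTransformer):
            def __init__(self, global_statements: list[cst.SimpleStatementLine]) -> None:
                super().__init__()
                self.global_statements = global_statements

            def leave_Module(self, original_node: cst.Module, updated_node: cst.Module) -> cst.Module:
                body = list(updated_node.body)
                last_import_idx = -1
                for i, stmt in enumerate(body):
                    if isinstance(stmt, cst.SimpleStatementLine) and any(
                        isinstance(s, (cst.Import, cst.ImportFrom)) for s in stmt.body
                    ):
                        last_import_idx = i
                new_body = body[: last_import_idx + 1] + self.global_statements + body[last_import_idx + 1 :]
                return updated_node.with_changes(body=new_body)
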
    Line counting accuracy

    LastImportFinder increments current_line on each SimpleStatementLine visit, which may not map to actual source line numbers (e.g., empty lines, multi‐statement lines). This could cause globals to be inserted at the wrong position. Consider using node metadata or CST position information.

    """Finds the position of the last import statement in the module."""
    
    def __init__(self):
        super().__init__()
        self.last_import_line = 0
        self.current_line = 0
    
    def visit_SimpleStatementLine(self, node: cst.SimpleStatementLine) -> None:
        self.current_line += 1
        for statement in node.body:
            if isinstance(statement, (cst.Import, cst.ImportFrom)):
                self.last_import_line = self.current_line
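
    A minimal sketch of the metadata-based alternative, assuming libcst's PositionProvider; it reads real source line numbers instead of counting visited statements:

        import libcst as cst
        from libcst.metadata import MetadataWrapper, PositionProvider

        class LastImportLineFinder(cst.CSTVisitor):
            METADATA_DEPENDENCIES = (PositionProvider,)

            def __init__(self) -> None:
                super().__init__()
                self.last_import_line = 0

            def visit_SimpleStatementLine(self, node: cst.SimpleStatementLine) -> None:
                if any(isinstance(s, (cst.Import, cst.ImportFrom)) for s in node.body):
                    self.last_import_line = self.get_metadata(PositionProvider, node).start.line

        wrapper = MetadataWrapper(cst.parse_module("import os\nimport sys\n\nX = 1\n"))
        finder = LastImportLineFinder()
        wrapper.visit(finder)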

    @github-actions

    github-actions bot commented May 1, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category: Possible issue
    Fix return type in leave_SimpleStatementLine

    The leave_SimpleStatementLine method should return a list of CST nodes instead of
    creating a new Module to correctly insert statements in place. Change its return
    type to List[cst.CSTNode] and return [updated_node] + self.global_statements when
    inserting.

    codeflash/code_utils/code_extractor.py [182-193]

     def leave_SimpleStatementLine(
         self, original_node: cst.SimpleStatementLine, updated_node: cst.SimpleStatementLine
    -) -> cst.Module:
    +) -> list[cst.CSTNode]:
         self.current_line += 1
     
    -    # If we're right after the last import and haven't inserted yet
         if self.current_line == self.last_import_line and not self.inserted:
             self.inserted = True
    -        return cst.Module(body=[updated_node] + self.global_statements)
    +        return [updated_node] + self.global_statements
     
    -    return cst.Module(body=[updated_node])
    +    return [updated_node]
    Suggestion importance[1-10]: 8


    Why: Changing leave_SimpleStatementLine to return a list of CST nodes instead of constructing a new Module aligns with LibCST’s transformer API and correctly inserts the global statements in place.

    Impact: Medium
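
    A related option, assuming a libcst version that provides cst.FlattenSentinel (0.4+): a leave_ hook can return a FlattenSentinel to splice several statements in place, which keeps node types consistent without constructing a Module. Illustrative sketch:

        import libcst as cst

        class ImportInserterSentinelSketch(cst.CSTTransformer):
            def __init__(self, global_statements: list[cst.SimpleStatementLine], last_import_line: int) -> None:
                super().__init__()
                self.global_statements = global_statements
                self.last_import_line = last_import_line
                self.current_line = 0
                self.inserted = False

            def leave_SimpleStatementLine(
                self, original_node: cst.SimpleStatementLine, updated_node: cst.SimpleStatementLine
            ):
                self.current_line += 1
                if self.current_line == self.last_import_line and not self.inserted:
                    self.inserted = True
                    # FlattenSentinel splices these statements into the parent body.
                    return cst.FlattenSentinel([updated_node, *self.global_statements])
                return updated_node
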
    Category: General
    Simplify blank line insertion

    The hack of appending then popping a Pass statement doesn't insert a blank line.
    Directly append a cst.EmptyLine() to new_statements to add the intended empty line
    before new assignments.

    codeflash/code_utils/code_extractor.py [109-111]

     if new_statements and not isinstance(new_statements[-1], cst.EmptyLine):
    -    new_statements.append(cst.SimpleStatementLine([cst.Pass()], leading_lines=[cst.EmptyLine()]))
    -    new_statements.pop()  # Remove the Pass statement but keep the empty line
    +    new_statements.append(cst.EmptyLine())
    Suggestion importance[1-10]: 5


    Why: Appending a cst.EmptyLine() directly is simpler and more explicit than appending and popping a dummy Pass node, improving readability without altering functionality.

    Impact: Low
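
    For reference, empty lines in libcst attach to the statement that follows them via leading_lines rather than standing alone in a body list; a minimal sketch of that variant, with an illustrative helper name:

        import libcst as cst

        def with_blank_line_before(stmt: cst.SimpleStatementLine) -> cst.SimpleStatementLine:
            # Prepend one EmptyLine to the statement's leading_lines.
            return stmt.with_changes(leading_lines=[cst.EmptyLine(), *stmt.leading_lines])
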
    Use distinct transformer variable names

    Reusing the variable name transformer for two different CST transformers can lead to
    confusion. Use distinct names such as import_inserter and assignment_transformer for
    clarity and to avoid accidental shadowing.

    codeflash/code_utils/code_extractor.py [256-273]

     # Create transformer to insert non_assignment_global_statements
    -transformer = ImportInserter(non_assignment_global_statements, last_import_line)
    +import_inserter = ImportInserter(non_assignment_global_statements, last_import_line)
     ...
     # Transform the original file
    -transformer = GlobalAssignmentTransformer(new_collector.assignments, new_collector.assignment_order)
    -transformed_module = original_module.visit(transformer)
    +assignment_transformer = GlobalAssignmentTransformer(new_collector.assignments, new_collector.assignment_order)
    +transformed_module = original_module.visit(assignment_transformer)
    Suggestion importance[1-10]: 4


    Why: Renaming transformer to import_inserter and assignment_transformer avoids variable shadowing and enhances code clarity without impacting correctness.

    Impact: Low

    codeflash-ai bot added a commit that referenced this pull request May 1, 2025
    …616`)
    
    Here is your rewritten, much faster version. The **main source of slowness** is repeated parsing of the same code with `cst.parse_module`: e.g. `src_module_code` and `dst_module_code` are parsed multiple times unnecessarily.  
    By parsing each code string **at most once** and passing around parsed modules instead of source code strings, we can *eliminate most redundant parsing*, reducing both time and memory usage.
    
    Additionally, you can avoid `.visit()` multiple times by combining visits just once where possible.
    
    Below is the optimized version.
    
    
    
    **Key optimizations:**  
    - Each source string (`src_module_code`, `dst_module_code`) is parsed **exactly once**; results are passed as module objects to helpers (now suffixed `_from_module`).
    - Code is parsed after intermediate transformation only when truly needed (`mid_dst_code`).
    - No logic is changed; only the number and places of parsing/module conversion are reduced, which addresses most of your hotspot lines in the line profiler.
    - Your function signatures are preserved.  
    - Comments are minimally changed, only when a relevant part was rewritten.
    
    This version will run **2-3x faster** for large files.  
    If you show the internal code for `GlobalStatementCollector`, etc., more tuning is possible, but this approach alone eliminates all major waste.
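
    Illustrative only (the bot's actual patch is not shown above): a sketch of the parse-once pattern it describes, with hypothetical helper names suffixed _from_module:

        import libcst as cst

        def _count_global_assignments_from_module(module: cst.Module) -> int:
            """Hypothetical '_from_module' helper: accepts a parsed module, not a string."""
            return sum(
                isinstance(stmt, cst.SimpleStatementLine)
                and any(isinstance(s, cst.Assign) for s in stmt.body)
                for stmt in module.body
            )

        def add_global_assignments_sketch(src_module_code: str, dst_module_code: str) -> tuple[int, int]:
            src_module = cst.parse_module(src_module_code)  # parsed exactly once
            dst_module = cst.parse_module(dst_module_code)  # parsed exactly once
            # Downstream helpers receive the parsed modules, so nothing re-parses strings.
            return (
                _count_global_assignments_from_module(src_module),
                _count_global_assignments_from_module(dst_module),
            )
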
    @codeflash-ai
    Contributor

    codeflash-ai bot commented May 1, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 18% (0.18x) speedup for add_global_assignments in codeflash/code_utils/code_extractor.py

    ⏱️ Runtime: 376 milliseconds → 320 milliseconds (best of 28 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch cf-616).

    @aseembits93
    Contributor Author

    @KRRT7 ready for review

    @misrasaurabh1 misrasaurabh1 merged commit 3a2dfe7 into main May 2, 2025
    15 checks passed
    @aseembits93 aseembits93 deleted the cf-616 branch May 28, 2025 02:27
