
feat: add Gradle build tool support for Java optimization#1774

Open
HeshamHM28 wants to merge 47 commits into main from feat/gradle-executor-from-java


@HeshamHM28
Contributor

Summary

  • Adds full Gradle executor support alongside existing Maven support, using init scripts to inject configuration without modifying project build files
  • Implements Gradle compilation, classpath extraction, test execution, multi-module detection, runtime JAR installation, and JaCoCo coverage — all via --init-script
  • Dispatches between Maven and Gradle at the entry points (run_behavioral_tests, run_benchmarking_tests, run_line_profile_tests) based on detect_build_tool()
  • Reuses the existing build-tool-agnostic direct JVM execution path (_run_tests_direct) for the fast compile-once-run-many pattern
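
The dispatch described in the summary can be pictured with a small sketch. It is illustrative only: `BuildTool`, `detect_build_tool_sketch`, and the marker-file list are hypothetical names and assumptions, not the PR's actual `detect_build_tool()` implementation, and giving Gradle markers precedence over `pom.xml` is likewise an assumption.

```python
from enum import Enum
from pathlib import Path


class BuildTool(Enum):
    MAVEN = "maven"
    GRADLE = "gradle"
    UNKNOWN = "unknown"


# Assumed marker files; the real detection logic may look at more than this.
_GRADLE_MARKERS = (
    "build.gradle",
    "build.gradle.kts",
    "settings.gradle",
    "settings.gradle.kts",
)


def detect_build_tool_sketch(project_root: Path) -> BuildTool:
    """Guess the build tool from files present in the project root."""
    if any((project_root / name).is_file() for name in _GRADLE_MARKERS):
        return BuildTool.GRADLE
    if (project_root / "pom.xml").is_file():
        return BuildTool.MAVEN
    return BuildTool.UNKNOWN
```

An entry point would then branch once on the returned value and hand off to the Maven or Gradle code path, keeping the rest of the runner build-tool-agnostic.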

Files changed

File            Changes
build_tools.py  Init script management, classpath extraction, project info, runtime install, JaCoCo, multi-module detection
test_runner.py  Gradle compilation/execution, entry point dispatch, generalized multi-module root detection
config.py       Gradle compiler settings detection

Test plan

  • All 788 existing Java tests pass (0 regressions)
  • All 1499 language tests pass
  • Verified on simple Gradle project: build tool detection, project info parsing, compilation, classpath extraction, test execution, direct JVM execution
  • ruff check and ruff format pass
  • End-to-end codeflash optimize test blocked by SSL certificate issue in environment (API unreachable)
  • Test on multi-module Gradle project (rewrite repo requires Java 8 toolchain)

🤖 Generated with Claude Code

@CLAassistant

CLAassistant commented Mar 5, 2026

CLA assistant check
All committers have signed the CLA.

@claude
Contributor

claude bot commented Mar 5, 2026

Claude finished @HeshamHM28's task in 6m 41s


PR Review Summary

  • Triage: assess scope (substantive changes)
  • Lint & type check
  • Resolve stale threads (none found)
  • Code review
  • Duplicate detection
  • Test coverage
  • Merge optimization PRs (none open)

Prek Checks

ruff check and ruff format — clean, no issues.

Mypy found 2 pre-existing errors in function_optimizer.py (not introduced by this PR):

  • codeflash.languages.base does not export Language (used via indirect import)
  • LanguageSupport has no attribute replace_function_definitions (called via protocol dispatch at line 438)

Neither is new — both are pre-existing mypy gaps not caused by this PR.


Code Review

Bug 1 (Medium): setup_coverage still uses raw brace counting for plugins {} block

gradle_strategy.py:781-793: the plugins { } injection still walks the string character by character, counting { and }. This is the same class of bug that add_codeflash_dependency had before it was fixed with tree-sitter. The scan cannot tell structural braces from braces inside string literals or comments. A balanced interpolation such as id "com.github.spotbugs" version "${spotbugsVersion}" happens to cancel out, because both its { and its } are counted, but a lone } inside a string or comment terminates the scan early and the injection lands in the wrong place. add_codeflash_dependency was fixed with tree-sitter; setup_coverage should locate the plugins {} block the same way.

# gradle_strategy.py:784-793: counts braces inside string literals and comments too
for i in range(plugins_idx, len(content)):
    if content[i] == "{":
        brace_depth += 1
    elif content[i] == "}":
        brace_depth -= 1
        if brace_depth == 0:
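
To make the failure mode concrete, here is a minimal sketch comparing raw brace counting with a scan that skips quoted strings and `//` comments. Both function names are hypothetical, and the second scanner is a lightweight illustration rather than a substitute for the tree-sitter approach recommended above: it does not handle triple-quoted strings, `/* */` comments, or quotes nested inside `${...}` interpolation.

```python
def naive_block_end(content: str, open_idx: int) -> int:
    """Raw brace counting: every '{' and '}' counts, wherever it appears."""
    depth = 0
    for i in range(open_idx, len(content)):
        if content[i] == "{":
            depth += 1
        elif content[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


def find_block_end(content: str, open_idx: int) -> int:
    """Brace matching that ignores braces in quoted strings and // comments."""
    depth = 0
    i = open_idx
    n = len(content)
    while i < n:
        ch = content[i]
        if ch in ('"', "'"):
            # Skip to the closing quote, stepping over backslash escapes.
            quote = ch
            i += 1
            while i < n and content[i] != quote:
                if content[i] == "\\":
                    i += 1
                i += 1
        elif content.startswith("//", i):
            # Skip the rest of the line.
            newline = content.find("\n", i)
            if newline == -1:
                return -1
            i = newline
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1
```

On a block whose comment contains a stray `}`, `naive_block_end` stops at the brace inside the comment, while `find_block_end` reaches the brace that actually closes the block.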

Bug 2 (Medium): jacocoTestReport not qualified with module prefix in multi-module builds

gradle_strategy.py:604 — when test_module is set, the test task is correctly qualified as :{test_module}:test, but jacocoTestReport is appended as an unqualified task name. In multi-module Gradle projects, this causes Gradle to run jacocoTestReport for all subprojects (or fail if the root project doesn't have the jacoco plugin applied). It should be :{test_module}:jacocoTestReport.

# Current (line 604):
cmd.append("jacocoTestReport")

# Should be:
cmd.append(f":{test_module}:jacocoTestReport" if test_module else "jacocoTestReport")
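
One way to keep the two task names consistent is to route both through a single helper. This is a sketch under assumptions: `qualify_task` and `coverage_tasks` are hypothetical names, and `test_module` is assumed to be a single Gradle project name rather than a nested path.

```python
from typing import List, Optional


def qualify_task(task: str, test_module: Optional[str]) -> str:
    """Prefix a task with its Gradle project path when a module is given."""
    return f":{test_module}:{task}" if test_module else task


def coverage_tasks(test_module: Optional[str]) -> List[str]:
    """Tasks for a coverage run, qualified the same way for both entries."""
    return [
        qualify_task("test", test_module),
        qualify_task("jacocoTestReport", test_module),
    ]
```

With one helper, a future task added to the coverage run cannot silently miss the module prefix.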

Bug 3 (Low): _test_framework set twice in ensure_runtime_environment

support.py:548: self._test_framework = config.test_framework is assigned in ensure_runtime_environment, but setup_test_config (line 410) makes the same assignment. Both paths call detect_java_project. This is harmless but redundant. If the two call sites ever resolve different configs (e.g., project_root_path vs project_root), the later assignment would silently overwrite the earlier one. Worth a comment or deduplication.

Observation: Module-level mutable caches in gradle_strategy.py and maven_strategy.py

Both files have module-level singletons (_classpath_cache, _multimodule_deps_installed, _skip_validation_init_path) that persist for the lifetime of the process. This is correct for production use (avoid re-running Gradle to get the classpath on every candidate). However, tests that mock or stub these paths must be careful to reset them between runs. The CompilationCache in test_runner.py uses a proper clear() method; consider doing the same for these dicts if test isolation ever becomes a problem.

Observation: _get_project_classpath cache pattern is unusual

function_optimizer.py:355-375: the cache uses hasattr(self, "_cached_project_classpath"), while the class body declares _cached_project_classpath: str | None as a bare class-level annotation. A bare annotation assigns no value, so it creates no class attribute: the hasattr check at line 363 returns False until the instance-level assignment at line 373 (self._cached_project_classpath = classpath) runs, and the cache works as intended. It is subtly fragile, though. Giving the annotation a default such as = None would make hasattr always return True, and the classpath would never be computed. A more robust pattern is to initialize the attribute to a sentinel like _UNSET = object() and compare against it.
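
The difference between a bare annotation and an annotation with a default is easy to check in isolation. The classes below are hypothetical stand-ins, not the code in function_optimizer.py, and the classpath string is a placeholder.

```python
from typing import Any, Optional


class WithBareAnnotation:
    _cached: Optional[str]  # annotation only: no class attribute is created


class WithDefault:
    _cached: Optional[str] = None  # the default creates a class attribute


_UNSET = object()  # sentinel meaning "not computed yet"


class WithSentinel:
    def __init__(self) -> None:
        self._cached: Any = _UNSET
        self.computations = 0

    def get(self) -> Any:
        if self._cached is _UNSET:
            self.computations += 1
            self._cached = "lib/a.jar:lib/b.jar"  # placeholder for the real lookup
        return self._cached
```

The sentinel version also distinguishes "not computed" from a legitimately computed `None`, which a `hasattr` check or a `None` default cannot do.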


Duplicate Detection

MEDIUM — add_codeflash_dependency defined in two files with different signatures

  • maven_strategy.py:155: add_codeflash_dependency(pom_path: Path) -> bool (Maven XML insertion)
  • gradle_strategy.py:183: add_codeflash_dependency(build_file: Path, runtime_jar_path: Path) -> bool (Gradle tree-sitter insertion)
  • __init__.py:55 imports the Maven version only

These are intentionally different (different build file formats), so the name collision is the issue. Since both are exported/imported, it's easy to accidentally call the wrong one. Consider renaming the Gradle one to add_codeflash_gradle_dependency for clarity.

No other duplicates detected. _extract_modules_from_pom_content and _extract_modules_from_settings_gradle in test_runner.py are the single source of truth; other module detection code defers to them.


Test Coverage

File                    Statements  Miss  Coverage
gradle_strategy.py      444         335   25%
maven_strategy.py       493         279   43%
build_tool_strategy.py  66          13    80%
test_runner.py          736         309   58%
function_optimizer.py   228         153   33%

gradle_strategy.py at 25% is expected for a new file that requires a real Gradle project to exercise the I/O paths. The GradleStrategy methods that need coverage: ensure_runtime, install_multi_module_deps, get_classpath, run_tests_via_build_tool, run_benchmarking_via_build_tool, setup_coverage. These are all integration-level paths that are hard to unit test without mocking subprocess, but unit tests for the pure-logic helpers (_parse_classpath_output, get_reports_dir, get_build_output_dir) and the tree-sitter dependency injection (_find_top_level_dependencies_block, add_codeflash_dependency) would be straightforward to add.


Overall: The refactor to a strategy pattern is clean and the Gradle support looks solid. The two medium bugs (brace counting in setup_coverage and unqualified jacocoTestReport in multi-module builds) should be fixed before merging, as the E2E validation on Netflix Eureka/Zuul has already confirmed both are triggered in real projects.

@codeflash-ai
Contributor

codeflash-ai bot commented Mar 6, 2026

⚡️ Codeflash found optimizations for this PR

📄 73% (0.73x) speedup for _extract_java_version_from_gradle in codeflash/languages/java/build_tools.py

⏱️ Runtime : 1.65 milliseconds → 954 microseconds (best of 243 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 7, 2026

⚡️ Codeflash found optimizations for this PR

📄 112% (1.12x) speedup for get_gradle_test_reports_dir in codeflash/languages/java/build_tools.py

⏱️ Runtime : 18.0 milliseconds → 8.47 milliseconds (best of 130 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 7, 2026

⚡️ Codeflash found optimizations for this PR

📄 84% (0.84x) speedup for get_gradle_test_classes_dir in codeflash/languages/java/build_tools.py

⏱️ Runtime : 25.9 milliseconds → 14.1 milliseconds (best of 124 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


claude bot added a commit that referenced this pull request Mar 7, 2026
…2026-03-07T00.07.24

⚡️ Speed up function `get_gradle_test_reports_dir` by 112% in PR #1774 (`feat/gradle-executor-from-java`)
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 7, 2026

This PR is now faster! 🚀 @claude[bot] accepted my optimizations from:

@codeflash-ai
Contributor

codeflash-ai bot commented Mar 9, 2026

⚡️ Codeflash found optimizations for this PR

📄 413% (4.13x) speedup for GradleStrategy.get_reports_dir in codeflash/languages/java/test_runner.py

⏱️ Runtime : 10.2 milliseconds → 1.98 milliseconds (best of 172 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 9, 2026

⚡️ Codeflash found optimizations for this PR

📄 28% (0.28x) speedup for _get_strategy in codeflash/languages/java/test_runner.py

⏱️ Runtime : 2.58 milliseconds → 2.02 milliseconds (best of 150 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


- Updated tests in `test_java_multimodule_deps_install.py` to utilize `MavenStrategy` for installing multi-module dependencies.
- Changed function calls from `ensure_multi_module_deps_installed` to `MavenStrategy.install_multi_module_deps`.
- Added a fixture for `MavenStrategy` to streamline test setup.
- Modified assertions and mock setups to align with the new strategy implementation.

- Refactored tests in `test_java_test_filter_validation.py` to replace `_run_maven_tests` with `MavenStrategy.run_tests_via_build_tool`.
- Adjusted test cases to ensure proper handling of empty and valid test filters.
- Updated mock setups for Maven executable and command execution to reflect changes in the strategy.
@HeshamHM28 force-pushed the feat/gradle-executor-from-java branch from 41a4e02 to f4a4ac6 on March 9, 2026 21:31
github-actions bot and others added 2 commits March 9, 2026 21:38
Replacing the chained `/` operator (`build_root / test_module / "target"`) with `build_root.joinpath(test_module, _TARGET)` eliminates the intermediate Path object created after `build_root / test_module`, cutting per-call overhead from ~15.9 µs to ~10.4 µs in the hot path (3323 hits). The profiler shows the hot line dropped from 97.3% to 96.3% of runtime, and hoisting `"target"` into a module constant `_TARGET` avoids repeated string allocations. Runtime improved 51% (9.70 ms → 6.40 ms) with no functional regressions across all test cases.
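
The two path constructions named in this commit message produce equal results, which a short sketch can confirm. The function names here are hypothetical wrappers around the two expressions quoted above; the sketch shows equivalence only and does not reproduce the timing figures.

```python
from pathlib import Path

_TARGET = "target"  # module-level constant, as described in the commit message


def build_output_dir_chained(build_root: Path, test_module: str) -> Path:
    # Creates an intermediate Path for build_root / test_module.
    return build_root / test_module / "target"


def build_output_dir_joined(build_root: Path, test_module: str) -> Path:
    # Joins all segments in one call, with no intermediate Path object.
    return build_root.joinpath(test_module, _TARGET)
```

Because `joinpath` accepts several segments at once, the change is behavior-preserving for ordinary relative module names.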
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 9, 2026

⚡️ Codeflash found optimizations for this PR

📄 52% (0.52x) speedup for MavenStrategy.get_build_output_dir in codeflash/languages/java/maven_strategy.py

⏱️ Runtime : 9.70 milliseconds → 6.40 milliseconds (best of 148 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


…2026-03-09T22.13.14

⚡️ Speed up method `MavenStrategy.get_build_output_dir` by 52% in PR #1774 (`feat/gradle-executor-from-java`)
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 9, 2026

The optimized code eliminates two allocation-heavy steps in `_validate_test_filter`: building an intermediate list via `[p.strip() for p in test_filter.split(",")]` and unconditionally calling `pattern.replace("*", "A")` even when no wildcard exists. By iterating directly over `split(",")` and guarding `replace` with an `if "*" in pattern` check, the hot loop avoids ~2.5 ms of string allocations per 1000-pattern call (profiler shows the list comprehension took 11.6% of original time). Additionally, replacing `bool(_VALID_JAVA_CLASS_NAME.match(...))` with `... is not None` in both functions removes unnecessary type conversions, though the filter loop accounts for the bulk of the 26% runtime improvement across Maven test-execution paths that validate comma-separated test filters.
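
The shape of the optimized loop described above can be sketched as follows. This is an approximation under assumptions: the regular expression is a stand-in for the project's `_VALID_JAVA_CLASS_NAME`, the function name is hypothetical, and returning `bool` (rather than raising) is assumed.

```python
import re

# Assumed pattern for a dotted Java class name; the real regex may differ.
_VALID_JAVA_CLASS_NAME = re.compile(
    r"^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*$"
)


def validate_test_filter_sketch(test_filter: str) -> bool:
    """Check each comma-separated pattern without building an intermediate list."""
    for raw in test_filter.split(","):
        pattern = raw.strip()
        # Only pay for replace() when a wildcard is actually present.
        if "*" in pattern:
            pattern = pattern.replace("*", "A")
        # `is None` avoids the bool() conversion of the match object.
        if _VALID_JAVA_CLASS_NAME.match(pattern) is None:
            return False
    return True
```

Returning on the first invalid pattern also means later patterns are never stripped or matched, which the list-comprehension form would have done up front.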
@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Mar 9, 2026

⚡️ Codeflash found optimizations for this PR

📄 27% (0.27x) speedup for _validate_test_filter in codeflash/languages/java/test_runner.py

⏱️ Runtime : 3.14 milliseconds → 2.48 milliseconds (best of 98 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 9, 2026

⚡️ Codeflash found optimizations for this PR

📄 13% (0.13x) speedup for _run_direct_or_fallback in codeflash/languages/java/test_runner.py

⏱️ Runtime : 3.92 milliseconds → 3.48 milliseconds (best of 156 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


…2026-03-09T22.52.33

⚡️ Speed up function `_validate_test_filter` by 27% in PR #1774 (`feat/gradle-executor-from-java`)
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 9, 2026

This PR is now faster! 🚀 @claude[bot] accepted my optimizations from:

HeshamHM28 and others added 7 commits March 12, 2026 08:46
Matches Maven's -Dmaven.test.failure.ignore=true and -DfailIfNoTests=false
so coverage runs complete jacocoTestReport and multi-module --tests filters
don't abort on modules with no matching tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace fragile brace-counting string manipulation with tree-sitter
Groovy/Kotlin parsers to find the top-level dependencies block. This
correctly ignores nested blocks inside buildscript, subprojects, etc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
-x taskName fails if the task doesn't exist, breaking projects without
those plugins. Init script safely disables tasks only if present, matching
Maven's -Dcheckstyle.skip=true behavior which is silently ignored.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
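
The init-script approach from the commit message above can be sketched as a Python helper that renders the Groovy and builds the Gradle command. This is not the PR's actual `_GRADLE_SKIP_VALIDATION_INIT_SCRIPT`: the helper names, the use of `tasks.matching { ... }.configureEach`, and the `./gradlew` wrapper path are assumptions. The task names are the ones listed later in this thread.

```python
from pathlib import Path
from typing import List, Sequence

# Task names taken from the list of currently disabled tasks in this thread.
_SKIP_TASKS = (
    "checkstyleMain", "checkstyleTest", "spotbugsMain", "spotbugsTest",
    "pmdMain", "pmdTest", "rat", "japicmp",
)


def render_skip_validation_init_script(task_names: Sequence[str] = _SKIP_TASKS) -> str:
    """Render a Groovy init script that disables the tasks only if they exist."""
    names = ", ".join(f"'{name}'" for name in task_names)
    return (
        "allprojects {\n"
        # matching {} selects nothing when a task is absent, unlike `-x taskName`,
        # which fails the build if the task does not exist.
        f"    tasks.matching {{ it.name in [{names}] }}.configureEach {{\n"
        "        enabled = false\n"
        "    }\n"
        "}\n"
    )


def gradle_command(init_script: Path, tasks: List[str]) -> List[str]:
    """Build the command line; assumes the project ships a Gradle wrapper."""
    return ["./gradlew", "--init-script", str(init_script), *tasks]
```

A caller would write the rendered script to a temporary file and pass its path to `gradle_command`, leaving the project's own build files untouched.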
@mashraf-222
Contributor

Blocker: Error Prone -Werror causes compilation failures on Zuul (82.5% of processed functions) and likely on any project that combines Error Prone with -Werror

Summary

Ran E2E --all optimization on Netflix Zuul (zuul-core, 820 functions) with the latest 5 commits on this branch. 82.5% of the processed functions (297 of 360) fail to compile because codeflash's instrumentation variables (_cf_*) trigger Error Prone [UnusedVariable] warnings, which become fatal compilation errors due to -Werror in Zuul's build.gradle.

The current _GRADLE_SKIP_VALIDATION_INIT_SCRIPT disables task-based static analysis plugins (checkstyle, spotbugs, pmd, rat, japicmp), but Error Prone is a compiler plugin — it runs inside javac via JavaCompile task options, not as a standalone Gradle task. The init script doesn't touch it.

What's happening

Zuul's build.gradle configures Error Prone as a compiler plugin with -Werror:

tasks.withType(JavaCompile).configureEach {
    dependencies {
        errorprone "com.uber.nullaway:nullaway:0.12.4"
        errorprone "com.google.errorprone:error_prone_core:2.45.0"
    }
    options.compilerArgs << "-Werror"
    options.errorprone { ... }
}

Codeflash instrumentation adds variables like _cf_fn, _cf_mod, _cf_cls, _cf_loop, _cf_test, _cf_outputFile, _cf_iter to the generated test classes. Some of these variables are declared but not read in all code paths, which triggers:

warning: [UnusedVariable] The local variable '_cf_test4' is never read.
        String _cf_test4 = "testServerInstanceAllocation_BypassesConstructor_InstanceNotNull";
               ^
    (see https://errorprone.info/bugpattern/UnusedVariable)

warning: [UnusedVariable] The local variable '_cf_outputFile4' is never read.
        String _cf_outputFile4 = System.getenv("CODEFLASH_OUTPUT_FILE");
               ^

warning: [UnusedVariable] The local variable '_cf_mod2' is never read.
        String _cf_mod2 = "ServerTest__perfinstrumented";
               ^

With -Werror, these warnings become fatal:

error: warnings found and -Werror specified

Scale of impact (from E2E session)

  • Functions processed: 360 / 820
  • Compilation failures: 297 (82.5%)
  • Total [UnusedVariable] warnings: 10,072
  • Total -Werror fatal errors: 349
  • Distinct _cf_* variable names triggering errors: 80+ (e.g., _cf_fn1 to _cf_fn14, _cf_mod1 to _cf_mod14, _cf_cls1 to _cf_cls14, _cf_loop*, _cf_test*, _cf_outputFile*, _cf_iter*)
  • PRs created: 0
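
Counts like the ones above can be tallied from compiler output with a small script. The sketch below is one possible approach, not the tooling used for this session: the function names are hypothetical, and the regular expression is written against the warning format quoted earlier in this comment.

```python
import re
from collections import Counter

# Matches the quoted warning format and captures the variable family,
# e.g. "_cf_test" for "_cf_test4".
_UNUSED_CF_VAR = re.compile(
    r"\[UnusedVariable\] The local variable '(_cf_[A-Za-z]+)\d*' is never read"
)


def tally_cf_unused(log: str) -> Counter:
    """Count [UnusedVariable] warnings per _cf_* variable family."""
    return Counter(m.group(1) for m in _UNUSED_CF_VAR.finditer(log))


def has_werror_failure(log: str) -> bool:
    """True if javac reported that warnings were promoted to errors."""
    return "error: warnings found and -Werror specified" in log
```

Grouping by family rather than by exact name keeps the report readable when the numeric suffix varies per test method.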

What the init script currently covers vs what's needed

Currently disabled (task-based plugins — works correctly):

checkstyleMain, checkstyleTest, spotbugsMain, spotbugsTest,
pmdMain, pmdTest, rat, japicmp

NOT disabled (compiler plugin — Error Prone):

  • options.compilerArgs << "-Werror" on JavaCompile tasks
  • Error Prone checks via options.errorprone { ... }

Suggested fix

Add Error Prone handling to the init script by stripping -Werror from compiler args:

gradle.projectsEvaluated {
    allprojects {
        tasks.withType(JavaCompile) {
            options.compilerArgs.removeAll { it == '-Werror' }
        }
    }
}

Alternatively (or additionally), suppress Error Prone's UnusedVariable check for test compilation:

gradle.projectsEvaluated {
    allprojects {
        tasks.withType(JavaCompile) {
            options.compilerArgs.removeAll { it == '-Werror' }
            if (options.hasProperty('errorprone')) {
                options.errorprone {
                    check('UnusedVariable', net.ltgt.gradle.errorprone.CheckSeverity.OFF)
                    check('EmptyCatch', net.ltgt.gradle.errorprone.CheckSeverity.OFF)
                }
            }
        }
    }
}

Context: What IS working well

For comparison, the same session on Eureka (JUnit 4 project, no Error Prone) produced:

  • 109/411 functions processed, 50 compilations succeeded (46%)
  • 7 speedups found (15.9%, 20.5%, 35.5%, 11.9%, 5.6%, 41.4%, 18.9%)
  • 5 PRs created — the full Gradle pipeline works end-to-end when Error Prone is not present

The bug fixes in this branch (tree-sitter for build.gradle injection, JUnit version detection) are working correctly. Error Prone is the only remaining blocker for projects that use it.

Base automatically changed from omni-java to main March 14, 2026 00:40
@HeshamHM28 requested a review from mashraf-222 on March 17, 2026 02:04
HeshamHM28 and others added 2 commits March 17, 2026 04:12
- Remove duplicate setup_test_config method in JavaSupport
- Fix import sort order in support.py
- Fix superfluous-else-return in test_runner.py
- Add missing newline at end of build_tools.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 17, 2026

⚡️ Codeflash found optimizations for this PR

📄 21% (0.21x) speedup for get_optimized_code_for_module in codeflash/languages/code_replacer.py

⏱️ Runtime : 18.8 milliseconds → 15.5 milliseconds (best of 37 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/gradle-executor-from-java).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 18, 2026

⚡️ Codeflash found optimizations for this PR

📄 22% (0.22x) speedup for _match_module_from_rel_path in codeflash/languages/java/test_runner.py

⏱️ Runtime : 3.77 milliseconds → 3.11 milliseconds (best of 47 runs)

A new Optimization Review has been created.

🔗 Review here

