Support SolverBenchmarks#7

Open
MaxenceGollier wants to merge 60 commits into JuliaSmoothOptimizers:main from MaxenceGollier:solver_benchmarks

Conversation

@MaxenceGollier

Hi @tmigot, @dpo.

I added a function run_solver_benchmarks that allows running a benchmark with, for example, SolverBenchmark.

How does it work?

Instead of being limited to the benchmark suites from PkgBenchmark.jl, this function executes a specified script from the repository being benchmarked, on both the current branch and a specified target_branch.

This benchmark script can do whatever the user likes, but it should currently return a Dict like those returned by bmark_solvers.
The function then builds performance profiles from the two sets of results.
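For reference, here is a minimal sketch of the shape the script is expected to return, assuming SolverBenchmark's bmark_solvers conventions (the problem name and columns below are purely illustrative):

```julia
using DataFrames

# Illustrative stand-in for the value a benchmark script should return:
# a Dict mapping each solver name to a DataFrame of per-problem statistics,
# as bmark_solvers produces.
stats = Dict(
  :R2_precise   => DataFrame(name = ["bpdn-l0"], status = [:first_order], elapsed_time = [0.12]),
  :R2_imprecise => DataFrame(name = ["bpdn-l0"], status = [:first_order], elapsed_time = [0.05]),
)

# run_solver_benchmarks checks this shape before building performance profiles:
@assert stats isa Dict{Symbol, DataFrame}
```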

Example use:

In RegularizedOptimization.jl, I wrote a simple benchmark script:

using RegularizedProblems, RegularizedOptimization, SolverBenchmark

bpdn_l0, _ = setup_bpdn_l0()
bpdn_l1, _ = setup_bpdn_l1()
bpdn_B0, _ = setup_bpdn_B0()

problem_list = [
  bpdn_l0,
  bpdn_l1,
  bpdn_B0,
]

solvers = Dict(
  :R2_precise => 
    reg_nlp -> R2(
      reg_nlp,
      verbose = 0,
      atol = 1e-6,
      rtol = 1e-6,
    ),
  :R2_imprecise => 
    reg_nlp -> R2(
      reg_nlp,
      verbose = 0,
      atol = 1e-3,
      rtol = 1e-3,
    )
)

stats = bmark_solvers(solvers, problem_list)

We also added a workflow that runs on PRs:

name: JSOBenchmarks
on:
  pull_request:
    types: [labeled, opened, synchronize, reopened]
  workflow_call:

jobs:
  bmark:
    name: Julia ${{ matrix.version }} - macOS - ${{ matrix.arch }} - ${{ github.event_name }}
    if: github.event_name == 'workflow_call' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'run benchmarks'))
    # FIXME: should run on hosted runner
    runs-on: macOS-latest
    strategy:
      fail-fast: false
      matrix:
        version:
          - 1
        arch:
          - aarch64
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - uses: julia-actions/setup-julia@v2
        with:
          version: ${{ matrix.version }}
          arch: ${{ matrix.arch }}
      - uses: julia-actions/cache@v2
      - uses: julia-actions/julia-buildpkg@v1
        # FIXME: register JSOBenchmarks
      - name: Installing non-registered dependencies
        run: |
          using Pkg
          pkg1 = PackageSpec(url = "https://github.com/MaxenceGollier/JSOBenchmarks.jl.git", rev = "solver_benchmarks")
          pkg2 = PackageSpec(url = "https://github.com/MaxenceGollier/SolverBenchmark.jl.git", rev = "switch-getters")
          pkg_list = [pkg1, pkg2]
          Pkg.add(pkg_list)
        shell: julia --project=benchmark --color=yes {0}
      - name: Install benchmark dependencies
        run: julia --project=benchmark -e 'using Pkg; Pkg.instantiate()'
      - name: Sanitize project name
        id: sanitize
        run: echo "REPONAME=${{ github.event.repository.name }}" | sed -e 's/\.jl$//' >> $GITHUB_OUTPUT
      - name: Run benchmarks
        run: julia --project=benchmark -e 'using JSOBenchmarks; run_solver_benchmarks("${{ steps.sanitize.outputs.REPONAME }}", "benchmark", reference_branch = "${{ github.event.pull_request.base.ref }}")'
        env:
          GITHUB_AUTH: ${{ secrets.GIST_TOKEN }}
      - name: Build comment
        id: build-comment
        uses: actions/github-script@v6
        with:
          github-token: ${{ github.token }}
          result-encoding: string
          script: |
            const fs = require('fs');
            return fs.readFileSync("${{ github.workspace }}/bmark_${{ github.sha }}.md", "utf8").toString();
      - name: Comment in PR
        uses: thollander/actions-comment-pull-request@v2
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          message: ${{ steps.build-comment.outputs.result }}
      - name: Upload artifacts
        uses: actions/upload-artifact@v7
        with:
          name: jso-benchmarks
          path: |
            profiles_this_commit_vs_reference_*.svg
            *_vs_reference_*.jld2
            bmark_*.md
            reference.md
            judgement_*.md
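As an aside, the "Sanitize project name" step relies on sed stripping a trailing `.jl` from the repository name; the expression can be checked locally (the repository name below is just an example):

```shell
# Check the sed expression from the "Sanitize project name" step:
# it should strip a trailing ".jl" from the repository name.
REPO="RegularizedOptimization.jl"   # example repository name
echo "REPONAME=$REPO" | sed -e 's/\.jl$//'
# prints: REPONAME=RegularizedOptimization
```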

I opened a PR where I add a deliberately bad modification to R2: JuliaSmoothOptimizers/RegularizedOptimization.jl#309.
This gave the following gist:
https://gist.github.com/dpo/bfe8576c5f8e93940a57007513aebb0f

This is mostly a proof of concept and I am open to your suggestions on what we could add to the function to make it more flexible!

Thank you.

Copilot AI review requested due to automatic review settings March 9, 2026 14:41
Contributor

Copilot AI left a comment


Pull request overview

Adds support for running solver-style benchmarks from an arbitrary script in the benchmarked repository (rather than being limited to PkgBenchmark.jl suites), and integrates it into JSOBenchmarks.

Changes:

  • Introduces run_solver_benchmarks to execute a user-provided benchmark script on the current branch and a reference branch, then build performance profiles.
  • Exposes the new functionality from the main module and makes the Markdown report generation tolerate missing/empty sections.
  • Adds LibGit2 as a dependency for checking out and running the reference branch script.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 7 comments.

File Description
src/solver_benchmarks.jl New benchmark runner that executes a repo script across branches and builds SolverBenchmark performance profiles.
src/JSOBenchmarks.jl Includes and exports run_solver_benchmarks; makes write_simple_md_report conditional on available content.
Project.toml Adds LibGit2 dependency/compat for branch checkout functionality.


Comment on lines +33 to +45
# Run the benchmark script on this commit
this_commit = Base.include(Main, joinpath(bmark_dir, script))
@assert this_commit isa Dict{Symbol, DataFrame} "Expected the benchmark script to return a Dict{Symbol, DataFrame}, but got $(typeof(this_commit)). Make sure your benchmark script returns a dict resulting from the SolverBenchmark.bmark_solvers function"

# Run the benchmark script on the reference branch
local reference
if is_git
  repo_dir = joinpath(bmark_dir, "..")
  repo = LibGit2.GitRepo(repo_dir)
  println("repo_dir : $repo_dir")
  println("bmark_dir : $bmark_dir")
  reference = _withcommit(joinpath(bmark_dir, script), repo, reference_branch)
end

Copilot AI Mar 9, 2026


The reference-branch run is done by checkout!-ing and then Base.include(Main, script) in the same Julia process. If the script does using <PackageBeingBenchmarked>, Julia will not reload the module after the checkout, so the “reference” benchmarks can accidentally run against the already-loaded current-branch code. To ensure correct comparisons, run each branch/commit’s script in a fresh Julia process (or load into isolated modules/processes and explicitly restart the session between runs) and deserialize the returned results.
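To make the suggestion concrete, here is a hedged sketch (not the actual implementation; `run_script_in_fresh_process` is a hypothetical helper) of running a branch's script in a fresh Julia process and reading the result back, so modules loaded on the current branch cannot leak into the reference run:

```julia
using Serialization

# Hypothetical helper: execute `script` in a separate Julia process and
# deserialize whatever it writes to `outfile`. A fresh process guarantees
# that the reference branch recompiles its own version of the package.
function run_script_in_fresh_process(script::AbstractString, outfile::AbstractString)
  code = """
  using Serialization
  result = include(raw"$script")
  serialize(raw"$outfile", result)
  """
  run(`$(Base.julia_cmd()) --project=benchmark -e $code`)
  return deserialize(outfile)
end
```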

Author


I think this is fine; it looks like this is how they do it in PkgBenchmark.jl.

@MaxenceGollier
Author

@dpo, @tmigot, I added a doc.

The code is messy; it duplicates a lot of what run_benchmarks already does.

Maybe we should update/clean both in a separate PR? I agree that merging my messy code is not ideal, but I at least have a proof of concept in RegularizedOptimization. Both my function and run_benchmarks should be subdivided into multiple functions.
Having no tests is definitely problematic here as well.

What do you think @dpo ?
