Add Buildkite pipeline to check flagship AMIP performance #2438

petebachant · 2026-01-27T13:58:33Z

Resolves #2433

Here we add the ClimaCoupler submodule to check how each PR changes performance in the most important case, since we have determined that the simpler benchmarks in this repo are not indicative enough. I suppose we could limit this to ClimaAtmos, but if the integrated land could be impacted by ClimaCore changes, why not test the whole AMIP pipeline end-to-end?

In lieu of a submodule, we could use a Julia project with the Coupler package in it, but I'm not sure if the AMIP driver script and config files would be accessible.

This pipeline will only run for commits with [perf] in their commit message, so we don't tie up clima with extra jobs. If you need to kick it off, you can always run git commit --allow-empty -m "Trigger build [perf]" && git push.

TODO

- Ensure GitHub webhooks are set up properly
- Check performance threshold against history
- Dev install ClimaCore in the appropriate environment
- Get performance threshold right -- probably should be 0.08
- Should this be triggered manually, i.e., run only once a PR passes all other checks?
- Select the appropriate queue
- Ensure compiled files are cached between runs


petebachant · 2026-01-27T15:55:44Z

A better approach here might be to create a single AMIP (or CMIP?) repo and have dependency repos trigger runs there (maybe use a label like amip to indicate it will affect AMIP) and report back to the dependency PRs. That is, create a repo for generating the most important product(s) and focus on that as the main thing. It could also serve as the single source of truth for the "flagship" config and version-controlled result artifacts.

In this pipeline we could be selective about which PRs we send to the integrated system repo based on changed paths, e.g.:

steps:
  - label: ":mag: Check for Numerical Changes"
    key: "check-changes"
    command: |
      # Check if files under src/ or include/, or the Makefile, changed in the last commit
      if git diff --name-only HEAD~1 | grep -E '^(src/|include/|Makefile)'; then
        buildkite-agent meta-data set "run_e2e" "true"
      else
        buildkite-agent meta-data set "run_e2e" "false"
      fi

  - label: ":rocket: Trigger AMIP E2E"
    if: buildkite_agent_meta_data_get("run_e2e") == "true"
    trigger: "amip-main-pipeline"
    build:
      env:
        CORE_REPO_COMMIT: "${BUILDKITE_COMMIT}"

imreddyTeja · 2026-01-27T18:05:21Z

I don't really know how submodules work in git, but if we used a Julia project with the coupler in it, the AMIP driver script and config files should still be accessible.

Also I'm not sure if this pipeline has been set up correctly because it is not uploading. Do you have an estimate of how long the e2e test takes?

petebachant · 2026-01-27T18:26:56Z

> I don't really know how submodules work in git, but if we used a Julia project with the coupler in it, the AMIP driver script and config files should still be accessible.

Interesting. So I can just keep the relative paths even though ClimaCoupler.jl will live in the depot directory, not here?

> Also I'm not sure if this pipeline has been set up correctly because it is not uploading. Do you have an estimate of how long the e2e test takes?

It definitely seems to be stuck, but I can't tell what I did wrong. I created the pipeline on buildkite.com first, then pushed the YAML file to this branch. Let me know (when you get a chance; I know you have a presentation to give soon) if you can see any issues: https://buildkite.com/clima/climacore-end-to-end-performance/settings

ph-kev · 2026-01-27T19:42:41Z

> > I don't really know how submodules work in git, but if we used a Julia project with the coupler in it, the AMIP driver script and config files should still be accessible.
>
> Interesting. So I can just keep the relative paths even though ClimaCoupler.jl will live in the depot directory, not here?
>
> > Also I'm not sure if this pipeline has been set up correctly because it is not uploading. Do you have an estimate of how long the e2e test takes?
>
> It definitely seems to be stuck, but I can't tell what I did wrong. I created the pipeline on buildkite.com first, then pushed the YAML file to this branch. Let me know (when you get a chance; I know you have a presentation to give soon) if you can see any issues: https://buildkite.com/clima/climacore-end-to-end-performance/settings

I haven't looked too closely at the Buildkite settings, but I think you need to add the following to the YAML steps:

agents:
  queue: clima

I did this already.


petebachant · 2026-01-28T15:08:45Z

This is in a working state now, checking that we never reduce performance by more than 1%.
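For illustration, a minimal sketch of a percent-change check like the one described, not the script in this PR. The helper names `percent_change` and `sypd_ok`, the default 1% limit, and the SYPD numbers are all assumptions made up for the example; `awk` handles the floating-point arithmetic that plain shell lacks.

```shell
#!/usr/bin/env bash

# Hypothetical helper: percent change of the current SYPD relative to a baseline.
# $1 = baseline SYPD, $2 = current SYPD
percent_change() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "%.2f", 100 * (c - b) / b }'
}

# Hypothetical helper: succeeds (exit 0) unless SYPD dropped by more than
# max_drop percent. $3 is optional and defaults to 1 (percent).
sypd_ok() {
  local baseline="$1" current="$2" max_drop="${3:-1}"
  awk -v b="$baseline" -v c="$current" -v m="$max_drop" \
    'BEGIN { exit !(100 * (c - b) / b >= -m) }'
}

# Made-up numbers: a drop from 2.0 to 1.5 SYPD is a -25% change and fails.
if sypd_ok 2.0 1.5; then
  echo "SYPD within threshold"
else
  echo "SYPD regression: $(percent_change 2.0 1.5)%"
fi
```

Expressing the check as a relative change keeps the threshold meaningful if the baseline SYPD shifts over time.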

Still a few TODOs to go though, most importantly caching the compiled code, since this takes about 45 minutes to run. Will take a look at PrecompileCI in ClimaAtmos and maybe see what it takes to use a sysimage if we think this will drastically reduce run time.

I'm also not sure about the queue choice or whether to run on every push, since it could create a lot of jobs on clima.

imreddyTeja · 2026-01-28T18:11:36Z

I'm not sure how much can be gained by using a sysimage or PrecompileCI here. My understanding is that almost anything compiled for ClimaAtmos, ClimaLand, and ClimaCoupler would be invalidated, because each PR/commit would be using a different version of ClimaCore. Probably still worth looking into though.

I don't think this should be run on every push, but I also don't know what a good alternative would be...

petebachant · 2026-01-28T18:47:39Z

> I don't think this should be run on every push, but I also don't know what a good alternative would be...

Yeah, that's a tough one. I did at least set it up to skip the simulation if src, ext, or .buildkite/perf are unchanged compared to main.
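A rough sketch of that kind of path filter, assuming the list of changed files is piped in one path per line. The helper name `perf_relevant_changes` and the sample paths are hypothetical, and the actual pipeline may do this differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads changed file paths on stdin (one per line) and
# succeeds if at least one lies under src/, ext/, or .buildkite/perf/.
perf_relevant_changes() {
  grep -qE '^(src|ext|\.buildkite/perf)/'
}

# Assumed usage in CI, comparing against the merge base with main:
#   git diff --name-only origin/main...HEAD | perf_relevant_changes

# Made-up file list for demonstration.
if printf 'docs/index.md\nsrc/example.jl\n' | perf_relevant_changes; then
  echo "relevant paths changed: run the simulation"
else
  echo "no relevant changes: skip the simulation"
fi
```

Anchoring the pattern at the start of the path avoids matching directories that merely contain `src` or `ext` somewhere in their name.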

petebachant · 2026-01-29T15:21:02Z

@nefrathenrici do you think it would be smarter/possible to put this into the central queue? We'd want to ensure it runs on the same GPU type every time to make SYPD comparable. I'm worried about tying up clima with these jobs.
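For reference, SYPD is simulated years per wall-clock day. A small sketch of the arithmetic, assuming a 365-day year; the helper name `sypd` and the input numbers are made up for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: simulated years per wall-clock day (SYPD).
# $1 = simulated time in days, $2 = wall-clock time in seconds.
# Assumes a 365-day year.
sypd() {
  awk -v sim_days="$1" -v wall_s="$2" \
    'BEGIN { printf "%.2f", (sim_days / 365) / (wall_s / 86400) }'
}

# Made-up example: 1 simulated day computed in 30 s of wall time.
sypd 1 30   # prints 7.89
```

Because the wall-clock time is in the denominator, the same code on a different GPU type gives a different SYPD, which is why the hardware needs to stay fixed for comparisons.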

nefrathenrici · 2026-01-29T20:11:18Z

> @nefrathenrici do you think it would be smarter/possible to put this into the central queue? We'd want to ensure it runs on the same GPU type every time to make SYPD comparable. I'm worried about tying up clima with these jobs.

central is a good option if the jobs aren't too long; we have a reserved GPU node meant for quick CI jobs. Could we limit this test to only run a few steps?

petebachant · 2026-01-29T20:28:17Z

> central is a good option if the jobs aren't too long; we have a reserved GPU node meant for quick CI jobs. Could we limit this test to only run a few steps?

The actual timestepping part only takes ~30 seconds. Apparently it's the compilation that takes most of the time, as I'm told by @imreddyTeja and @dennisYatunin.

nefrathenrici · 2026-01-29T20:36:51Z

> > central is a good option if the jobs aren't too long; we have a reserved GPU node meant for quick CI jobs. Could we limit this test to only run a few steps?
>
> The actual timestepping part only takes ~30 seconds. Apparently it's the compilation that takes most of the time, as I'm told by @imreddyTeja and @dennisYatunin.

I think central is a reasonable choice in that case. Could we run this pipeline only on merge?

petebachant · 2026-01-29T20:37:37Z

Looks like Buildkite does have some ability to skip builds. Maybe we could flip that around and have people opt-in by putting [perf] somewhere in the title or description. Other PRs would therefore not kick off the performance check.

petebachant · 2026-01-29T20:39:13Z

> I think central is a reasonable choice in that case. Could we run this pipeline only on merge?

IMO it would be nice to know that a PR either enhances or at least doesn't hurt performance before merging.

petebachant · 2026-01-29T20:49:29Z

I just set up conditional filtering so builds only run if [perf] is in the commit message. That seems like a reasonable balance.

imreddyTeja · 2026-01-29T21:04:11Z

The pipeline looks good to me. My understanding of submodules is weak, but they seem convenient. An alternative would be making a separate Julia project that has ClimaCore dev'ed and ClimaCoupler added. Then you could use pkgdir(ClimaCoupler), which contains the experiments folder from ClimaCoupler. If @dennisYatunin is OK with the submodule, I think this should be merged (after squashing).

dennisYatunin

I think the submodule is pretty neat. Unlike a shared repo, submodules could maybe let us pin different combinations of upstream packages for different repos. That way, incompatible code introduced upstream would have a minimal effect on CI for downstream packages.

Can the current filter on "build.message" get overruled on merge, though? Even if someone doesn't know/care about the performance check, it would still be good to run it once per PR.
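One way to picture the combined rule is the sketch below: run when the commit message carries the tag, or when the build is on the default branch after a merge. This is only an illustration in shell, not the Buildkite filter itself; the helper name `should_run_perf` and the assumption that merged commits build on a branch named `main` are mine.

```shell
#!/usr/bin/env bash

# Hypothetical helper: decide whether the performance pipeline should run.
# $1 = commit message, $2 = branch name.
# Runs when the message contains the literal tag [perf], or when the build
# is on main (assumed here to mean "after a merge").
should_run_perf() {
  local message="$1" branch="$2"
  case "$message" in
    *"[perf]"*) return 0 ;;   # quoted, so the brackets match literally
  esac
  [ "$branch" = "main" ]
}

# Made-up examples.
should_run_perf "Trigger build [perf]" "pb/perf-ci" && echo "opt-in: run"
should_run_perf "Fix typo" "main" && echo "merged to main: run"
should_run_perf "Fix typo" "pb/perf-ci" || echo "no tag, not main: skip"
```

With a rule like this, contributors who never use the tag would still get one performance run per PR at merge time.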

petebachant and others added 6 commits January 26, 2026 14:38

Start pipeline for downstream perf checks

13848f1

Add ClimaCoupler submodule for downstream perf checks

7b8beae

Merge branches 'pb/perf-ci' and 'main' of https://github.com/CLiMA/ClimaCore.jl into pb/perf-ci

3f2b66b

Parameterize coupler path

a292200

Dev climacore

be0b2e5

Add julia load and depot paths

8c67d87

petebachant requested review from dennisYatunin and imreddyTeja January 27, 2026 14:29

petebachant and others added 15 commits January 27, 2026 13:59

Remove julia load path

e938985

Make pipeline conditional on git diff

e3c4fe1

Switch to julia project instead of submodule

ad2d968

Revert "Switch to julia project instead of submodule"

927a540

This reverts commit ad2d968.

Be sure to init submodule

56875a2

Fix conditionals

b65f181

Use env file

b1f9901

Use a soft fail for downstream steps

1c1f8aa

Add MPI package

e1b25e4

Check SYPD with shell command

b69b659

Combine steps

84094aa

Escape $

c3b41ce

Merge branch 'main' of https://github.com/CLiMA/ClimaCore.jl into pb/perf-ci

9d814eb

Reformulate in terms of percent change

a51c641

Move sypd check into script

5c350fc

Remove erroneous space

51acd19

Add dep and remove wait/group

388bf61

Merge branch 'main' into pb/perf-ci

1303c22

petebachant changed the title ~~Add Buildkite pipeline to check flagship AMIP performance~~ Add Buildkite pipeline to check flagship AMIP performance [perf] Jan 29, 2026

petebachant changed the title ~~Add Buildkite pipeline to check flagship AMIP performance [perf]~~ Add Buildkite pipeline to check flagship AMIP performance Jan 29, 2026

Trigger build [perf]

5af715e

dennisYatunin approved these changes Jan 29, 2026
