Conversation

@petebachant
Member

@petebachant petebachant commented Jan 27, 2026

Resolves #2433

Here we add the ClimaCoupler submodule to check how each PR changes performance in the most important case, since we have determined that the simpler benchmarks in this repo are not indicative enough. I suppose we could limit this to ClimaAtmos, but if the integrated land could be impacted by ClimaCore changes, why not test the whole AMIP pipeline end-to-end?

In lieu of a submodule, we could use a Julia project with the Coupler package in it, but I'm not sure if the AMIP driver script and config files would be accessible.

This pipeline will only run for commits with [perf] in their commit message, so we don't tie up clima with extra jobs. If you need to kick it off, you can always run git commit --allow-empty -m "Trigger build [perf]" && git push.
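The opt-in rule above can be sketched as a small shell check. This is only an illustration: `has_perf_tag` is a hypothetical helper name, and the actual pipeline may well do the filtering in Buildkite itself rather than in a script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): succeed if a commit message
# contains the literal "[perf]" tag anywhere in its text.
has_perf_tag() {
  case "$1" in
    *"[perf]"*) return 0 ;;  # quoted brackets are matched literally
    *) return 1 ;;
  esac
}

# Assumed usage: inspect the most recent commit of the current repository.
# has_perf_tag "$(git log -1 --pretty=%B)" && echo "run performance pipeline"
```

The tag has to appear with its brackets; a message that merely mentions "perf" does not match.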

TODO

  • Ensure GitHub webhooks are set properly
  • Check performance threshold against history
  • Dev install ClimaCore in the appropriate environment
  • Get performance threshold right -- probably should be 0.08
  • Should this be triggered manually, i.e., once a PR passes all other checks, we can run this performance check?
  • Select the appropriate queue
  • Ensure compiled files are cached between runs
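For the threshold items in the list above, a minimal sketch of the comparison might look like the following. It assumes the metric is SYPD (higher is better) and that the threshold is a maximum fractional drop relative to a baseline; `sypd_within_threshold` and its argument order are hypothetical, and how the baseline is obtained from history is left open.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR).
# Usage: sypd_within_threshold BASELINE CURRENT MAX_FRACTIONAL_DROP
# Succeeds when CURRENT >= BASELINE * (1 - MAX_FRACTIONAL_DROP).
sypd_within_threshold() {
  awk -v base="$1" -v cur="$2" -v tol="$3" \
    'BEGIN { exit !(cur >= base * (1 - tol)) }'
}

# Assumed usage with a threshold of 0.08, read as an 8% allowed drop:
# sypd_within_threshold "$BASELINE_SYPD" "$NEW_SYPD" 0.08 || exit 1
```

Using `awk` keeps the floating-point comparison out of the shell, which only does integer arithmetic.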

@petebachant
Member Author

Maybe a better approach here might be to create a single AMIP (or CMIP?) repo and have dependency repos trigger runs there (maybe use a label like amip to indicate it will affect AMIP) and report back to the dependency PRs. That is, create a repo for generating the most important product(s) and focus on that as the main thing. It could also serve as the single source of truth for the "flagship" config and version controlled results artifacts.

In this pipeline we could be selective about which PRs we send to the integrated system repo based on changed paths, e.g.:

steps:
  - label: ":mag: Check for Numerical Changes"
    key: "check-changes"
    command: |
      # Trigger the AMIP E2E pipeline only if files under src/, include/,
      # or the Makefile changed in the last commit.
      # Step-level `if` conditionals are evaluated at upload time and cannot
      # read meta-data set by an earlier step, so upload the trigger step
      # dynamically instead.
      if git diff --name-only HEAD~1 | grep -qE '^(src/|include/|Makefile)'; then
        cat <<'YAML' | buildkite-agent pipeline upload
      steps:
        - label: ":rocket: Trigger AMIP E2E"
          trigger: "amip-main-pipeline"
          build:
            env:
              CORE_REPO_COMMIT: "${BUILDKITE_COMMIT}"
      YAML
      fi

@imreddyTeja
Member

I don't really know how submodules work in git, but if we used a Julia project with the coupler in it, the AMIP driver script and config files should still be accessible.

Also I'm not sure if this pipeline has been set up correctly because it is not uploading. Do you have an estimate on how long the e2e test takes?

@petebachant
Member Author

I don't really know how submodules work in git, but if we used a Julia project with the coupler in it, the AMIP driver script and config files should still be accessible.

Interesting. So I can just keep the relative paths even though ClimaCoupler.jl will live in the depot directory, not here?

Also I'm not sure if this pipeline has been set up correctly because it is not uploading. Do you have an estimate on how long the e2e test takes?

It definitely seems to be stuck, but I can't tell what I did wrong. I created the pipeline on buildkite.com first, then pushed the YAML file to this branch. Let me know (when you get a chance; I know you have a presentation to give soon) if you can see any issues: https://buildkite.com/clima/climacore-end-to-end-performance/settings

@ph-kev
Member

ph-kev commented Jan 27, 2026

I don't really know how submodules work in git, but if we used a Julia project with the coupler in it, the AMIP driver script and config files should still be accessible.

Interesting. So I can just keep the relative paths even though ClimaCoupler.jl will live in the depot directory, not here?

Also I'm not sure if this pipeline has been set up correctly because it is not uploading. Do you have an estimate on how long the e2e test takes?

It definitely seems to be stuck, but I can't tell what I did wrong. I created the pipeline on buildkite.com first, then pushed the YAML file to this branch. Let me know (when you get a chance; I know you have a presentation to give soon) if you can see any issues: https://buildkite.com/clima/climacore-end-to-end-performance/settings

I haven't looked too closely at the Buildkite settings, but I think you need to add the following to the YAML steps:

agents:
  queue: clima

I did this already.

@petebachant
Member Author

This is in a working state now, checking that we never reduce performance by more than 1%:

[image]

Still a few TODOs to go though, most importantly caching the compiled code, since this takes about 45 minutes to run. Will take a look at PrecompileCI in ClimaAtmos and maybe see what it takes to use a sysimage if we think this will drastically reduce run time.

I'm also not sure about the queue choice or whether to run on every push, since it could create a lot of jobs on clima.

@imreddyTeja
Member

I'm not sure how much can be gained by using a sysimage or PrecompileCI here. My understanding is that almost anything compiled for ClimaAtmos, ClimaLand, and ClimaCoupler would be invalidated, because each PR/commit would use a different version of ClimaCore. Probably still worth looking into, though.

I don't think this should be run on every push, but I also don't know what a good alternative would be...

@petebachant
Member Author

I don't think this should be run on every push, but I also don't know what a good alternative would be...

Yeah, that's a tough one. I did at least set it up to skip the simulation if src, ext, or .buildkite/perf are unchanged compared to main.
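A minimal sketch of that kind of path filter, assuming the changed files are listed with `git diff --name-only` against main. The helper name `touches_perf_paths` and the `origin/main` ref are assumptions for illustration, not necessarily what the pipeline in this PR uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): read changed file paths on
# stdin, one per line, and succeed if any of them lies under src/, ext/,
# or .buildkite/perf/.
touches_perf_paths() {
  grep -qE '^(src|ext|\.buildkite/perf)/'
}

# Assumed usage in CI; the "origin/main" ref name is a guess.
# if git diff --name-only origin/main...HEAD | touches_perf_paths; then
#   echo "run simulation"
# else
#   echo "skip simulation"
# fi
```

The pattern is anchored at the start of each path, so a file such as docs/src/index.md does not count as a change under src/.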

@petebachant
Member Author

@nefrathenrici do you think it would be smarter/possible to put this into the central queue? We'd want to ensure it runs on the same GPU type every time to make SYPD comparable. I'm worried about tying up clima with these jobs.

@nefrathenrici
Member

@nefrathenrici do you think it would be smarter/possible to put this into the central queue? We'd want to ensure it runs on the same GPU type every time to make SYPD comparable. I'm worried about tying up clima with these jobs.

central is a good option if the jobs aren't too long; we have a reserved GPU node meant for quick CI jobs. Could we limit this test to only run a few steps?

@petebachant
Member Author

central is a good option if the jobs aren't too long; we have a reserved GPU node meant for quick CI jobs. Could we limit this test to only run a few steps?

The actual timestepping part only takes ~30 seconds. Apparently it's the compilation that takes most of the time, as I'm told by @imreddyTeja and @dennisYatunin.

@nefrathenrici
Member

central is a good option if the jobs aren't too long; we have a reserved GPU node meant for quick CI jobs. Could we limit this test to only run a few steps?

The actual timestepping part only takes ~30 seconds. Apparently it's the compilation that takes most of the time, as I'm told by @imreddyTeja and @dennisYatunin.

I think central is a reasonable choice in that case. Could we run this pipeline only on merge?

@petebachant
Member Author

Looks like Buildkite does have some ability to skip builds. Maybe we could flip that around and have people opt-in by putting [perf] somewhere in the title or description. Other PRs would therefore not kick off the performance check.

@petebachant
Member Author

I think central is a reasonable choice in that case. Could we run this pipeline only on merge?

IMO it would be nice to know that a PR either enhances or at least doesn't hurt performance before merging.

@petebachant petebachant changed the title from "Add Buildkite pipeline to check flagship AMIP performance" to "Add Buildkite pipeline to check flagship AMIP performance [perf]" Jan 29, 2026
@petebachant petebachant changed the title from "Add Buildkite pipeline to check flagship AMIP performance [perf]" to "Add Buildkite pipeline to check flagship AMIP performance" Jan 29, 2026
@petebachant
Member Author

I just set up conditional filtering so builds only run if [perf] is in the commit message. That seems like a reasonable balance.

@imreddyTeja
Member

The pipeline looks good to me. My understanding of submodules is weak, but it seems convenient. An alternative would be making a different Julia project that has ClimaCore dev'ed and ClimaCoupler added. Then you could use pkgdir(ClimaCoupler), which contains the experiments folder from ClimaCoupler. If @dennisYatunin is OK with the submodule, I think this should be merged (after squashing).

Member

@dennisYatunin dennisYatunin left a comment

I think the submodule is pretty neat. Unlike a shared repo, submodules could maybe let us pin different combinations of upstream packages for different repos. That way, incompatible code introduced upstream would have a minimal effect on CI for downstream packages.

Can the current filter on "build.message" get overruled on merge, though? Even if someone doesn't know/care about the performance check, it would still be good to run it once per PR.
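One way to express "opt in on PR commits, but always run once after merge" is sketched below. It assumes merged commits are built on a branch named main, which may not match this repo's setup (for example if a merge queue uses other branch names); `should_run_perf` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical decision rule (not part of this PR): run the performance
# check for every build on the default branch (assumed to be "main") and
# for any commit whose message contains "[perf]".
should_run_perf() {
  local branch="$1" message="$2"
  if [ "$branch" = "main" ]; then
    return 0
  fi
  case "$message" in
    *"[perf]"*) return 0 ;;
  esac
  return 1
}

# Assumed usage with the environment variables Buildkite sets for each build:
# should_run_perf "$BUILDKITE_BRANCH" "$BUILDKITE_MESSAGE" && echo "run perf check"
```

The same rule could presumably be written directly as a Buildkite conditional combining `build.branch` and `build.message`; the script form is used here only to keep the example testable.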

Development

Successfully merging this pull request may close these issues.

Automatically check AMIP performance for each ClimaCore PR before merging

6 participants