Signed-off-by: krishna-kg732 <krishnagupta.kg2k6@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. No approvers are listed yet; it needs approval from an approver in each of the affected files.
🎉 Welcome to the Kubeflow Trainer! 🎉 Thanks for opening your first PR! We're happy to have you as part of our community 🚀
Feel free to ask questions in the comments if you need any help or clarification!
Pull request overview
This PR introduces comprehensive release automation infrastructure for the Kubeflow Trainer project and updates version information for what appears to be a test release. The PR title indicates v2.2.0, but the actual changes reference v2.2.2, and test data from a v99.0.0 release is also present.
Changes:
- Adds release automation scripts and workflows (release.sh, release.yaml, check-release.yaml)
- Adds changelog generation configuration (cliff.toml)
- Updates version to v2.2.2 across all manifests, charts, and API files
- Includes documentation for testing the release process on forks
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| hack/release.sh | New bash script to automate release preparation including version updates and changelog generation |
| .github/workflows/release.yaml | Main release workflow that builds, tags, and publishes releases with PyPI OIDC publishing |
| .github/workflows/check-release.yaml | PR validation workflow to ensure version consistency before merging |
| .github/workflows/build-and-push-images.yaml | Updated to support workflow_dispatch and publish on manual triggers |
| .github/workflows/publish-helm-charts.yaml | Updated to support workflow_dispatch for manual chart publishing |
| .github/workflows/template-publish-image/action.yaml | Added tagging support for workflow_dispatch events |
| .github/workflows/check-pr-title.yaml | Added area/release label to skip PR title checks for release PRs |
| cliff.toml | Configuration for git-cliff changelog generation tool |
| Makefile | Added release target to invoke release.sh |
| docs/release/RELEASE_TESTING.md | Comprehensive guide for testing the release process on forks |
| VERSION | Updated to v2.2.2 |
| manifests/overlays/runtimes/kustomization.yaml | Updated image tags to v2.2.2 |
| manifests/overlays/manager/kustomization.yaml | Updated image tag and configmap version to v2.2.2 |
| manifests/overlays/data-cache/kustomization.yaml | Updated image tag to v2.2.2 |
| manifests/base/runtimes/data-cache/torch_distributed_with_cache.yaml | Updated data-cache image to v99.0.0 (appears to be test data) |
| charts/kubeflow-trainer/Chart.yaml | Updated chart version to 2.2.2 |
| charts/kubeflow-trainer/README.md | Updated version badge to 2.2.2 |
| `api/python_api/kubeflow_trainer_api/__init__.py` | Updated Python API version to 2.2.2 |
| CHANGELOG.md | Added changelog entries for v2.2.2 and v99.0.0 (test data) |
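
The table above mixes two spellings of the same release: `VERSION` and the image tags carry a `v` prefix (v2.2.2), while the Helm chart and the Python API use the bare number (2.2.2). A version-consistency check therefore has to normalize before comparing. The following is a minimal sketch of that idea in Python, not the logic of `check-release.yaml` (which is not shown here); the function names and the hard-coded `versions` dictionary are hypothetical stand-ins for values read from the files.

```python
import re

# Accepts "v2.2.2" or "2.2.2"; pre-release suffixes such as "-rc.1" are
# deliberately rejected in this sketch.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def normalize(version: str) -> str:
    """Return the version without a leading 'v'; raise if it is not X.Y.Z."""
    match = SEMVER.match(version.strip())
    if match is None:
        raise ValueError(f"not a release version: {version!r}")
    return ".".join(match.groups())


def find_mismatches(expected: str, found: dict[str, str]) -> dict[str, str]:
    """Map each source whose version differs from `expected` to the value it holds."""
    want = normalize(expected)
    return {src: ver for src, ver in found.items() if normalize(ver) != want}


# Hypothetical values, as if they had been read from the files in the table.
versions = {
    "VERSION": "v2.2.2",
    "charts/kubeflow-trainer/Chart.yaml": "2.2.2",
    "manifests/base/runtimes/data-cache/torch_distributed_with_cache.yaml": "v99.0.0",
}

print(find_mismatches("v2.2.2", versions))
```

With these inputs only the data-cache runtime manifest is reported, which matches the v99.0.0 leftover flagged in the review.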
| env: | ||
| - name: CACHE_IMAGE | ||
| value: "ghcr.io/kubeflow/trainer/data-cache:latest" | ||
| value: "ghcr.io/kubeflow/trainer/data-cache:v99.0.0" |
This image tag version v99.0.0 appears to be test data that should not be included in a production release. The version should match the release version v2.2.2 (or v2.2.0 based on the PR title).
Suggested change:
| - value: "ghcr.io/kubeflow/trainer/data-cache:v99.0.0" |
| + value: "ghcr.io/kubeflow/trainer/data-cache:v2.2.2" |
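
A leftover tag like this can be caught mechanically before review. The sketch below scans manifest text for image references under the `ghcr.io/kubeflow/trainer/` prefix seen in the diff and reports any tag that differs from the expected release tag. It is an illustration only: the helper name `unexpected_tags` and the inline `manifest` string are assumptions, not part of the PR.

```python
import re

# Matches image references under the registry prefix seen in the diff,
# capturing the image name and its tag.
IMAGE_RE = re.compile(r"ghcr\.io/kubeflow/trainer/([\w.-]+):([\w.-]+)")


def unexpected_tags(text: str, expected_tag: str) -> list[tuple[str, str]]:
    """Return (image, tag) pairs whose tag differs from `expected_tag`."""
    return [
        (name, tag)
        for name, tag in IMAGE_RE.findall(text)
        if tag != expected_tag
    ]


# Hypothetical manifest fragment modelled on the reviewed lines.
manifest = '''
env:
  - name: CACHE_IMAGE
    value: "ghcr.io/kubeflow/trainer/data-cache:v99.0.0"
'''

print(unexpected_tags(manifest, "v2.2.2"))
```

Running the same function with `"v99.0.0"` as the expected tag returns an empty list, so the check passes only when every image reference agrees with the release being cut.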
| ## [v99.0.0](https://github.com/kubeflow/trainer/releases/tag/v99.0.0) (2026-02-25) | ||
| This is Kubeflow Trainer v99.0.0 release. | ||
| ```bash | ||
| kubectl apply --server-side -k "https://github.com/kubeflow/trainer.git/manifests/overlays/manager?ref=v99.0.0" | ||
| kubectl apply --server-side -k "https://github.com/kubeflow/trainer.git/manifests/overlays/runtimes?ref=v99.0.0" | ||
| ``` | ||
| You can now install controller manager with Helm charts 🚀 | ||
| ```bash | ||
| helm install kubeflow-trainer oci://ghcr.io/kubeflow/charts/kubeflow-trainer --version 99.0.0 | ||
| ``` | ||
| For more information, please see [the Kubeflow Trainer docs](https://www.kubeflow.org/docs/components/trainer/overview/) | ||
| ### 🚀 Features | ||
| - feat(ci): Add release automation with OIDC PyPI publishing (@krishna-kg732) | ||
| - feat: Add the manager field to the podTemplateOverride object ([#3020](https://github.com/kubeflow/trainer/pull/3020) by @kaisoz) | ||
| - feat(runtimes): add support for ClusterTrainingRuntimes in Helm chart ([#3124](https://github.com/kubeflow/trainer/pull/3124) by @khushiiagrawal) | ||
| - feat(docs): KEP-2598 XGBoost Runtime for Trainer V2 ([#3118](https://github.com/kubeflow/trainer/pull/3118) by @Krishna-kg732) | ||
| - feat: add production-ready MNIST example for PyTorch ([#3063](https://github.com/kubeflow/trainer/pull/3063) by @Snehadas2005) | ||
| - feat(examples): add torch.compile to PyTorch local examples ([#3076](https://github.com/kubeflow/trainer/pull/3076) by @Ishtiyaque-Alam) | ||
| - feat(api): Fix immutability of the TrainJob APIs ([#3157](https://github.com/kubeflow/trainer/pull/3157) by @andreyvelich) | ||
| - feat(runtimes): Use JobSet VolumeClaimPolicies APIs for LLM Runtimes ([#3150](https://github.com/kubeflow/trainer/pull/3150) by @andreyvelich) | ||
| - feat: replaced vm runner with test gpu arc from cncf ([#3067](https://github.com/kubeflow/trainer/pull/3067) by @jaiakash) | ||
| - feat(runtimes): Add JAX training runtime ([#3151](https://github.com/kubeflow/trainer/pull/3151) by @kaisoz) | ||
| - feat(docs): Add AGENTS.md and Copilot instructions ([#3121](https://github.com/kubeflow/trainer/pull/3121) by @andreyvelich) | ||
| - feat: add scaffolding for feature gates ([#3102](https://github.com/kubeflow/trainer/pull/3102) by @robert-bell) | ||
| - feat(cache): add Helm chart configuration for data_cache ([#3080](https://github.com/kubeflow/trainer/pull/3080) by @khushiiagrawal) | ||
| - feat(docs): KEP-2779: Track TrainJob progress and expose training metrics ([#2905](https://github.com/kubeflow/trainer/pull/2905) by @robert-bell) | ||
| - feat(api): Add securityContext support to PodTemplateSpecOverride in TrainJob ([#3066](https://github.com/kubeflow/trainer/pull/3066) by @Sanskarzz) | ||
| - feat: add VERSION file ([#3077](https://github.com/kubeflow/trainer/pull/3077) by @milinddethe15) | ||
| - feat: KEP 2841 Flux Policy to support Flux Framework ([#2909](https://github.com/kubeflow/trainer/pull/2909) by @vsoch) | ||
| - feat: only update k8s dependencies via patches ([#2969](https://github.com/kubeflow/trainer/pull/2969) by @kannon92) | ||
| - feat: add dependabot to trainer repo ([#2930](https://github.com/kubeflow/trainer/pull/2930) by @kannon92) | ||
| - feat(docs): Add changelog for Kubeflow Trainer v2.1.0 ([#2921](https://github.com/kubeflow/trainer/pull/2921) by @andreyvelich) | ||
| - feat(cache): KEP-2655: Adding default runtime with cache and example ([#2923](https://github.com/kubeflow/trainer/pull/2923) by @akshaychitneni) | ||
| - feat: Adding local execution example notebook ([#2907](https://github.com/kubeflow/trainer/pull/2907) by @Fiona-Waters) | ||
| - feat(cache): KEP-2655 - Supporting readiness probes on cache nodes ([#2904](https://github.com/kubeflow/trainer/pull/2904) by @akshaychitneni) | ||
| - feat(docs): Add changelog for Kubeflow Trainer v2.1.0-rc.1 ([#2918](https://github.com/kubeflow/trainer/pull/2918) by @andreyvelich) | ||
| - feat(manifests): Publish Trainer Helm Charts ([#2906](https://github.com/kubeflow/trainer/pull/2906) by @adity1raut) | ||
| - feat(initializer): add s3 model and dataset initializers ([#2728](https://github.com/kubeflow/trainer/pull/2728) by @rudeigerc) | ||
| - feat(docs): Add changelog for Kubeflow Trainer v2.1.0-rc.0 ([#2902](https://github.com/kubeflow/trainer/pull/2902) by @andreyvelich) | ||
| ### 🐛 Bug Fixes | ||
| - fix: make release.sh executable (@krishna-kg732) | ||
| - fix(ci): correct duplicate step name in `test-go.yaml` ([#3202](https://github.com/kubeflow/trainer/pull/3202) by @puwun) | ||
| - fix: align torchao with torch 2.9.1 to fix GPU e2e failure ([#3203](https://github.com/kubeflow/trainer/pull/3203) by @Goku2099) | ||
| - fix: Defer kubernetes imports to method level for use with local mode ([#3167](https://github.com/kubeflow/trainer/pull/3167) by @Fiona-Waters) | ||
| - fix: service account test filename ([#3153](https://github.com/kubeflow/trainer/pull/3153) by @aniketpati1121) | ||
| - fix(manifests): Remove jobset and lws patches from kustomize deployment ([#3141](https://github.com/kubeflow/trainer/pull/3141) by @yosri-brh) | ||
| - fix: enable read-only root filesystem for trainer manager ([#3119](https://github.com/kubeflow/trainer/pull/3119) by @Goku2099) | ||
| - fix: fix resourcePerNode override not applied with Volcano scheduler ([#2982](https://github.com/kubeflow/trainer/pull/2982) by @sksingh2005) | ||
| - fix(operator): Prevent JobSet recreation when its TTL has expired ([#3013](https://github.com/kubeflow/trainer/pull/3013) by @astefanutti) | ||
| - fix: add `appVersion` field to Helm chart for Kubeflow Trainer ([#3044](https://github.com/kubeflow/trainer/pull/3044) by @milinddethe15) | ||
| - fix(manifests): fix Prometheus metrics port mismatch ([#3056](https://github.com/kubeflow/trainer/pull/3056) by @ChughShilpa) | ||
| - fix(manifests): Fix RBAC for ClusterTrainingRuntime Access ([#3022](https://github.com/kubeflow/trainer/pull/3022) by @andreyvelich) | ||
| - fix(ci): Fix kube-api-linter install ([#3023](https://github.com/kubeflow/trainer/pull/3023) by @astefanutti) | ||
| - fix(ci): Fix new contributors GH actions workflow lint errors ([#3024](https://github.com/kubeflow/trainer/pull/3024) by @astefanutti) | ||
| - fix(operator): Use Patch to update TrainJob status ([#3009](https://github.com/kubeflow/trainer/pull/3009) by @astefanutti) | ||
| - fix(examples): Fix SSL certificate error for local MNIST example ([#2971](https://github.com/kubeflow/trainer/pull/2971) by @astefanutti) | ||
| - fix(ci): Fix the Kubeflow SDK installation with Docker ([#2926](https://github.com/kubeflow/trainer/pull/2926) by @andreyvelich) | ||
| - fix(manifests): Remove the default tag from the controller image ([#2916](https://github.com/kubeflow/trainer/pull/2916) by @andreyvelich) | ||
| - fix(manifests): Fix Helm charts image name ([#2915](https://github.com/kubeflow/trainer/pull/2915) by @andreyvelich) | ||
| - fix(manifests): Fix boolean values defaulting in Helm charts ([#2913](https://github.com/kubeflow/trainer/pull/2913) by @astefanutti) | ||
| - fix(runtimes): Update pip version in the MLX runtime ([#2908](https://github.com/kubeflow/trainer/pull/2908) by @andreyvelich) | ||
| ### ⚙️ Miscellaneous Tasks | ||
| - chore(deps): bump futures from 0.3.31 to 0.3.32 in /pkg/data_cache ([#3214](https://github.com/kubeflow/trainer/pull/3214) by @dependabot[bot]) | ||
| - chore: migrate to a10.2 gpu for gpu e2e ([#3220](https://github.com/kubeflow/trainer/pull/3220) by @jaiakash) | ||
| - chore(deps): bump deepspeed from 0.18.5 to 0.18.6 in /cmd/runtimes/deepspeed ([#3212](https://github.com/kubeflow/trainer/pull/3212) by @dependabot[bot]) | ||
| - chore(deps): bump transformers from 4.57.6 to 5.2.0 in /cmd/runtimes/deepspeed ([#3210](https://github.com/kubeflow/trainer/pull/3210) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.57 to 4.5.59 in /pkg/data_cache/test ([#3206](https://github.com/kubeflow/trainer/pull/3206) by @dependabot[bot]) | ||
| - chore(deps): update huggingface-hub requirement from <1.4,>=0.27.0 to >=0.27.0,<1.5 in /cmd/initializers/dataset ([#3194](https://github.com/kubeflow/trainer/pull/3194) by @dependabot[bot]) | ||
| - chore(deps): bump the kubernetes group with 7 updates ([#3204](https://github.com/kubeflow/trainer/pull/3204) by @dependabot[bot]) | ||
| - chore(deps): bump mlx[cuda] from 0.30.5 to 0.30.6 in /cmd/runtimes/mlx ([#3196](https://github.com/kubeflow/trainer/pull/3196) by @dependabot[bot]) | ||
| - chore(deps): update huggingface-hub requirement from <1.4,>=0.27.0 to >=0.27.0,<1.5 in /cmd/initializers/model ([#3198](https://github.com/kubeflow/trainer/pull/3198) by @dependabot[bot]) | ||
| - chore(deps): bump mlx-lm from 0.30.5 to 0.30.6 in /cmd/runtimes/mlx ([#3195](https://github.com/kubeflow/trainer/pull/3195) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.56 to 4.5.57 in /pkg/data_cache/test ([#3193](https://github.com/kubeflow/trainer/pull/3193) by @dependabot[bot]) | ||
| - chore(deps): bump sigs.k8s.io/structured-merge-diff/v6 from 6.3.2-0.20260122202528-d9cc6641c482 to 6.3.2 in the kubernetes group ([#3190](https://github.com/kubeflow/trainer/pull/3190) by @dependabot[bot]) | ||
| - chore(deps): bump arrow-flight from 57.2.0 to 57.3.0 in /pkg/data_cache/test ([#3192](https://github.com/kubeflow/trainer/pull/3192) by @dependabot[bot]) | ||
| - chore(deps): bump tonic from 0.14.2 to 0.14.3 in /pkg/data_cache/test ([#3163](https://github.com/kubeflow/trainer/pull/3163) by @dependabot[bot]) | ||
| - chore(deps): bump golang.org/x/crypto from 0.47.0 to 0.48.0 in the golang group ([#3191](https://github.com/kubeflow/trainer/pull/3191) by @dependabot[bot]) | ||
| - chore: Add comprehensive unit tests for Config API ([#2893](https://github.com/kubeflow/trainer/pull/2893) by @kapil27) | ||
| - chore(docs): added kubecon 2025 trainer talk ([#3187](https://github.com/kubeflow/trainer/pull/3187) by @jaiakash) | ||
| - chore(deps): bump mlx[cuda] from 0.30.3 to 0.30.5 in /cmd/runtimes/mlx ([#3162](https://github.com/kubeflow/trainer/pull/3162) by @dependabot[bot]) | ||
| - chore(docs): Create symlink for CLAUDE.md ([#3182](https://github.com/kubeflow/trainer/pull/3182) by @andreyvelich) | ||
| - chore(deps): bump time from 0.3.44 to 0.3.47 in /pkg/data_cache ([#3180](https://github.com/kubeflow/trainer/pull/3180) by @dependabot[bot]) | ||
| - chore: changed latest to dev in trainer manifests ([#3146](https://github.com/kubeflow/trainer/pull/3146) by @sameerdattav) | ||
| - chore(deps): bump deepspeed from 0.18.4 to 0.18.5 in /cmd/runtimes/deepspeed ([#3161](https://github.com/kubeflow/trainer/pull/3161) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/gomega from 1.39.0 to 1.39.1 ([#3159](https://github.com/kubeflow/trainer/pull/3159) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.54 to 4.5.56 in /pkg/data_cache/test ([#3160](https://github.com/kubeflow/trainer/pull/3160) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/ginkgo/v2 from 2.27.5 to 2.28.1 ([#3158](https://github.com/kubeflow/trainer/pull/3158) by @dependabot[bot]) | ||
| - chore(deps): bump bytes from 1.11.0 to 1.11.1 in /pkg/data_cache ([#3170](https://github.com/kubeflow/trainer/pull/3170) by @dependabot[bot]) | ||
| - chore(deps): bump bytes from 1.11.0 to 1.11.1 in /pkg/data_cache/test ([#3169](https://github.com/kubeflow/trainer/pull/3169) by @dependabot[bot]) | ||
| - chore: Nominate @akshaychitneni as Kubeflow Trainer reviewer ([#3149](https://github.com/kubeflow/trainer/pull/3149) by @andreyvelich) | ||
| - chore(docs): Update Trainer README with Data Cache and MPI use-cases ([#3142](https://github.com/kubeflow/trainer/pull/3142) by @andreyvelich) | ||
| - chore(deps): bump nvidia/cuda from 13.1.0-devel-ubuntu22.04 to 13.1.1-devel-ubuntu22.04 in /cmd/runtimes/deepspeed ([#3131](https://github.com/kubeflow/trainer/pull/3131) by @dependabot[bot]) | ||
| - chore(deps): bump nvidia/cuda from 13.1.0-devel-ubuntu22.04 to 13.1.1-devel-ubuntu22.04 in /cmd/runtimes/mlx ([#3129](https://github.com/kubeflow/trainer/pull/3129) by @dependabot[bot]) | ||
| - chore(deps): Bump JobSet v0.11.0 and LWS v0.8.0 ([#3144](https://github.com/kubeflow/trainer/pull/3144) by @andreyvelich) | ||
| - chore(deps): bump tower from 0.5.2 to 0.5.3 in /pkg/data_cache ([#3137](https://github.com/kubeflow/trainer/pull/3137) by @dependabot[bot]) | ||
| - chore(deps): bump rust from 1.92-bullseye to 1.93-bullseye in /cmd/data_cache ([#3132](https://github.com/kubeflow/trainer/pull/3132) by @dependabot[bot]) | ||
| - chore(deps): Bump Go 1.25, k8s v1.35, and controller-runtime v0.23.1 ([#3127](https://github.com/kubeflow/trainer/pull/3127) by @andreyvelich) | ||
| - chore(deps): bump mlx-lm from 0.30.4 to 0.30.5 in /cmd/runtimes/mlx ([#3134](https://github.com/kubeflow/trainer/pull/3134) by @dependabot[bot]) | ||
| - chore(deps): bump tokio from 1.48.0 to 1.49.0 in /pkg/data_cache ([#3138](https://github.com/kubeflow/trainer/pull/3138) by @dependabot[bot]) | ||
| - chore: Expose trainer API version via public ConfigMap ([#3083](https://github.com/kubeflow/trainer/pull/3083) by @sameerdattav) | ||
| - chore: use named ports for manager deployment and service ([#3100](https://github.com/kubeflow/trainer/pull/3100) by @Goku2099) | ||
| - chore(docs): Add Trainer v2.1 release news to the README ([#3117](https://github.com/kubeflow/trainer/pull/3117) by @andreyvelich) | ||
| - chore(deps): bump datasets from 4.4.2 to 4.5.0 in /cmd/runtimes/deepspeed ([#3105](https://github.com/kubeflow/trainer/pull/3105) by @dependabot[bot]) | ||
| - chore(deps): bump datasets from 4.4.2 to 4.5.0 in /cmd/runtimes/mlx ([#3108](https://github.com/kubeflow/trainer/pull/3108) by @dependabot[bot]) | ||
| - chore(deps): bump mlx[cuda] from 0.30.1 to 0.30.3 in /cmd/runtimes/mlx ([#3107](https://github.com/kubeflow/trainer/pull/3107) by @dependabot[bot]) | ||
| - chore(deps): bump transformers from 4.57.3 to 4.57.6 in /cmd/runtimes/deepspeed ([#3106](https://github.com/kubeflow/trainer/pull/3106) by @dependabot[bot]) | ||
| - chore(deps): bump mlx-lm from 0.30.2 to 0.30.4 in /cmd/runtimes/mlx ([#3109](https://github.com/kubeflow/trainer/pull/3109) by @dependabot[bot]) | ||
| - chore: fix `make helm-lint` ([#3103](https://github.com/kubeflow/trainer/pull/3103) by @robert-bell) | ||
| - chore(runtimes): Bump Torch to 2.9.1 version ([#3093](https://github.com/kubeflow/trainer/pull/3093) by @andreyvelich) | ||
| - chore(deps): bump axum from 0.7.9 to 0.8.8 in /pkg/data_cache ([#3072](https://github.com/kubeflow/trainer/pull/3072) by @dependabot[bot]) | ||
| - chore(deps): bump tonic from 0.12.3 to 0.14.2 in /pkg/data_cache/test ([#3054](https://github.com/kubeflow/trainer/pull/3054) by @dependabot[bot]) | ||
| - chore(deps): bump tower from 0.4.13 to 0.5.2 in /pkg/data_cache ([#3074](https://github.com/kubeflow/trainer/pull/3074) by @dependabot[bot]) | ||
| - chore(deps): update huggingface-hub requirement from <1.2,>=0.27.0 to >=0.27.0,<1.4 in /cmd/initializers/dataset ([#3090](https://github.com/kubeflow/trainer/pull/3090) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.53 to 4.5.54 in /pkg/data_cache/test ([#3070](https://github.com/kubeflow/trainer/pull/3070) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/gomega from 1.38.3 to 1.39.0 ([#3085](https://github.com/kubeflow/trainer/pull/3085) by @dependabot[bot]) | ||
| - chore(deps): bump tokio from 1.48.0 to 1.49.0 in /pkg/data_cache/test ([#3069](https://github.com/kubeflow/trainer/pull/3069) by @dependabot[bot]) | ||
| - chore(deps): update huggingface-hub requirement from <1.2,>=0.27.0 to >=0.27.0,<1.4 in /cmd/initializers/model ([#3091](https://github.com/kubeflow/trainer/pull/3091) by @dependabot[bot]) | ||
| - chore(deps): bump mlx-lm from 0.30.0 to 0.30.2 in /cmd/runtimes/mlx ([#3089](https://github.com/kubeflow/trainer/pull/3089) by @dependabot[bot]) | ||
| - chore(deps): bump deepspeed from 0.18.3 to 0.18.4 in /cmd/runtimes/deepspeed ([#3088](https://github.com/kubeflow/trainer/pull/3088) by @dependabot[bot]) | ||
| - chore(deps): bump arrow-flight from 57.1.0 to 57.2.0 in /pkg/data_cache/test ([#3087](https://github.com/kubeflow/trainer/pull/3087) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/ginkgo/v2 from 2.27.3 to 2.27.5 ([#3086](https://github.com/kubeflow/trainer/pull/3086) by @dependabot[bot]) | ||
| - chore(deps): bump golang.org/x/crypto from 0.46.0 to 0.47.0 in the golang group ([#3084](https://github.com/kubeflow/trainer/pull/3084) by @dependabot[bot]) | ||
| - chore(deps): bump mlx[cuda] from 0.30.0 to 0.30.1 in /cmd/runtimes/mlx ([#3053](https://github.com/kubeflow/trainer/pull/3053) by @dependabot[bot]) | ||
| - chore(deps): bump tracing from 0.1.41 to 0.1.44 in /pkg/data_cache/test ([#3051](https://github.com/kubeflow/trainer/pull/3051) by @dependabot[bot]) | ||
| - chore(deps): bump arrow-flight from 55.2.0 to 57.1.0 in /pkg/data_cache/test ([#3055](https://github.com/kubeflow/trainer/pull/3055) by @dependabot[bot]) | ||
| - chore(deps): bump datasets from 4.4.1 to 4.4.2 in /cmd/runtimes/mlx ([#3052](https://github.com/kubeflow/trainer/pull/3052) by @dependabot[bot]) | ||
| - chore(deps): bump mlx-lm from 0.28.4 to 0.30.0 in /cmd/runtimes/mlx ([#3050](https://github.com/kubeflow/trainer/pull/3050) by @dependabot[bot]) | ||
| - chore(deps): bump datasets from 4.4.1 to 4.4.2 in /cmd/runtimes/deepspeed ([#3049](https://github.com/kubeflow/trainer/pull/3049) by @dependabot[bot]) | ||
| - chore(deps): bump bincode from 2.0.1 to 3.0.0 in /pkg/data_cache/test ([#3048](https://github.com/kubeflow/trainer/pull/3048) by @dependabot[bot]) | ||
| - chore(deps): bump sigs.k8s.io/kind from 0.30.0 to 0.31.0 in the kubernetes group ([#3047](https://github.com/kubeflow/trainer/pull/3047) by @dependabot[bot]) | ||
| - chore(cache): Fixing vulnerabilites in data_cache ([#3045](https://github.com/kubeflow/trainer/pull/3045) by @akshaychitneni) | ||
| - chore(deps): bump transformers from 4.57.2 to 4.57.3 in /cmd/runtimes/deepspeed ([#3031](https://github.com/kubeflow/trainer/pull/3031) by @dependabot[bot]) | ||
| - chore(deps): bump nvidia/cuda from 13.0.2-devel-ubuntu22.04 to 13.1.0-devel-ubuntu22.04 in /cmd/runtimes/mlx ([#3036](https://github.com/kubeflow/trainer/pull/3036) by @dependabot[bot]) | ||
| - chore(deps): bump the kubernetes group with 6 updates ([#3035](https://github.com/kubeflow/trainer/pull/3035) by @dependabot[bot]) | ||
| - chore(deps): bump actions/upload-artifact from 5 to 6 ([#3038](https://github.com/kubeflow/trainer/pull/3038) by @dependabot[bot]) | ||
| - chore(deps): bump nvidia/cuda from 13.0.2-devel-ubuntu22.04 to 13.1.0-devel-ubuntu22.04 in /cmd/runtimes/deepspeed ([#3037](https://github.com/kubeflow/trainer/pull/3037) by @dependabot[bot]) | ||
| - chore(deps): bump rust from 1.91-bullseye to 1.92-bullseye in /cmd/data_cache ([#3040](https://github.com/kubeflow/trainer/pull/3040) by @dependabot[bot]) | ||
| - chore(deps): bump deepspeed from 0.18.2 to 0.18.3 in /cmd/runtimes/deepspeed ([#3039](https://github.com/kubeflow/trainer/pull/3039) by @dependabot[bot]) | ||
| - chore(deps): bump mlx-lm from 0.28.3 to 0.28.4 in /cmd/runtimes/mlx ([#3029](https://github.com/kubeflow/trainer/pull/3029) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/ginkgo/v2 from 2.27.2 to 2.27.3 ([#3026](https://github.com/kubeflow/trainer/pull/3026) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/gomega from 1.38.2 to 1.38.3 ([#3027](https://github.com/kubeflow/trainer/pull/3027) by @dependabot[bot]) | ||
| - chore(deps): bump golang.org/x/crypto from 0.45.0 to 0.46.0 in the golang group ([#3025](https://github.com/kubeflow/trainer/pull/3025) by @dependabot[bot]) | ||
| - chore: Add welcome workflow for new contributors ([#3017](https://github.com/kubeflow/trainer/pull/3017) by @ryanHwH20) | ||
| - chore(operator): Remove Unstructured objects caching ([#3010](https://github.com/kubeflow/trainer/pull/3010) by @astefanutti) | ||
| - chore(examples): Add device to local process MNIST training example ([#3006](https://github.com/kubeflow/trainer/pull/3006) by @astefanutti) | ||
| - chore(examples): Use DDP in local container MNIST training example ([#3007](https://github.com/kubeflow/trainer/pull/3007) by @astefanutti) | ||
| - chore(deps): bump bytes from 1.10.1 to 1.11.0 in /pkg/data_cache ([#3001](https://github.com/kubeflow/trainer/pull/3001) by @dependabot[bot]) | ||
| - chore(deps): bump go.uber.org/zap from 1.27.0 to 1.27.1 ([#2998](https://github.com/kubeflow/trainer/pull/2998) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.52 to 4.5.53 in /pkg/data_cache/test ([#3004](https://github.com/kubeflow/trainer/pull/3004) by @dependabot[bot]) | ||
| - chore(deps): bump arrow-flight from 57.0.0 to 57.1.0 in /pkg/data_cache/test ([#3003](https://github.com/kubeflow/trainer/pull/3003) by @dependabot[bot]) | ||
| - chore(deps): bump transformers from 4.57.1 to 4.57.2 in /cmd/runtimes/deepspeed ([#3002](https://github.com/kubeflow/trainer/pull/3002) by @dependabot[bot]) | ||
| - chore(deps): bump actions/checkout from 5 to 6 ([#3000](https://github.com/kubeflow/trainer/pull/3000) by @dependabot[bot]) | ||
| - chore(deps): bump mlx[cuda] from 0.29.4 to 0.30.0 in /cmd/runtimes/mlx ([#2999](https://github.com/kubeflow/trainer/pull/2999) by @dependabot[bot]) | ||
| - chore(deps): bump sigs.k8s.io/structured-merge-diff/v6 from 6.3.0 to 6.3.1 in the kubernetes group ([#2996](https://github.com/kubeflow/trainer/pull/2996) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/open-policy-agent/cert-controller from 0.14.0 to 0.15.0 ([#2997](https://github.com/kubeflow/trainer/pull/2997) by @dependabot[bot]) | ||
| - chore(deps): bump golang.org/x/crypto from 0.44.0 to 0.45.0 ([#2994](https://github.com/kubeflow/trainer/pull/2994) by @dependabot[bot]) | ||
| - chore(deps): bump golang.org/x/crypto from 0.43.0 to 0.44.0 in the golang group ([#2985](https://github.com/kubeflow/trainer/pull/2985) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.51 to 4.5.52 in /pkg/data_cache/test ([#2990](https://github.com/kubeflow/trainer/pull/2990) by @dependabot[bot]) | ||
| - chore(deps): bump async-trait from 0.1.88 to 0.1.89 in /pkg/data_cache ([#2988](https://github.com/kubeflow/trainer/pull/2988) by @dependabot[bot]) | ||
| - chore(deps): bump pytorch/pytorch from 2.9.0-cuda12.8-cudnn9-runtime to 2.9.1-cuda12.8-cudnn9-runtime in /cmd/trainers/torchtune ([#2986](https://github.com/kubeflow/trainer/pull/2986) by @dependabot[bot]) | ||
| - chore(deps): bump the kubernetes group with 6 updates ([#2984](https://github.com/kubeflow/trainer/pull/2984) by @dependabot[bot]) | ||
| - chore(deps): bump bytes from 1.10.1 to 1.11.0 in /pkg/data_cache/test ([#2989](https://github.com/kubeflow/trainer/pull/2989) by @dependabot[bot]) | ||
| - chore(deps): bump mlx-lm from 0.26.3 to 0.28.3 in /cmd/runtimes/mlx ([#2950](https://github.com/kubeflow/trainer/pull/2950) by @dependabot[bot]) | ||
| - chore(deps): update huggingface-hub requirement from <0.28,>=0.27.0 to >=0.27.0,<1.2 in /cmd/initializers/model ([#2957](https://github.com/kubeflow/trainer/pull/2957) by @dependabot[bot]) | ||
| - chore(deps): update huggingface-hub requirement from <0.28,>=0.27.0 to >=0.27.0,<1.2 in /cmd/initializers/dataset ([#2955](https://github.com/kubeflow/trainer/pull/2955) by @dependabot[bot]) | ||
| - chore(deps): bump datasets from 4.0.0 to 4.4.1 in /cmd/runtimes/deepspeed ([#2944](https://github.com/kubeflow/trainer/pull/2944) by @dependabot[bot]) | ||
| - chore(deps): bump mlx[cuda] from 0.28.0 to 0.29.3 in /cmd/runtimes/mlx ([#2956](https://github.com/kubeflow/trainer/pull/2956) by @dependabot[bot]) | ||
| - chore(deps): bump transformers from 4.55.0 to 4.57.1 in /cmd/runtimes/deepspeed ([#2961](https://github.com/kubeflow/trainer/pull/2961) by @dependabot[bot]) | ||
| - chore(deps): bump deepspeed from 0.17.4 to 0.18.2 in /cmd/runtimes/deepspeed ([#2954](https://github.com/kubeflow/trainer/pull/2954) by @dependabot[bot]) | ||
| - chore(deps): bump nvidia/cuda from 12.8.1-devel-ubuntu22.04 to 13.0.2-devel-ubuntu22.04 in /cmd/runtimes/deepspeed ([#2939](https://github.com/kubeflow/trainer/pull/2939) by @dependabot[bot]) | ||
| - chore(deps): bump pytorch/pytorch from 2.7.1-cuda12.8-cudnn9-runtime to 2.9.0-cuda12.8-cudnn9-runtime in /cmd/trainers/torchtune ([#2934](https://github.com/kubeflow/trainer/pull/2934) by @dependabot[bot]) | ||
| - chore(deps): bump datasets from 4.0.0 to 4.4.1 in /cmd/runtimes/mlx ([#2943](https://github.com/kubeflow/trainer/pull/2943) by @dependabot[bot]) | ||
| - chore(deps): bump nvidia/cuda from 12.8.1-devel-ubuntu22.04 to 13.0.2-devel-ubuntu22.04 in /cmd/runtimes/mlx ([#2932](https://github.com/kubeflow/trainer/pull/2932) by @dependabot[bot]) | ||
| - chore(deps): bump mpi4py from 4.1.0 to 4.1.1 in /cmd/runtimes/deepspeed ([#2958](https://github.com/kubeflow/trainer/pull/2958) by @dependabot[bot]) | ||
| - chore(deps): bump bincode from 1.3.3 to 2.0.1 in /pkg/data_cache/test ([#2949](https://github.com/kubeflow/trainer/pull/2949) by @dependabot[bot]) | ||
| - chore(deps): bump tonic from 0.12.3 to 0.14.2 in /pkg/data_cache/test ([#2962](https://github.com/kubeflow/trainer/pull/2962) by @dependabot[bot]) | ||
| - chore(deps): bump serde from 1.0.225 to 1.0.228 in /pkg/data_cache/test ([#2959](https://github.com/kubeflow/trainer/pull/2959) by @dependabot[bot]) | ||
| - chore(deps): bump actions/checkout from 4 to 5 ([#2974](https://github.com/kubeflow/trainer/pull/2974) by @dependabot[bot]) | ||
| - chore(deps): bump serde from 1.0.215 to 1.0.228 in /pkg/data_cache ([#2978](https://github.com/kubeflow/trainer/pull/2978) by @dependabot[bot]) | ||
| - chore(deps): bump actions/setup-go from 5 to 6 ([#2975](https://github.com/kubeflow/trainer/pull/2975) by @dependabot[bot]) | ||
| - chore(deps): bump amannn/action-semantic-pull-request from 5.5.3 to 6.1.1 ([#2976](https://github.com/kubeflow/trainer/pull/2976) by @dependabot[bot]) | ||
| - chore(deps): bump arrow-flight from 55.2.0 to 57.0.0 in /pkg/data_cache/test ([#2973](https://github.com/kubeflow/trainer/pull/2973) by @dependabot[bot]) | ||
| - chore(deps): bump actions/setup-python from 5 to 6 ([#2977](https://github.com/kubeflow/trainer/pull/2977) by @dependabot[bot]) | ||
| - chore(deps): bump python from 3.11-slim-bookworm to 3.14-slim-bookworm in /cmd/initializers/model ([#2951](https://github.com/kubeflow/trainer/pull/2951) by @dependabot[bot]) | ||
| - chore(deps): bump python from 3.11-slim-bookworm to 3.14-slim-bookworm in /cmd/initializers/dataset ([#2941](https://github.com/kubeflow/trainer/pull/2941) by @dependabot[bot]) | ||
| - chore(deps): bump sentencepiece from 0.2.0 to 0.2.1 in /cmd/runtimes/deepspeed ([#2948](https://github.com/kubeflow/trainer/pull/2948) by @dependabot[bot]) | ||
| - chore(deps): bump tokio from 1.47.1 to 1.48.0 in /pkg/data_cache/test ([#2963](https://github.com/kubeflow/trainer/pull/2963) by @dependabot[bot]) | ||
| - chore(deps): bump clap from 4.5.43 to 4.5.51 in /pkg/data_cache/test ([#2965](https://github.com/kubeflow/trainer/pull/2965) by @dependabot[bot]) | ||
| - chore(deps): bump tokio from 1.46.1 to 1.48.0 in /pkg/data_cache ([#2966](https://github.com/kubeflow/trainer/pull/2966) by @dependabot[bot]) | ||
| - chore(deps): bump aquasecurity/trivy-action from 0.28.0 to 0.33.1 ([#2947](https://github.com/kubeflow/trainer/pull/2947) by @dependabot[bot]) | ||
| - chore(deps): bump actions/stale from 9 to 10 ([#2942](https://github.com/kubeflow/trainer/pull/2942) by @dependabot[bot]) | ||
| - chore(deps): bump mpioperator/base from v0.6.0 to v0.7.0 in /cmd/runtimes/deepspeed ([#2938](https://github.com/kubeflow/trainer/pull/2938) by @dependabot[bot]) | ||
| - chore(deps): bump golang from 1.24 to 1.25 in /cmd/trainer-controller-manager ([#2935](https://github.com/kubeflow/trainer/pull/2935) by @dependabot[bot]) | ||
| - chore(deps): bump actions/github-script from 7 to 8 ([#2937](https://github.com/kubeflow/trainer/pull/2937) by @dependabot[bot]) | ||
| - chore(deps): bump actions/upload-artifact from 4 to 5 ([#2936](https://github.com/kubeflow/trainer/pull/2936) by @dependabot[bot]) | ||
| - chore(deps): bump mpioperator/base from v0.6.0 to v0.7.0 in /cmd/runtimes/mlx ([#2933](https://github.com/kubeflow/trainer/pull/2933) by @dependabot[bot]) | ||
| - chore(deps): bump rust from 1.85-bullseye to 1.91-bullseye in /cmd/data_cache ([#2931](https://github.com/kubeflow/trainer/pull/2931) by @dependabot[bot]) | ||
| - chore(deps): bump github/codeql-action from 3 to 4 ([#2953](https://github.com/kubeflow/trainer/pull/2953) by @dependabot[bot]) | ||
| - chore(deps): bump github.com/onsi/ginkgo/v2 from 2.25.3 to 2.27.2 ([#2952](https://github.com/kubeflow/trainer/pull/2952) by @dependabot[bot]) | ||
| - chore(deps): bump sigs.k8s.io/controller-runtime from 0.22.3 to 0.22.4 in the kubernetes group ([#2940](https://github.com/kubeflow/trainer/pull/2940) by @dependabot[bot]) | ||
| - chore(deps): bump golang.org/x/crypto from 0.41.0 to 0.43.0 in the golang group ([#2945](https://github.com/kubeflow/trainer/pull/2945) by @dependabot[bot]) | ||
| - chore(operator): Use SSA throughout runtime framework ([#2877](https://github.com/kubeflow/trainer/pull/2877) by @astefanutti) | ||
| ### New Contributors | ||
| * @puwun made their first contribution in [#3202](https://github.com/kubeflow/trainer/pull/3202) | ||
| * @khushiiagrawal made their first contribution in [#3124](https://github.com/kubeflow/trainer/pull/3124) | ||
| * @Snehadas2005 made their first contribution in [#3063](https://github.com/kubeflow/trainer/pull/3063) | ||
| * @Ishtiyaque-Alam made their first contribution in [#3076](https://github.com/kubeflow/trainer/pull/3076) | ||
| * @sameerdattav made their first contribution in [#3146](https://github.com/kubeflow/trainer/pull/3146) | ||
| * @Fiona-Waters made their first contribution in [#3167](https://github.com/kubeflow/trainer/pull/3167) | ||
| * @aniketpati1121 made their first contribution in [#3153](https://github.com/kubeflow/trainer/pull/3153) | ||
| * @yosri-brh made their first contribution in [#3141](https://github.com/kubeflow/trainer/pull/3141) | ||
| * @robert-bell made their first contribution in [#3102](https://github.com/kubeflow/trainer/pull/3102) | ||
| * @sksingh2005 made their first contribution in [#2982](https://github.com/kubeflow/trainer/pull/2982) | ||
| * @Sanskarzz made their first contribution in [#3066](https://github.com/kubeflow/trainer/pull/3066) | ||
| * @ChughShilpa made their first contribution in [#3056](https://github.com/kubeflow/trainer/pull/3056) | ||
| * @vsoch made their first contribution in [#2909](https://github.com/kubeflow/trainer/pull/2909) | ||
| * @ryanHwH20 made their first contribution in [#3017](https://github.com/kubeflow/trainer/pull/3017) | ||
| * @kannon92 made their first contribution in [#2969](https://github.com/kubeflow/trainer/pull/2969) | ||
| * @adity1raut made their first contribution in [#2906](https://github.com/kubeflow/trainer/pull/2906) | ||
The CHANGELOG contains entries for both v2.2.2 and v99.0.0 releases. The v99.0.0 entry appears to be test data from testing the release automation and should be removed before merging this PR.
| ### ⚙️ Miscellaneous Tasks | ||
| - chore(release): Release v99.0.0 (@krishna-kg732) |
The CHANGELOG includes a commit referencing v99.0.0 in the v2.2.2 release notes. This test data should be removed as it will confuse users about the actual release history.
| @@ -1 +1 @@ | |||
| v2.1.0 No newline at end of file | |||
| v2.2.2 No newline at end of file | |||
The PR title indicates this is a release for v2.2.0, but all version files have been updated to v2.2.2. This mismatch suggests either the PR title is incorrect or the wrong version was used in the release script.
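A mismatch like this can be caught mechanically before review. The following is a minimal sketch, not the PR's actual check: it assumes the title contains a `vX.Y.Z` token and that the version lives in a single-line file such as the one in the diff above. The function names `extract_version` and `check_version_match` are hypothetical and do not come from `hack/release.sh` or `check-release.yaml`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the PR.

# Print the first vX.Y.Z token found in the given text.
# Returns non-zero when the text contains no such token.
extract_version() {
  printf '%s\n' "$1" | grep -Eo 'v[0-9]+\.[0-9]+\.[0-9]+' | head -n 1 | grep .
}

# Succeed only when the version named in the title equals the
# content of the version file. Returns 1 on mismatch, 2 on bad input.
check_version_match() {
  _title_version="$(extract_version "$1")" || return 2
  # Command substitution drops a trailing newline if the file has one.
  _file_version="$(cat "$2")" || return 2
  if [ "$_title_version" != "$_file_version" ]; then
    echo "title says $_title_version but $2 says $_file_version" >&2
    return 1
  fi
}
```

Called as `check_version_match "$PR_TITLE" VERSION` (both arguments are assumptions about how a workflow would supply them), the function fails for a title naming v2.2.0 when the file holds v2.2.2.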
| docker run --rm -u "$(id -u):$(id -g)" -v "$ABSOLUTE_REPO_ROOT:/app" \ | ||
| -e "GITHUB_TOKEN=${GITHUB_TOKEN:-}" -w /app \ | ||
| "ghcr.io/orhun/git-cliff/git-cliff:latest" --unreleased --tag "$TAG" -o - > "$TEMP_FILE" |
The docker run invocation uses the mutable image tag ghcr.io/orhun/git-cliff/git-cliff:latest and passes GITHUB_TOKEN into the container as an environment variable. This is a concrete supply-chain risk: if that third-party image is ever compromised, it can exfiltrate the token and tamper with release artifacts. Because this script will run with real PATs or GitHub tokens, a malicious or hijacked latest image would immediately gain whatever repository and workflow permissions the token carries. Pin the image to an immutable digest or a vetted version tag, and do not pass GITHUB_TOKEN into the container unless the operation requires it and the token's scope is strictly minimized.
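The pinning advice can also be enforced by the script itself. The sketch below assumes a hypothetical helper `is_digest_pinned` and a hypothetical `GIT_CLIFF_IMAGE` variable, neither of which exists in the PR. The digest itself has to come from an image the maintainers have vetted, so none is shown here.

```shell
#!/bin/sh
# Hypothetical guard for illustration; not part of the PR's hack/release.sh.

# Succeed only if the image reference ends in an immutable sha256 digest
# (64 lowercase hex characters). Mutable tags such as ":latest" are rejected.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Usage sketch, left commented out because GIT_CLIFF_IMAGE is an assumption
# and must be set by a maintainer to a vetted, digest-pinned reference:
#
# is_digest_pinned "$GIT_CLIFF_IMAGE" || {
#   echo "refusing to run: image is not pinned by digest" >&2
#   exit 1
# }
# docker run --rm -u "$(id -u):$(id -g)" -v "$ABSOLUTE_REPO_ROOT:/app" -w /app \
#   "$GIT_CLIFF_IMAGE" --unreleased --tag "$TAG" -o - > "$TEMP_FILE"
```

The commented invocation omits `-e GITHUB_TOKEN`. Whether git-cliff can still produce the same changelog without the token depends on the `cliff.toml` configuration, so treat that omission as something to verify rather than a drop-in change.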
Test release