Commit cfd3cd8

committed
address feedback from review
1 parent e478776 commit cfd3cd8

1 file changed

+30
-18
lines changed

wg-batch/annual-report-2024.md

Lines changed: 30 additions & 18 deletions
@@ -18,7 +18,7 @@ We will breakdown our highlights into Sub Projects, KEPs, talks, community adopt
 
 ##### Kueue
 
-Kueue has had 5 releases in 2024.
+Kueue has had 5 minor releases in 2024.
 
 - [Release 0.6](https://github.com/kubernetes-sigs/kueue/releases/tag/v0.6.0)
 
@@ -32,19 +32,25 @@ Kueue has had 5 releases in 2024.
 
 In 2024, the kueue community would like to highlight are Topology aware scheduling, MultiKueue, Kueue Dashboard, KueueCtrl, Deployment/Statefulset integration for serving and Fair sharing.
 
-Topology aware scheduling facilitates scheduling of workloads that take in account data center topology. Workloads benefit from using interconnects that are physically close together.
+[Topology aware scheduling](https://kueue.sigs.k8s.io/docs/concepts/topology_aware_scheduling/) facilitates scheduling of workloads that take in account data center topology.
+Workloads benefit from using interconnects that are physically close together.
 
-MultiKueue provides a way of dispatching batch workloads to worker clusters. Kueue provides multicluster dispatching for popular batch workloads such as Ray, Job, Kubeflow and JobSet. This feature went beta in 0.9.
+[MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) provides a way of dispatching batch workloads to worker clusters.
+Kueue provides multicluster dispatching for popular batch workloads such as Ray, Job, Kubeflow and JobSet.
+This feature went beta in 0.9.
 
-Kueue Dashboards has been a popular ask for Kueue. Users would like to have a visualization representation of queueing and we are happy to announce that a dashboard has been created for Kueue. This went into kueue in late 2024 and a big focus of 2025 will be to harden this for production.
+[Kueue Dashboards](https://github.com/kubernetes-sigs/kueue/tree/release-0.10/cmd/experimental/kueue-viz) has been a popular ask for Kueue.
+Users would like to have a visualization representation of queueing and we are happy to announce that a dashboard has been created for Kueue.
+This went into kueue in late 2024 and a big focus of 2025 will be to harden this for production.
 
-KueueCtrl provides a cli for creating kueue objects. The plugin is hosted in krew and is easily installed as a kueue plugin.
+[KueueCtl](https://kueue.sigs.k8s.io/docs/reference/kubectl-kueue/) provides a cli for creating kueue objects.
+The plugin is hosted in krew and is easily installed as a kueue plugin.
 
-Deployment/StatefulSet integration provides an avenue for the usage of Kueue for serving workloads. Serving leads to a need for sharing/preemption of model servers that may leverage accelerators. Kueue provides an integration with popular methods of deploying services (Deployment/StatefulSet).
+[Deployment](https://kueue.sigs.k8s.io/docs/tasks/run/deployment/) and [StatefulSet](https://kueue.sigs.k8s.io/docs/tasks/run/statefulset/) integration provides an avenue for the usage of Kueue for serving workloads. Serving leads to a need for sharing/preemption of model servers that may leverage accelerators. Kueue provides an integration with popular methods of deploying services (Deployment/StatefulSet).
 
 ##### JobSet
 
-Jobset has had 4 release in 2024.
+Jobset has had 4 minor releases in 2024.
 
 - [Release 0.4](https://github.com/kubernetes-sigs/jobset/releases/tag/v0.4.0)
 

@@ -54,14 +60,14 @@ Jobset has had 4 release in 2024.
 
 - [Release 0.7](https://github.com/kubernetes-sigs/jobset/releases/tag/v0.7.0)
 
-A major achievement of JobSet has been the adoption of JobSet as a component for Kubeflow Training Operator V2.
+A major achievement of JobSet has been the adoption of JobSet as a component for [Kubeflow Training Operator](https://github.com/kubeflow/training-operator) V2.
 There has been a collaborative effort with the Kubeflow community and the batch community to implement the features needed for this integration.
 
 [Metaflow](https://github.com/Netflix/metaflow/pull/1804) has adopted the use of JobSet for distributed ML training.
 
 ##### KJob
 
-[KJob](https://github.com/kubernetes-sigs/kjob?tab=readme-ov-file#kjob) has been started to provide a CLI friendly way for users to submit batch jobs.
+[KJob](https://github.com/kubernetes-sigs/kjob) has been started to provide a CLI friendly way for users to submit batch jobs.
 The HPC/ML community tend to prefer CLI over YAML so the focus was to provide a templated solution for submitting batch jobs.
 Another focus of this project is to provide a smooth transition for Slurm users.
 

@@ -87,32 +93,39 @@ WG-Batch provided a series of kubernetes enhancements that improved the experien
 ### Talks
 
 - WG-Batch Update at Kubecon
-- Authors: Kevin Hannon and Marcin Wielgus
+- Speakers: Kevin Hannon and Marcin Wielgus
 - Kubecon NA, Salt Lake City
+- [Recording](https://www.youtube.com/watch?v=C2ABOEzZTWg&list=PLj6h78yzYM2Pw4mRw4S-1p_xLARMqPkA7&index=283&pp=iAQB)
 
 - Keynote: MultiCluster Batch Jobs Dispatching with Kueue at CERN
-- Authors: Ricardo Rocha and Marcin Wielgus
+- Speakers: Ricardo Rocha and Marcin Wielgus
 - Kubecon NA, Salt Lake City
+- [Recording](https://www.youtube.com/watch?v=xMmskWIlktA&list=PLj6h78yzYM2Pw4mRw4S-1p_xLARMqPkA7&index=193&pp=iAQB)
 
 - Multitenancy and Fairness at Scale with Kueue: A Case Study
-- Authors: Aldo Culquicondor and Rajat Phull
+- Speakers: Aldo Culquicondor and Rajat Phull
 - Kubecon NA, Salt Lake City
+- [Recording](https://www.youtube.com/watch?v=GYiuTQCvTx8&list=PLj6h78yzYM2Mvqk_mNejD7kbe3tldxxsr&index=5&pp=iAQB)
 
 - Advanced Resource Management for Running AI/ML Workloads with Kueue
-- Authors: Michał Woźniak and Yuki Iwai
+- Speakers: Michał Woźniak and Yuki Iwai
 - Kubecon EU, Paris
+- [Recording](https://www.youtube.com/watch?v=6k_8Go3u8Qk)
 
 - Scale Your Batch / Big Data / AI Workloads Beyond the Kubernetes Scheduler
-- Authors: Antonin Stefanutti and Anish Asthana
+- Speaker: Antonin Stefanutti and Anish Asthana
 - KubeCon EU, Paris
+- [Recording](https://www.youtube.com/watch?v=Ij5EAnuF-jk&list=PLj6h78yzYM2PWGv34W6w5ssq1b1meRmY7&index=15&pp=iAQB)
 
 - WG-Batch Update
-- Author: Marcin Wielgus
+- Speaker: Michał Woźniak and Yuki Iwai
 - KubeCon EU, Paris
+- [Recording](https://www.youtube.com/watch?v=2D2QSzUnS0M&list=PLj6h78yzYM2N8nw1YcqqKveySH6_0VnI0&index=84&pp=iAQB)
 
 - How the Kubernetes Community is Improving Kubernetes for HPC/AI/ML Workloads
 - Author: Kevin Hannon
 - FOSDEM 2024
+- [Recording](https://live.fosdem.org/watch/ua2118)
 
 ### Community adoption
 

@@ -130,9 +143,8 @@ Operational tasks in [wg-governance.md]:
 - [x] WG leaders in [sigs.yaml] are accurate and active, and updated if needed
 - [x] Meeting notes and recordings for 2024 are linked from [README.md] and updated/uploaded if needed
 - [x] Updates provided to sponsoring SIGs in 2024
-- WG-Batch Updates at Kubecon EU 2024
-- WG-Batch Updates at Kubecon NA 2024
-
+- [WG-Batch Updates at Kubecon EU 2024](https://www.youtube.com/watch?v=2D2QSzUnS0M&list=PLj6h78yzYM2N8nw1YcqqKveySH6_0VnI0&index=84&pp=iAQB)
+- [WG-Batch Updates at Kubecon NA 2024](https://www.youtube.com/watch?v=C2ABOEzZTWg&list=PLj6h78yzYM2Pw4mRw4S-1p_xLARMqPkA7&index=283&pp=iAQB)
 [wg-governance.md]: https://git.k8s.io/community/committee-steering/governance/wg-governance.md
 [README.md]: https://git.k8s.io/community/wg-batch/README.md
 [sigs.yaml]: https://git.k8s.io/community/sigs.yaml
