Commit c4f291c

Fix meeting minutes
1 parent 9ee8fdd commit c4f291c

2 files changed: +4 -0 lines changed

website/community/steering/product/2025-03-scaling-recommender.md

Lines changed: 2 additions & 0 deletions
@@ -8,12 +8,14 @@ publishdate: 2025-03-17
- 🗓️ **Presentation:** 2025-03-17, 10:00 - 11:00 CET
- 🎥 **Recording:** [click here](https://youtu.be/u4-fWwKITuM)
- <details closed><summary>📝 <b>Meeting Minutes</b></summary>
+
- Madhav (and Tarun on CA inefficiencies) presented the recommender proposal.
- Questions were raised about which issues could realistically be changed upstream (if the committers approve) and which cannot be changed (fundamental issues).
- One fundamental issue is that CA looks only at one node group at a time and therefore only considers filtering, never scoring (there is nothing to score since only nodes in one node group are analyzed). Consequently, follow-up issues such as zone imbalance or sub-optimal recommendations cannot be addressed either.
- The concern was raised that while the recommender is being developed, the community will progress and implement, e.g., resource reservations. However, feedback is not all positive: this proposal is criticized for not solving the complex requirements of modern GPU workloads and pod (gang) scheduling. Also, because the suggested recommender will directly leverage the kube-scheduler, there will be reduced (sometimes no) need to duplicate this kind of logic in the recommender, whether for this feature or for upcoming ones.
- The concern was raised that virtualizing the API server and etcd may require significant effort, and the question was asked whether we can instead contribute upstream changes to the kube-scheduler so that it returns recommendations. However, that seems unlikely to succeed because it would complicate the kube-scheduler further (mixing in recommendations), require making the machine options to pick from available to it (today, it only knows of/looks at existing nodes), and break the one-pod-at-a-time scheduling principle it follows today (CA and the recommender need to look at all pending pods to make a sensible recommendation). Furthermore, virtualizing the API server and etcd is probably not much work (as seen in the PoC) because we need to implement "only" the API surface required by the kube-scheduler and hold the data in memory. CA went another way, but in the end, the data is also held in memory there.
- The proposal was made to present the scaling recommender in SIG Auto-Scaling to get feedback on whether the proposal makes sense, independent of whether anyone but us wants to implement it.
+
</details>
- 👨‍⚖️ **Decisions:**
- Investment was approved, considering the many issues listed in the motivational document.

website/community/steering/technical/2025-03-observability-2.0.md

Lines changed: 2 additions & 0 deletions
@@ -8,9 +8,11 @@ publishdate: 2025-03-20
- 🗓️ **Presentation:** 2025-03-20, 15:00 - 16:00 CET
- 🎥 **Recording:** [click here](https://youtu.be/rH9EDAsxrbg)
- <details closed><summary>📝 <b>Meeting Minutes</b></summary>
+
- Nikolai presented the plans for Observability 2.0.
- There was general consensus that this is a huge improvement.
- It was proposed to file individual GEPs for the details. Nikolai sees at least three coming up, maybe more (recording 1:01:41)
+
</details>
- 👨‍⚖️ **Decisions:**
- The decision was taken to implement Observability 2.0.
