
Commit c0bdce3

Merge pull request #48065 from chanieljdan/merged-main-dev-1.32
Merge main branch into dev 1.32
2 parents f9610cd + 46f78dd commit c0bdce3

File tree

82 files changed (+1570 / -402 lines)

Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
---
layout: blog
title: "Spotlight on SIG Scheduling"
slug: sig-scheduling-spotlight-2024
canonicalUrl: https://www.kubernetes.dev/blog/2024/09/24/sig-scheduling-spotlight-2024
date: 2024-09-24
author: "Arvind Parekh"
---

In this SIG Scheduling spotlight we talked with [Kensei Nakada](https://github.com/sanposhiho/), an approver in SIG Scheduling.

## Introductions

**Arvind:** **Hello, thank you for the opportunity to learn more about SIG Scheduling! Would you like to introduce yourself and tell us a bit about your role, and how you got involved with Kubernetes?**

**Kensei**: Hi, thanks for the opportunity! I’m Kensei Nakada ([@sanposhiho](https://github.com/sanposhiho/)), a software engineer at [Tetrate.io](https://tetrate.io/). I have been contributing to Kubernetes in my free time for more than 3 years, and now I’m an approver of SIG Scheduling in Kubernetes. Also, I’m a founder/owner of two SIG subprojects, [kube-scheduler-simulator](https://github.com/kubernetes-sigs/kube-scheduler-simulator) and [kube-scheduler-wasm-extension](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension).

## About SIG Scheduling

**AP: That's awesome! You've been involved with the project for a long time. Can you provide a brief overview of SIG Scheduling and explain its role within the Kubernetes ecosystem?**

**KN**: As the name implies, our responsibility is to enhance scheduling within Kubernetes. Specifically, we develop the components that determine which Node is the best place for each Pod. In Kubernetes, our main focus is on maintaining the [kube-scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/), along with other scheduling-related components as part of our SIG subprojects.

**AP: I see, got it! That makes me curious--what recent innovations or developments has SIG Scheduling introduced to Kubernetes scheduling?**

**KN**: From a feature perspective, there have been [several enhancements](https://kubernetes.io/blog/2023/04/17/fine-grained-pod-topology-spread-features-beta/) to `PodTopologySpread` recently. `PodTopologySpread` is a relatively new feature in the scheduler, and we are still in the process of gathering feedback and making improvements.

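For readers who haven't used it, here is a minimal sketch (not part of the interview) of what a topology spread constraint looks like when built with the `core/v1` Go types; the pod name, labels, and values are illustrative assumptions:

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// spreadPod sketches a Pod that asks the scheduler to spread "app: web"
// replicas across zones, tolerating a skew of at most one Pod between zones.
func spreadPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "web-0",
			Labels: map[string]string{"app": "web"},
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{Name: "web", Image: "nginx"}},
			TopologySpreadConstraints: []corev1.TopologySpreadConstraint{{
				MaxSkew:           1,                             // tolerate at most 1 extra Pod per zone
				TopologyKey:       "topology.kubernetes.io/zone", // spread across zones
				WhenUnsatisfiable: corev1.DoNotSchedule,          // treat as a hard constraint
				LabelSelector: &metav1.LabelSelector{
					MatchLabels: map[string]string{"app": "web"}, // count Pods with this label
				},
			}},
		},
	}
}
```
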
Most recently, we have been focusing on a new internal enhancement called
47+
[QueueingHint](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/4247-queueinghint/README.md)
48+
which aims to enhance scheduling throughput. Throughput is one of our crucial metrics in
49+
scheduling. Traditionally, we have primarily focused on optimizing the latency of each scheduling
50+
cycle. QueueingHint takes a different approach, optimizing when to retry scheduling, thereby
51+
reducing the likelihood of wasting scheduling cycles.
52+
53+
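To make the idea concrete, here is a rough Go sketch of the kind of callback QueueingHint introduces: a plugin that rejected a Pod registers a hint function per cluster event, and the function decides whether retrying could possibly succeed now. The signature is paraphrased from the scheduler framework and `nodeCouldFit` is a hypothetical helper, so treat the details as assumptions rather than the exact upstream API:

```go
package example

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/klog/v2"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

// isSchedulableAfterNodeChange is called when a Node event arrives for a Pod
// that a plugin previously rejected. Returning QueueSkip tells the scheduler
// that retrying the Pod now would only waste a scheduling cycle; returning
// Queue requeues the Pod for another attempt.
func isSchedulableAfterNodeChange(logger klog.Logger, pod *v1.Pod, oldObj, newObj interface{}) (framework.QueueingHint, error) {
	node, ok := newObj.(*v1.Node)
	if !ok {
		return framework.Queue, fmt.Errorf("expected *v1.Node, got %T", newObj)
	}
	// Hypothetical check: only retry if the updated Node could now host the Pod.
	if nodeCouldFit(node, pod) {
		logger.V(5).Info("node update may make pod schedulable", "pod", pod.Name, "node", node.Name)
		return framework.Queue, nil
	}
	return framework.QueueSkip, nil
}

// nodeCouldFit is a placeholder for plugin-specific logic (resources, taints, ...).
func nodeCouldFit(node *v1.Node, pod *v1.Pod) bool { return true }
```
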
**AP: That sounds interesting! Are there any other interesting topics or projects you are currently working on within SIG Scheduling?**

**KN**: I’m leading the development of `QueueingHint`, which I just shared. Given that it’s a big new challenge for us, we’ve been facing many unexpected issues, especially around scalability, and we’re trying to solve each of them to eventually enable it by default.

And also, I believe [kube-scheduler-wasm-extension](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension) (a SIG subproject) that I started last year would be interesting to many people. Kubernetes has various extension points across many components. Traditionally, extensions are provided via webhooks ([extender](https://github.com/kubernetes/design-proposals-archive/blob/main/scheduling/scheduler_extender.md) in the scheduler) or a Go SDK ([Scheduling Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/) in the scheduler). However, these come with drawbacks: performance issues with webhooks, and the need to rebuild and replace the scheduler with the Go SDK, posing difficulties for those seeking to extend the scheduler but lacking familiarity with it. The project is trying to introduce a new solution to this general challenge - a [WebAssembly](https://webassembly.org/)-based extension. Wasm allows users to build plugins easily, without worrying about recompiling or replacing their scheduler, and sidestepping performance concerns.

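As context for the Go SDK route mentioned above, a Scheduling Framework plugin is roughly a Go type compiled into the scheduler binary. The following minimal Filter plugin is only a sketch: the interface details are paraphrased from the framework, and the "reject Nodes without a zone label" rule is an invented example.

```go
package example

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

// zoneOnly is a toy Filter plugin: it rejects Nodes that carry no zone label.
// Real plugins typically implement more extension points
// (PreFilter, Score, Reserve, Permit, ...).
type zoneOnly struct{}

var _ framework.FilterPlugin = &zoneOnly{}

func (z *zoneOnly) Name() string { return "ZoneOnly" }

func (z *zoneOnly) Filter(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	node := nodeInfo.Node()
	if node == nil {
		return framework.NewStatus(framework.Error, "node not found")
	}
	if _, ok := node.Labels["topology.kubernetes.io/zone"]; !ok {
		// Unschedulable on this Node; the scheduler keeps evaluating other Nodes.
		return framework.NewStatus(framework.Unschedulable, "node has no zone label")
	}
	return nil // nil status means the Node passes this filter
}
```

Using such a plugin means registering it in a custom scheduler binary and redeploying it, which is exactly the rebuild-and-replace burden the WebAssembly approach tries to remove.
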
Through this project, sig-scheduling has been gaining valuable insights about WebAssembly’s interaction with large Kubernetes objects. And I believe the experience that we’re gaining should be useful broadly within the community, beyond sig-scheduling.

**AP: Definitely! Now, there are currently 8 subprojects inside SIG Scheduling. Would you like to talk about them? Are there some interesting contributions by those teams you want to highlight?**

**KN**: Let me highlight three subprojects: Kueue, KWOK, and the descheduler.

[Kueue](https://github.com/kubernetes-sigs/kueue):
: Recently, many people have been trying to manage batch workloads with Kubernetes, and in 2022, the Kubernetes community founded [WG-Batch](https://github.com/kubernetes/community/blob/master/wg-batch/README.md) for better support for such batch workloads in Kubernetes. [Kueue](https://github.com/kubernetes-sigs/kueue) is a project that plays a crucial role in that effort. It’s a job queueing controller, deciding when a job should wait, when a job should be admitted to start, and when a job should be preempted. Kueue aims to be installed on a vanilla Kubernetes cluster while cooperating with existing mature controllers (scheduler, cluster-autoscaler, kube-controller-manager, etc).

[KWOK](https://github.com/kubernetes-sigs/kwok):
: KWOK is a component with which you can create a cluster of thousands of Nodes in seconds. It’s mostly useful for simulation/testing as a lightweight cluster, and another SIG subproject, [kube-scheduler-simulator](https://github.com/kubernetes-sigs/kube-scheduler-simulator), actually uses KWOK in the background.

[descheduler](https://github.com/kubernetes-sigs/descheduler):
: Descheduler is a component that recreates pods that are running on undesired Nodes. In Kubernetes, scheduling constraints (`PodAffinity`, `NodeAffinity`, `PodTopologySpread`, etc) are honored only at Pod scheduling time, but it’s not guaranteed that the constraints remain satisfied afterwards. Descheduler evicts Pods violating their scheduling constraints (or other undesired conditions) so that they’re recreated and rescheduled.

[Descheduling Framework](https://github.com/kubernetes-sigs/descheduler/blob/master/keps/753-descheduling-framework/README.md):
: One very interesting ongoing project, similar to the [Scheduling Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/) in the scheduler, aiming to make descheduling logic extensible and allow maintainers to focus on building the core engine of the descheduler.

**AP: Thank you for letting us know! And I have to ask, what are some of your favorite things about this SIG?**

**KN**: What I really like about this SIG is how actively engaged everyone is. We come from various companies and industries, bringing diverse perspectives to the table. Instead of these differences causing division, they actually generate a wealth of opinions. Each view is respected, and this makes our discussions both rich and productive.

I really appreciate this collaborative atmosphere, and I believe it has been key to continuously improving our components over the years.

## Contributing to SIG Scheduling

**AP: Kubernetes is a community-driven project. Any recommendations for new contributors or beginners looking to get involved and contribute to SIG Scheduling? Where should they start?**

**KN**: Let me start with a general recommendation for contributing to any SIG: a common approach is to look for [good-first-issue](https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22). However, you'll soon realize that many people worldwide are trying to contribute to the Kubernetes repository.

I suggest starting by examining the implementation of a component that interests you. If you have any questions about it, ask in the corresponding Slack channel (e.g., #sig-scheduling for the scheduler, #sig-node for kubelet, etc). Once you have a rough understanding of the implementation, look at issues within the SIG (e.g., [sig-scheduling](https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue+label%3Asig%2Fscheduling)), where you'll find more unassigned issues compared to good-first-issue ones. You may also want to filter issues with the [kind/cleanup](https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue++label%3Akind%2Fcleanup+) label, which often indicates lower-priority tasks and can be good starting points.

Specifically for SIG Scheduling, you should first understand the [Scheduling Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), which is the fundamental architecture of kube-scheduler. Most of the implementation is found in [pkg/scheduler](https://github.com/kubernetes/kubernetes/tree/master/pkg/scheduler). I suggest starting with the [ScheduleOne](https://github.com/kubernetes/kubernetes/blob/0590bb1ac495ae8af2a573f879408e48800da2c5/pkg/scheduler/schedule_one.go#L66) function and then exploring deeper from there.

Additionally, apart from the main kubernetes/kubernetes repository, consider looking into sub-projects. These typically have fewer maintainers and offer more opportunities to make a significant impact. Despite being called "sub" projects, many have a large number of users and a considerable impact on the community.

And last but not least, remember that contributing to the community isn’t just about code. While I talked a lot about contributing to the implementation, there are many ways to contribute, and each one is valuable. One comment on an issue, one piece of feedback on an existing feature, one review comment on a PR, one clarification in the documentation; every small contribution helps drive the Kubernetes ecosystem forward.

**AP: Those are some pretty useful tips! And if I may ask, how do you assist new contributors in getting started, and what skills are contributors likely to learn by participating in SIG Scheduling?**

**KN**: Our maintainers are available to answer your questions in the #sig-scheduling Slack channel. By participating, you'll gain a deeper understanding of Kubernetes scheduling and have the opportunity to collaborate and network with maintainers from diverse backgrounds. You'll learn not just how to write code, but also how to maintain a large project, design and discuss new features, address bugs, and much more.

## Future Directions

**AP: What are some Kubernetes-specific challenges in terms of scheduling? Are there any particular pain points?**

**KN**: Scheduling in Kubernetes can be quite challenging because of the diverse needs of different organizations with different business requirements. Supporting all possible use cases in kube-scheduler is impossible. Therefore, extensibility is a key focus for us. A few years ago, we rearchitected kube-scheduler with the [Scheduling Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), which offers flexible extensibility for users to implement various scheduling needs through plugins. This allows maintainers to focus on the core scheduling features and the framework runtime.

Another major issue is maintaining sufficient scheduling throughput. Typically, a Kubernetes cluster has only one kube-scheduler, so its throughput directly affects the overall scheduling scalability and, consequently, the cluster's scalability. Although we have an internal performance test ([scheduler_perf](https://github.com/kubernetes/kubernetes/tree/master/test/integration/scheduler_perf)), unfortunately, we sometimes overlook performance degradation in less common scenarios. It’s difficult because even small changes that look irrelevant to performance can lead to degradation.

**AP: What are some upcoming goals or initiatives for SIG Scheduling? How do you envision the SIG evolving in the future?**

**KN**: Our primary goal is always to build and maintain an _extensible_ and _stable_ scheduling runtime, and I bet this goal will remain unchanged forever.

As already mentioned, extensibility is key to solving the challenge of the diverse needs of scheduling. Rather than trying to support every different use case directly in kube-scheduler, we will continue to focus on enhancing extensibility so that it can accommodate various use cases. The [kube-scheduler-wasm-extension](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension) project that I mentioned is also part of this initiative.

Regarding stability, introducing new optimizations like QueueingHint is one of our strategies. Additionally, maintaining throughput is also a crucial goal for the future. We’re planning to enhance our throughput monitoring ([ref](https://github.com/kubernetes/kubernetes/issues/124774)), so that we can notice as much degradation as possible on our own before releasing. But, realistically, we can’t cover every possible scenario. We highly appreciate any attention the community can give to scheduling throughput and encourage feedback and alerts regarding performance issues!

## Closing Remarks

**AP: Finally, what message would you like to convey to those who are interested in learning more about SIG Scheduling?**

**KN**: Scheduling is one of the most complicated areas in Kubernetes, and you may find it difficult at first. But, as I shared earlier, you can find many opportunities to contribute, and many maintainers are willing to help you understand things. We know your unique perspective and skills are what make our open source community so powerful :)

Feel free to reach out to us in Slack ([#sig-scheduling](https://kubernetes.slack.com/archives/C09TP78DV)) or at our [meetings](https://github.com/kubernetes/community/blob/master/sig-scheduling/README.md#meetings). I hope this article interests everyone and that we will see new contributors!

**AP: Thank you so much for taking the time to do this! I'm confident that many will find this information invaluable for understanding more about SIG Scheduling and for contributing to the SIG.**

content/en/docs/concepts/architecture/garbage-collection.md

Lines changed: 4 additions & 2 deletions
@@ -95,8 +95,10 @@ to learn more.
 ### Background cascading deletion {#background-deletion}
 
 In background cascading deletion, the Kubernetes API server deletes the owner
-object immediately and the controller cleans up the dependent objects in
-the background. By default, Kubernetes uses background cascading deletion unless
+object immediately and the garbage collector controller (custom or default)
+cleans up the dependent objects in the background.
+If a finalizer exists, it ensures that objects are not deleted until all necessary clean-up tasks are completed.
+By default, Kubernetes uses background cascading deletion unless
 you manually use foreground deletion or choose to orphan the dependent objects.
 
 See [Use background cascading deletion](/docs/tasks/administer-cluster/use-cascading-deletion/#use-background-cascading-deletion)

content/en/docs/concepts/scheduling-eviction/topology-spread-constraints.md

Lines changed: 2 additions & 2 deletions
@@ -477,8 +477,8 @@ There are some implicit conventions worth noting here:
 - Only the Pods holding the same namespace as the incoming Pod can be matching candidates.
 
-- The scheduler bypasses any nodes that don't have any `topologySpreadConstraints[*].topologyKey`
-  present. This implies that:
+- The scheduler only considers nodes that have all `topologySpreadConstraints[*].topologyKey` present at the same time.
+  Nodes missing any of these `topologyKeys` are bypassed. This implies that:
 
 1. any Pods located on those bypassed nodes do not impact `maxSkew` calculation - in the
    above [example](#example-conflicting-topologyspreadconstraints), suppose the node `node1`
