
Commit c8e0d11

Merge pull request #44560 from sftim/20231228_merge_main
Merge main branch into dev-1.30
2 parents 06e29d6 + 4cb247b commit c8e0d11


632 files changed

+12943
-2638
lines changed


content/en/blog/_posts/2021-10-05-nsa-cisa-hardening.md

Lines changed: 9 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,7 @@ and are in no way a direct recommendation from the Kubernetes community or autho
1717

1818
USA's National Security Agency (NSA) and the Cybersecurity and Infrastructure
1919
Security Agency (CISA)
20-
released, "[Kubernetes Hardening Guidance](https://media.defense.gov/2021/Aug/03/2002820425/-1/-1/1/CTR_KUBERNETES%20HARDENING%20GUIDANCE.PDF)"
20+
released Kubernetes Hardening Guidance
2121
on August 3rd, 2021. The guidance details threats to Kubernetes environments
2222
and provides secure configuration guidance to minimize risk.
2323

@@ -29,6 +29,14 @@ _Note_: This blog post is not a substitute for reading the guide. Reading the pu
2929
guidance is recommended before proceeding as the following content is
3030
complementary.
3131

32+
{{% pageinfo color="primary" %}}
33+
**Update, November 2023:**
34+
35+
The National Security Agency (NSA) and the Cybersecurity and Infrastructure Security Agency (CISA) released the 1.0 version of the Kubernetes hardening guide in August 2021 and updated it based on industry feedback in March 2022 (version 1.1).
36+
37+
The most recent version of the Kubernetes hardening guidance was released in August 2022 with corrections and clarifications. Version 1.2 outlines a number of recommendations for [hardening Kubernetes clusters](https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF).
38+
{{% /pageinfo %}}
39+
3240
## Introduction and Threat Model
3341

3442
Note that the threats identified as important by the NSA/CISA, or the intended audience of this guidance, may be different from the threats that other enterprise users of Kubernetes consider important. This section
Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
1+
---
2+
layout: blog
3+
title: "Contextual logging in Kubernetes 1.29: Better troubleshooting and enhanced logging"
4+
slug: contextual-logging-in-kubernetes-1-29
5+
date: 2023-12-20T09:30:00-08:00
6+
canonicalUrl: https://www.kubernetes.dev/blog/2023/12/20/contextual-logging/
7+
---
8+
9+
**Authors**: [Mengjiao Liu](https://github.com/mengjiao-liu/) (DaoCloud), [Patrick Ohly](https://github.com/pohly) (Intel)
10+
11+
On behalf of the [Structured Logging Working Group](https://github.com/kubernetes/community/blob/master/wg-structured-logging/README.md)
12+
and [SIG Instrumentation](https://github.com/kubernetes/community/tree/master/sig-instrumentation#readme),
13+
we are pleased to announce that the contextual logging feature
14+
introduced in Kubernetes v1.24 has now been adopted by
15+
two components (kube-scheduler and kube-controller-manager)
16+
as well as some directories. This feature aims to provide more useful logs
17+
for better troubleshooting of Kubernetes and to empower developers to enhance Kubernetes.
18+
19+
## What is contextual logging?
20+
21+
[Contextual logging](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/3077-contextual-logging)
22+
is based on the [go-logr](https://github.com/go-logr/logr#a-minimal-logging-api-for-go) API.
23+
The key idea is that libraries are passed a logger instance by their caller
24+
and use that for logging instead of accessing a global logger.
25+
The binary decides the logging implementation, not the libraries.
26+
The go-logr API is designed around structured logging and supports attaching
27+
additional information to a logger.
28+
29+
This enables additional use cases:
30+
31+
- The caller can attach additional information to a logger:
32+
- [WithName](<https://pkg.go.dev/github.com/go-logr/logr#Logger.WithName>) adds a "logger" key whose value is the names joined by dots
33+
- [WithValues](<https://pkg.go.dev/github.com/go-logr/logr#Logger.WithValues>) adds key/value pairs
34+
35+
When this extended logger is passed into a function that uses it
36+
instead of the global logger, the additional information is then included
37+
in all log entries, without having to modify the code that generates the log entries.
38+
This is useful in highly parallel applications where it can become hard to identify
39+
all log entries for a certain operation, because the output from different operations gets interleaved.
40+
41+
- When running unit tests, log output can be associated with the current test.
42+
Then, when a test fails, only the log output of the failed test gets shown by `go test`.
43+
That output can also be more verbose by default because it will not get shown for successful tests.
44+
Tests can be run in parallel without interleaving their output.
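
The first use case above can be sketched in plain Go. This is a minimal illustration, not the go-logr implementation: the `Logger` type, its `Format` method, and the `bind` function are hypothetical stand-ins written with the standard library only, and the sample values are borrowed from the scheduler log line shown later in this post.

```go
package main

import (
	"fmt"
	"strings"
)

// Logger is a hypothetical, minimal stand-in for a go-logr style logger.
// It is immutable: WithName and WithValues return extended copies.
type Logger struct {
	name   string
	values []any
}

// WithName appends a name segment; segments are joined with a dot.
func (l Logger) WithName(name string) Logger {
	if l.name != "" {
		name = l.name + "." + name
	}
	return Logger{name: name, values: l.values}
}

// WithValues returns a copy that carries additional key/value pairs.
func (l Logger) WithValues(kv ...any) Logger {
	merged := append(append([]any{}, l.values...), kv...)
	return Logger{name: l.name, values: merged}
}

// Format renders one entry: message, logger name, then all key/value pairs.
func (l Logger) Format(msg string, kv ...any) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%q", msg)
	if l.name != "" {
		fmt.Fprintf(&b, " logger=%q", l.name)
	}
	all := append(append([]any{}, l.values...), kv...)
	for i := 0; i+1 < len(all); i += 2 {
		fmt.Fprintf(&b, " %v=%q", all[i], fmt.Sprint(all[i+1]))
	}
	return b.String()
}

// bind is a library-style function: it logs through the logger it is given
// and knows nothing about the name or values the caller attached.
func bind(logger Logger, node string) string {
	return logger.Format("Attempting to bind pod to node", "node", node)
}

func main() {
	logger := Logger{}.WithName("Bind").WithName("DefaultBinder").
		WithValues("pod", "kube-system/coredns-69cbfb9798-ms4pq")
	fmt.Println(bind(logger, "127.0.0.1"))
}
```

The point of the sketch is that `bind` never mentions the pod or the logger name, yet both appear in its output because the caller attached them.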
45+
46+
One of the design decisions for contextual logging was to allow attaching a logger as a value to a `context.Context`.
47+
Since the logger encapsulates all aspects of the intended logging for the call,
48+
it is *part* of the context, and not just *using* it. A practical advantage is that many APIs
49+
already have a `ctx` parameter or can add one. This provides additional advantages, like being able to
50+
get rid of `context.TODO()` calls inside the functions.
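
A rough sketch of that design, using only the Go standard library, might look like the following. The names `Logger`, `NewContext`, `FromContext`, `syncOnce`, and `describeSync` are assumptions made for this illustration; go-logr and klog ship their own `NewContext` and `FromContext` helpers, which differ in detail (for example, in what they return when no logger was attached).

```go
package main

import (
	"context"
	"fmt"
)

// Logger is a hypothetical stand-in for a structured logger; only the
// name matters for this sketch.
type Logger struct{ name string }

// Describe reports which logger produced a message.
func (l Logger) Describe(msg string) string {
	return fmt.Sprintf("%q logger=%q", msg, l.name)
}

// contextKey is unexported so that no other package can collide with it.
type contextKey struct{}

// NewContext returns a copy of ctx that carries logger.
func NewContext(ctx context.Context, logger Logger) context.Context {
	return context.WithValue(ctx, contextKey{}, logger)
}

// FromContext returns the logger stored in ctx, or a fallback logger
// named "global" when the caller did not attach one.
func FromContext(ctx context.Context) Logger {
	if logger, ok := ctx.Value(contextKey{}).(Logger); ok {
		return logger
	}
	return Logger{name: "global"}
}

// syncOnce only needs a ctx parameter; the logger travels inside it.
func syncOnce(ctx context.Context) string {
	return FromContext(ctx).Describe("syncing")
}

// describeSync builds a context, with or without a logger, and calls syncOnce.
func describeSync(controller string) string {
	ctx := context.Background()
	if controller != "" {
		ctx = NewContext(ctx, Logger{name: controller})
	}
	return syncOnce(ctx)
}

func main() {
	fmt.Println(describeSync("garbage-collector-controller"))
	fmt.Println(describeSync(""))
}
```

Because `syncOnce` receives its context from the caller, it has no reason to create one with `context.TODO()`.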
51+
52+
## How to use it
53+
54+
The contextual logging feature is alpha starting from Kubernetes v1.24,
55+
so it requires the `ContextualLogging` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled.
56+
If you want to test the feature while it is alpha, you need to enable this feature gate
57+
on the `kube-controller-manager` and the `kube-scheduler`.
58+
59+
For the `kube-scheduler`, there is one thing to note: in addition to enabling
60+
the `ContextualLogging` feature gate, instrumentation also depends on log verbosity.
61+
To avoid slowing down the scheduler with the logging instrumentation for contextual logging added for 1.29,
62+
it is important to choose carefully when to add additional information:
63+
- At `-v3` or lower, only `WithValues("pod")` is used once per scheduling cycle.
64+
This has the intended effect that all log messages for the cycle include the pod information.
65+
Once contextual logging is GA, "pod" key/value pairs can be removed from all log calls.
66+
- At `-v4` or higher, richer log entries get produced where `WithValues` is also used for the node (when applicable)
67+
and `WithName` is used for the current operation and plugin.
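
The verbosity-dependent instrumentation described in the two bullets above can be pictured with a small, self-contained sketch. It is not the scheduler's code: the `Logger` type, its `Enabled` method, and the `forCycle` function are hypothetical, and the threshold of 4 is taken from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// Logger is a hypothetical stand-in for a contextual logger.
type Logger struct {
	verbosity int      // configured -v level
	names     []string // WithName segments
	values    []string // rendered key=value pairs
}

// Enabled reports whether messages at the given level would be emitted.
func (l Logger) Enabled(level int) bool { return l.verbosity >= level }

// WithName returns a copy with one more name segment.
func (l Logger) WithName(name string) Logger {
	l.names = append(append([]string{}, l.names...), name)
	return l
}

// WithValues returns a copy with one more key/value pair.
func (l Logger) WithValues(key, value string) Logger {
	l.values = append(append([]string{}, l.values...), fmt.Sprintf("%s=%q", key, value))
	return l
}

// String renders the context that would be added to every log entry.
func (l Logger) String() string {
	parts := []string{}
	if len(l.names) > 0 {
		parts = append(parts, fmt.Sprintf("logger=%q", strings.Join(l.names, ".")))
	}
	return strings.Join(append(parts, l.values...), " ")
}

// forCycle prepares the logger for one scheduling cycle.
func forCycle(base Logger, pod, node, operation, plugin string) Logger {
	logger := base.WithValues("pod", pod) // once per cycle, at every verbosity
	if base.Enabled(4) {
		// Richer context is only built when debug output is requested.
		logger = logger.WithValues("node", node).WithName(operation).WithName(plugin)
	}
	return logger
}

func main() {
	for _, v := range []int{3, 4} {
		logger := forCycle(Logger{verbosity: v}, "kube-system/coredns", "127.0.0.1", "Bind", "DefaultBinder")
		fmt.Printf("-v=%d: %s\n", v, logger)
	}
}
```

Guarding the extra `WithValues` and `WithName` calls keeps the cost at production verbosity down to a single call per cycle.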
68+
69+
Here is an example that demonstrates the effect:
70+
> I1113 08:43:37.029524 87144 default_binder.go:53] "Attempting to bind pod to node" **logger="Bind.DefaultBinder"** **pod**="kube-system/coredns-69cbfb9798-ms4pq" **node**="127.0.0.1"
71+
72+
The immediate benefit is that the operation and plugin name are visible in `logger`.
73+
`pod` and `node` are already logged as parameters in individual log calls in `kube-scheduler` code.
74+
Once contextual logging is supported by more packages outside of `kube-scheduler`,
75+
they will also be visible there (for example, client-go). Once it is GA,
76+
log calls can be simplified to avoid repeating those values.
77+
78+
In `kube-controller-manager`, `WithName` is used to add the user-visible controller name to log output,
79+
for example:
80+
81+
> I1113 08:43:29.284360 87141 graph_builder.go:285] "garbage controller monitor not synced: no monitors" **logger="garbage-collector-controller"**
82+
83+
The `logger="garbage-collector-controller"` was added by the `kube-controller-manager` core
84+
when instantiating that controller and appears in all of its log entries - at least as long as the code
85+
that it calls supports contextual logging. Further work is needed to convert shared packages like client-go.
86+
87+
## Performance impact
88+
89+
Supporting contextual logging in a package, i.e. accepting a logger from a caller, is cheap.
90+
No performance impact was observed for the `kube-scheduler`. As noted above,
91+
adding `WithName` and `WithValues` needs to be done more carefully.
92+
93+
In Kubernetes 1.29, enabling contextual logging at production verbosity (`-v3` or lower)
94+
caused no measurable slowdown for the `kube-scheduler` and is not expected for the `kube-controller-manager` either.
95+
At debug levels, a 28% slowdown for some test cases is still reasonable given that the resulting logs make debugging easier.
96+
For details, see the [discussion around promoting the feature to beta](https://github.com/kubernetes/enhancements/pull/4219#issuecomment-1807811995).
97+
98+
## Impact on downstream users
99+
Log output is not part of the Kubernetes API and changes regularly in each release,
100+
whether it is because developers work on the code or because of the ongoing conversion
101+
to structured and contextual logging.
102+
103+
If downstream users have dependencies on specific logs,
104+
they need to be aware of how this change affects them.
105+
106+
## Further reading
107+
108+
- Read the [Contextual Logging in Kubernetes 1.24](https://www.kubernetes.dev/blog/2022/05/25/contextual-logging/) article.
109+
- Read the [KEP-3077: contextual logging](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/3077-contextual-logging).
110+
111+
## Get involved
112+
113+
If you're interested in getting involved, we always welcome new contributors to join us.
114+
Contextual logging provides a fantastic opportunity for you to contribute to Kubernetes development and make a meaningful impact.
115+
By joining the [Structured Logging WG](https://github.com/kubernetes/community/tree/master/wg-structured-logging),
116+
you can actively participate in the development of Kubernetes and make your first contribution.
117+
It's a great way to learn and engage with the community while gaining valuable experience.
118+
119+
We encourage you to explore the repository and familiarize yourself with the ongoing discussions and projects.
120+
It's a collaborative environment where you can exchange ideas, ask questions, and work together with other contributors.
121+
122+
If you have any questions or need guidance, don't hesitate to reach out to us
123+
on our [public Slack channel](https://kubernetes.slack.com/messages/wg-structured-logging).
124+
If you're not already part of that Slack workspace, you can visit [https://slack.k8s.io/](https://slack.k8s.io/)
125+
for an invitation.
126+
127+
We would like to express our gratitude to all the contributors who provided excellent reviews,
128+
shared valuable insights, and assisted in the implementation of this feature (in alphabetical order):
129+
130+
- Aldo Culquicondor ([alculquicondor](https://github.com/alculquicondor))
131+
- Andy Goldstein ([ncdc](https://github.com/ncdc))
132+
- Feruzjon Muyassarov ([fmuyassarov](https://github.com/fmuyassarov))
133+
- Freddie ([freddie400](https://github.com/freddie400))
134+
- JUN YANG ([yangjunmyfm192085](https://github.com/yangjunmyfm192085))
135+
- Kante Yin ([kerthcet](https://github.com/kerthcet))
136+
- Kiki ([carlory](https://github.com/carlory))
137+
- Lucas Severo Alves ([knelasevero](https://github.com/knelasevero))
138+
- Maciej Szulik ([soltysh](https://github.com/soltysh))
139+
- Mengjiao Liu ([mengjiao-liu](https://github.com/mengjiao-liu))
140+
- Naman Lakhwani ([Namanl2001](https://github.com/Namanl2001))
141+
- Oksana Baranova ([oxxenix](https://github.com/oxxenix))
142+
- Patrick Ohly ([pohly](https://github.com/pohly))
143+
- songxiao-wang87 ([songxiao-wang87](https://github.com/songxiao-wang87))
144+
- Tim Allclair ([tallclair](https://github.com/tallclair))
145+
- ZhangYu ([Octopusjust](https://github.com/Octopusjust))
146+
- Ziqi Zhao ([fatsheep9146](https://github.com/fatsheep9146))
147+
- Zac ([249043822](https://github.com/249043822))

content/en/case-studies/vsco/index.html

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
1818

1919
<h2>Challenge</h2>
2020

21-
<p>After moving from <a href="https://www.rackspace.com/">Rackspace</a> to <a href="https://aws.amazon.com/">AWS</a> in 2015, <a href="https://vsco.co/">VSCO</a> began building <a href="https://nodejs.org/en/">Node.js</a> and <a href="https://golang.org/">Go</a> microservices in addition to running its <a href="http://php.net/">PHP monolith</a>. The team containerized the microservices using <a href="https://www.docker.com/">Docker</a>, but "they were all in separate groups of <a href="https://aws.amazon.com/ec2/">EC2</a> instances that were dedicated per service," says Melinda Lu, Engineering Manager for the Machine Learning Team. Adds Naveen Gattu, Senior Software Engineer on the Community Team: "That yielded a lot of wasted resources. We started looking for a way to consolidate and be more efficient in the AWS EC2 instances."</p>
21+
<p>After moving from <a href="https://www.rackspace.com/">Rackspace</a> to <a href="https://aws.amazon.com/">AWS</a> in 2015, <a href="https://vsco.co/">VSCO</a> began building <a href="https://nodejs.org/en/">Node.js</a> and <a href="https://go.dev/">Go</a> microservices in addition to running its <a href="http://php.net/">PHP monolith</a>. The team containerized the microservices using <a href="https://www.docker.com/">Docker</a>, but "they were all in separate groups of <a href="https://aws.amazon.com/ec2/">EC2</a> instances that were dedicated per service," says Melinda Lu, Engineering Manager for the Machine Learning Team. Adds Naveen Gattu, Senior Software Engineer on the Community Team: "That yielded a lot of wasted resources. We started looking for a way to consolidate and be more efficient in the AWS EC2 instances."</p>
2222

2323
<h2>Solution</h2>
2424

content/en/docs/concepts/cluster-administration/addons.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -79,6 +79,9 @@ installation instructions. The list does not try to be exhaustive.
7979
Pods and non-Kubernetes environments with visibility and security monitoring.
8080
* [Romana](https://github.com/romana) is a Layer 3 networking solution for pod
8181
networks that also supports the [NetworkPolicy](/docs/concepts/services-networking/network-policies/) API.
82+
* [Spiderpool](https://github.com/spidernet-io/spiderpool) is an underlay and RDMA
83+
networking solution for Kubernetes. Spiderpool is supported on bare metal, virtual machines,
84+
and public cloud environments.
8285
* [Weave Net](https://www.weave.works/docs/net/latest/kubernetes/kube-addon/)
8386
provides networking and network policy, will carry on working on both sides
8487
of a network partition, and does not require an external database.

content/en/docs/concepts/configuration/manage-resources-containers.md

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -116,8 +116,13 @@ runs on a single-core, dual-core, or 48-core machine.
116116

117117
{{< note >}}
118118
Kubernetes doesn't allow you to specify CPU resources with a precision finer than
119-
`1m`. Because of this, it's useful to specify CPU units less than `1.0` or `1000m` using
120-
the milliCPU form; for example, `5m` rather than `0.005`.
119+
`1m` or `0.001` CPU. To avoid accidentally using an invalid CPU quantity, it's useful to specify CPU units using the milliCPU form
120+
instead of the decimal form when using less than 1 CPU unit.
121+
122+
For example, suppose you have a Pod that uses `5m` or `0.005` CPU and would like to decrease
123+
its CPU resources. By using the decimal form, it's harder to spot that `0.0005` CPU
124+
is an invalid value, while by using the milliCPU form, it's easier to spot that
125+
`0.5m` is an invalid value.
121126
{{< /note >}}
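
The granularity rule in the note can be checked mechanically. The following Go sketch uses a hypothetical helper, `isWholeMilliCPU`, that only tests whether a quantity is a whole number of milliCPU. It is not the Kubernetes quantity parser, it does no sign or range checking, and how the API server treats finer values is outside its scope.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// isWholeMilliCPU reports whether a CPU quantity written in decimal form
// ("0.005") or milliCPU form ("5m") is a whole number of milliCPU.
// Hypothetical helper for illustration; it is not the Kubernetes parser.
func isWholeMilliCPU(quantity string) bool {
	scale := big.NewRat(1000, 1) // decimal CPU -> milliCPU
	if strings.HasSuffix(quantity, "m") {
		quantity = strings.TrimSuffix(quantity, "m")
		scale = big.NewRat(1, 1) // already in milliCPU
	}
	value, ok := new(big.Rat).SetString(quantity)
	if !ok {
		return false // not a number at all
	}
	// Exact rational arithmetic avoids floating-point rounding surprises.
	return value.Mul(value, scale).IsInt()
}

func main() {
	for _, q := range []string{"5m", "0.005", "0.5m", "0.0005"} {
		fmt.Printf("%-6s whole milliCPU: %v\n", q, isWholeMilliCPU(q))
	}
}
```

Both `0.5m` and `0.0005` fail the check, but only the milliCPU spelling makes the fractional part obvious at a glance.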
122127

123128
### Memory resource units {#meaning-of-memory}

content/en/docs/concepts/security/pod-security-standards.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -272,6 +272,10 @@ fail validation.
272272
<li><code>net.ipv4.tcp_syncookies</code></li>
273273
<li><code>net.ipv4.ping_group_range</code></li>
274274
<li><code>net.ipv4.ip_local_reserved_ports</code> (since Kubernetes 1.27)</li>
275+
<li><code>net.ipv4.tcp_keepalive_time</code> (since Kubernetes 1.29)</li>
276+
<li><code>net.ipv4.tcp_fin_timeout</code> (since Kubernetes 1.29)</li>
277+
<li><code>net.ipv4.tcp_keepalive_intvl</code> (since Kubernetes 1.29)</li>
278+
<li><code>net.ipv4.tcp_keepalive_probes</code> (since Kubernetes 1.29)</li>
275279
</ul>
276280
</td>
277281
</tr>
