Commit cd83013

tzifudzi authored and knabben committed
blog: Add Windows Operational Readiness blog article
1 parent 4ba73b9 commit cd83013

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
---
layout: blog
title: "Introducing the Windows Operational Readiness Specification"
date: 2024-04-03
slug: intro-windows-ops-readiness
---

**Authors:** Jay Vyas (Tesla), Amim Knabben (Broadcom), and Tatenda Zifudzi (AWS)

Since Windows support [graduated to stable](/blog/2019/03/25/kubernetes-1-14-release-announcement/)
with Kubernetes 1.14 in 2019, the capability to run Windows workloads has been much
appreciated by the end user community. The level and availability of Windows workload
support has consistently been a major differentiator for Kubernetes distributions used by
large enterprises. However, with more Windows workloads being migrated to Kubernetes
and new Windows features being continuously released, it became challenging to test
Windows worker nodes in an effective and standardized way.

The Kubernetes project values being able to certify conformance without a
closed-source license, so that a distribution or service with no intention
of offering Windows can still be certified.

Some notable examples of Windows testing gaps brought to the attention of SIG Windows were:

- An issue with load balancer source address ranges functionality not operating correctly on
  Windows nodes, detailed in a GitHub issue:
  [kubernetes/kubernetes#120033](https://github.com/kubernetes/kubernetes/issues/120033).
- Reports of functionality issues with Windows features, such as
  [GMSA](https://learn.microsoft.com/en-us/windows-server/security/group-managed-service-accounts/group-managed-service-accounts-overview) not working with containerd,
  discussed in [microsoft/Windows-Containers#44](https://github.com/microsoft/Windows-Containers/issues/44).
- Challenges developing network policy tests that could objectively evaluate
  Container Network Interface (CNI) plugins across different operating system configurations,
  as discussed in [kubernetes/kubernetes#97751](https://github.com/kubernetes/kubernetes/issues/97751).

SIG Windows therefore recognized the need for a tailored solution to ensure Windows
nodes' operational readiness *before* their deployment into production environments.
Thus, the idea to develop a [Windows Operational Readiness Specification](https://kep.k8s.io/2578)
was born.

## Can’t we just run the official Conformance tests?

The Kubernetes project contains a set of [conformance tests](https://www.cncf.io/training/certification/software-conformance/#how),
which are standardized tests designed to ensure that a Kubernetes cluster meets
the required Kubernetes specifications.

However, these tests were originally defined at a time when Linux was the *only*
operating system compatible with Kubernetes, and thus they were not easily
extendable for use with Windows. Given that Windows workloads, despite their
importance, account for a smaller portion of the Kubernetes community, it was
important to ensure that the primary conformance suite, relied upon by many
Kubernetes distributions to certify Linux conformance, didn't become encumbered
with Windows-specific features or enhancements such as GMSA or multi-operating
system kube-proxy behavior.

Therefore, since there was a specialized need for Windows conformance testing,
SIG Windows went down the path of offering Windows-specific conformance tests
through the Windows Operational Readiness Specification.

## Can’t we just run the Kubernetes end-to-end test suite?

In the Linux world, tools such as [Sonobuoy](https://sonobuoy.io/) simplify execution of the
conformance suite, relieving users from needing to be aware of Kubernetes'
compilation paths or the semantics of [Ginkgo](https://onsi.github.io/ginkgo) tags.

We realized that Windows users would find the process of compiling and running
the Kubernetes e2e suite from scratch just as undesirable, so there was a clear
need to provide a user-friendly, "push-button" solution that is ready to go.
Moreover, applying conformance tests to Windows nodes through a set of
[Ginkgo](https://onsi.github.io/ginkgo/) tags would also be burdensome for
any user, Linux enthusiasts and experienced Windows system admins alike.

To bridge the gap and give users a straightforward way to confirm their clusters
support a variety of features, SIG Windows found it necessary to create the
Windows Operational Readiness application. This application, written in Go,
simplifies the process of running the necessary Windows-specific tests
while delivering results in a clear, accessible format.

This initiative has been a collaborative effort, with contributions from different
cloud providers and platforms, including Amazon, Microsoft, SUSE, and Broadcom.

## A closer look at the Windows Operational Readiness Specification {#specification}

The Windows Operational Readiness specification specifically targets and executes
tests found within the Kubernetes repository in a more user-friendly way than
simply targeting [Ginkgo](https://onsi.github.io/ginkgo/) tags. It introduces a
structured test suite that is split into sets of core and extended tests, with
each set containing categories that target a specific area, such as networking.
Core tests target fundamental and critical functionalities that Windows nodes
should support as defined by the Kubernetes specification. Extended tests, on
the other hand, cover more complex features and dive deeper into
Windows-specific capabilities such as integrations with Active Directory.
The goal of the extended tests is to be extensive, covering a wide array of
Windows-specific capabilities to ensure compatibility with a diverse set of
workloads and configurations, extending beyond basic requirements. Below is
the current list of categories.

| Category Name            | Category Description |
|--------------------------|----------------------|
| `Core.Network`           | Tests minimal networking functionality (ability to access a pod by its pod IP). |
| `Core.Storage`           | Tests minimal storage functionality (ability to mount a hostPath storage volume). |
| `Core.Scheduling`        | Tests minimal scheduling functionality (ability to schedule a pod with CPU limits). |
| `Core.Concurrent`        | Tests minimal concurrent functionality (ability of a node to handle traffic to multiple pods concurrently). |
| `Extend.HostProcess`     | Tests features related to Windows HostProcess pod functionality. |
| `Extend.ActiveDirectory` | Tests features related to Active Directory functionality. |
| `Extend.NetworkPolicy`   | Tests features related to Network Policy functionality. |
| `Extend.Network`         | Tests advanced networking functionality (ability to support IPv6). |
| `Extend.Worker`          | Tests features related to Windows worker node functionality (ability for nodes to access TCP and UDP services in the same cluster). |

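
As a rough illustration of how the category names encode the core/extended split, the following sketch groups the names from the table by their prefix. The helper functions `category_tier` and `list_tier` are hypothetical names invented for this example; they are not part of the Windows Operational Readiness tool.

```shell
#!/usr/bin/env bash
# Category names copied from the table above.
CATEGORIES="Core.Network Core.Storage Core.Scheduling Core.Concurrent Extend.HostProcess Extend.ActiveDirectory Extend.NetworkPolicy Extend.Network Extend.Worker"

# category_tier NAME: print "core" or "extended" based on the name prefix.
# Hypothetical helper for illustration only.
category_tier() {
  case "$1" in
    Core.*)   echo "core" ;;
    Extend.*) echo "extended" ;;
    *)        echo "unknown"; return 1 ;;
  esac
}

# list_tier TIER: print every category that belongs to TIER, one per line.
list_tier() {
  for c in $CATEGORIES; do
    if [ "$(category_tier "$c")" = "$1" ]; then
      echo "$c"
    fi
  done
}

# Print the four core categories.
list_tier core
```

Calling `list_tier extended` instead prints the five `Extend.*` categories.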
## How to conduct operational readiness tests for Windows nodes

To run the Windows Operational Readiness test suite, refer to the test suite's
[`README`](https://github.com/kubernetes-sigs/windows-operational-readiness/blob/main/README.md), which explains how to set it up and run it. The test suite offers
flexibility in how you can execute tests, either using a compiled binary or a
Sonobuoy plugin. You can also choose to run the entire test suite or only a
list of categories that you specify. Cloud providers have the choice of
uploading their conformance results, enhancing transparency and reliability.

Once you have checked out the code and set it up as the README describes, you
can run a test. For example, this sample command runs the tests from the
`Core.Concurrent` category:

```shell
./op-readiness --kubeconfig $KUBE_CONFIG --category Core.Concurrent
```
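
To run several categories in sequence, one option is a small wrapper around the command above. This is a minimal sketch, not part of the project: the `OP_READINESS` and `DRY_RUN` variables and the `run_category` and `run_categories` functions are names made up for this example, the kubeconfig default is an assumption, and only the `--kubeconfig` and `--category` flags shown above are used. The binary is invoked once per category, so the sketch does not assume anything about how the tool handles multiple categories in one call. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper for illustration; variable and function names are
# invented for this sketch and are not part of op-readiness.
OP_READINESS="${OP_READINESS:-./op-readiness}"
# Assumed default location of the kubeconfig file.
KUBE_CONFIG="${KUBE_CONFIG:-$HOME/.kube/config}"
DRY_RUN="${DRY_RUN:-1}"

# run_category NAME: run (or, in dry-run mode, print) one invocation.
run_category() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$OP_READINESS --kubeconfig $KUBE_CONFIG --category $1"
  else
    "$OP_READINESS" --kubeconfig "$KUBE_CONFIG" --category "$1"
  fi
}

# run_categories NAME...: run each category in turn; return non-zero
# if any invocation failed, but keep going so every category is attempted.
run_categories() {
  failed=0
  for category in "$@"; do
    run_category "$category" || failed=1
  done
  return "$failed"
}

run_categories Core.Network Core.Storage Core.Scheduling Core.Concurrent
```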

As a contributor to Kubernetes, if you want to test the changes in a specific pull
request using the Windows Operational Readiness Specification, post the following bot
command as a comment on that pull request.

```shell
/test operational-tests-capz-windows-2019
```

## Looking ahead

We’re looking to improve our curated list of Windows-specific tests by adding
new tests to the Kubernetes repository and also identifying existing test cases
that can be targeted. The long-term goal for the specification is to continually
enhance test coverage for Windows worker nodes and improve the robustness of
Windows support, facilitating a seamless experience across diverse cloud
environments. We also have plans to integrate the Windows Operational Readiness
tests into the official Kubernetes conformance suite.

If you are interested in helping us out, please reach out to us! We welcome help
in any form, from giving one-off feedback, to making a code contribution,
to becoming a long-term owner who helps us drive changes. The Windows Operational
Readiness specification is owned by the SIG Windows team. You can reach out
to the team on the [Kubernetes Slack workspace](https://slack.k8s.io/) **#sig-windows**
channel. You can also explore the [Windows Operational Readiness test suite](https://github.com/kubernetes-sigs/windows-operational-readiness/#readme)
and make contributions directly to the GitHub repository.

Special thanks to Kulwant Singh (AWS), Pramita Gautam Rana (VMware), and Xinqi Li
(Google) for their notable contributions to the specification. Additionally,
appreciation goes to James Sturtevant (Microsoft), Mark Rossetti (Microsoft),
Claudiu Belu (Cloudbase Solutions), and Aravindh Puthiyaparambil
(Softdrive Technologies Group Inc.) from the SIG Windows team for their guidance and support.
