Commit 6030185

Merge branch 'main' into feature/docs-tools-kubectl-macos

2 parents 9ec08bf + 770b345 commit 6030185

File tree

11 files changed: +867 -281 lines changed

OWNERS_ALIASES

Lines changed: 0 additions & 2 deletions

@@ -167,7 +167,6 @@ aliases:
   - edsoncelio
   - femrtnz
   - jcjesus
-  - rikatz
   - stormqueen1990
   - yagonobre
  sig-docs-pt-reviews: # PR reviews for Portugese content
@@ -176,7 +175,6 @@ aliases:
   - femrtnz
   - jcjesus
   - mrerlison
-  - rikatz
   - stormqueen1990
   - yagonobre
  sig-docs-vi-owners: # Admins for Vietnamese content

content/en/blog/_posts/2022-12-27-cpumanager-goes-GA.md

Lines changed: 1 addition & 1 deletion

@@ -40,7 +40,7 @@ compatible behavior when disabled, and to document how to interact with each oth
 
 This enabled the Kubernetes project to graduate to GA the CPU Manager core component and core CPU allocation algorithms to GA,
 while also enabling a new age of experimentation in this area.
-In Kubernetes v1.26, the CPU Manager supports [three different policy options](/docs/tasks/administer-cluster/cpu-management-policies.md#static-policy-options):
+In Kubernetes v1.26, the CPU Manager supports [three different policy options](/docs/tasks/administer-cluster/cpu-management-policies#static-policy-options):
 
 `full-pcpus-only`
 : restrict the CPU Manager core allocation algorithm to full physical cores only, reducing noisy neighbor issues from hardware technologies that allow sharing cores.

Lines changed: 223 additions & 0 deletions

@@ -0,0 +1,223 @@
---
layout: blog
title: "Kubernetes 1.27: StatefulSet Start Ordinal Simplifies Migration"
date: 2023-04-28
slug: statefulset-start-ordinal
---

**Author**: Peter Schuurman (Google)

Kubernetes v1.26 introduced a new, alpha-level feature for
[StatefulSets](/docs/concepts/workloads/controllers/statefulset/) that controls
the ordinal numbering of Pod replicas. As of Kubernetes v1.27, this feature is
now beta. Ordinals can start from arbitrary non-negative numbers. This blog post
will discuss how this feature can be used.

## Background

StatefulSet ordinals provide sequential identities for Pod replicas. When using
[`OrderedReady` Pod management](/docs/tutorials/stateful-application/basic-stateful-set/#orderedready-pod-management),
Pods are created from ordinal index `0` up to `N-1`.
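
As a quick illustration (the StatefulSet name `web` and its `app: web` label are
hypothetical, not taken from the demo below), listing the Pods of a three-replica
StatefulSet shows names that end in the ordinals `0`, `1`, and `2`:

```
# A StatefulSet named "web" with replicas: 3 and the default start ordinal
# creates its Pods sequentially as web-0, web-1, web-2.
kubectl get pods -l app=web --sort-by=.metadata.name
```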

With Kubernetes today, orchestrating a StatefulSet migration across clusters is
challenging. Backup and restore solutions exist, but these require the
application to be scaled down to zero replicas prior to migration. In today's
fully connected world, even planned application downtime may not allow you to
meet your business goals. You could use
[Cascading Delete](/docs/tutorials/stateful-application/basic-stateful-set/#cascading-delete)
or
[On Delete](/docs/tutorials/stateful-application/basic-stateful-set/#on-delete)
to migrate individual Pods; however, this is error-prone and tedious to manage.
You lose the self-healing benefit of the StatefulSet controller when your Pods
fail or are evicted.

Kubernetes v1.26 enables a StatefulSet to be responsible for a range of ordinals
within the range {0..N-1} (the ordinals 0, 1, ... up to N-1). With it, you can
scale down the range {0..k-1} in a source cluster, and scale up the complementary
range {k..N-1} in a destination cluster, while maintaining application
availability. This enables you to retain *at most one* semantics (meaning there
is at most one Pod with a given identity running in a StatefulSet) and
[Rolling Update](/docs/tutorials/stateful-application/basic-stateful-set/#rolling-update)
behavior when orchestrating a migration across clusters.
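
For example, with N=6 replicas and k=5, the source cluster keeps ordinals 0
through 4 while the destination cluster takes over ordinal 5. A minimal sketch
of the two complementary patches (the StatefulSet name `web` and the kubectl
contexts `source` and `destination` are assumptions for illustration):

```
# Source cluster: shrink to ordinals {0..4} (replicas: 5, default start ordinal 0).
kubectl --context source patch sts web -p '{"spec": {"replicas": 5}}'

# Destination cluster: take over ordinal 5 (start ordinal 5, replicas: 1).
kubectl --context destination patch sts web -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}'
```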

## Why would I want to use this feature?

Say you're running your StatefulSet in one cluster, and need to migrate it out
to a different cluster. There are many reasons why you would need to do this:

* **Scalability**: Your StatefulSet has scaled too large for your cluster, and
  has started to disrupt the quality of service for other workloads in your
  cluster.
* **Isolation**: You're running a StatefulSet in a cluster that is accessed
  by multiple users, and namespace isolation isn't sufficient.
* **Cluster Configuration**: You want to move your StatefulSet to a different
  cluster to use some environment that is not available on your current
  cluster.
* **Control Plane Upgrades**: You want to move your StatefulSet to a cluster
  running an upgraded control plane, and can't handle the risk or downtime of
  in-place control plane upgrades.

## How do I use it?

Enable the `StatefulSetStartOrdinal` feature gate on a cluster, and create a
StatefulSet with a customized `.spec.ordinals.start`.
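
For example, here is a minimal sketch of a StatefulSet whose replicas are
numbered starting at ordinal 5 (the name `web`, the label, the headless Service
name, and the `nginx` image are illustrative assumptions, not from this post):

```
# Requires the StatefulSetStartOrdinal feature gate on the cluster.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web        # assumes a matching headless Service exists
  replicas: 3
  ordinals:
    start: 5              # Pods are named web-5, web-6, web-7 instead of web-0..web-2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: registry.k8s.io/nginx-slim:0.8
EOF
```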

## Try it out

In this demo, I'll use the new mechanism to migrate a
StatefulSet from one Kubernetes cluster to another. The
[redis-cluster](https://github.com/bitnami/charts/tree/main/bitnami/redis-cluster)
Bitnami Helm chart will be used to install Redis.

Tools Required:

* [yq](https://github.com/mikefarah/yq)
* [helm](https://helm.sh/docs/helm/helm_install/)

### Pre-requisites {#demo-pre-requisites}

To do this, I need two Kubernetes clusters that can both access common
networking and storage; I've named my clusters `source` and `destination`.
Specifically, I need:

* The `StatefulSetStartOrdinal` feature gate enabled on both clusters.
* Client configuration for `kubectl` that lets me access both clusters as an
  administrator (see the context sketch after this list).
* The same `StorageClass` installed on both clusters, and set as the default
  StorageClass for both clusters. This `StorageClass` should provision
  underlying storage that is accessible from either or both clusters.
* A flat network topology that allows Pods to send and receive packets to
  and from Pods in either cluster. If you are creating clusters on a cloud
  provider, this configuration may be called private cloud or private network.
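
A minimal sketch of switching `kubectl` between the two clusters, assuming the
kubeconfig context names match the cluster names used above:

```
# List the contexts available in your kubeconfig.
kubectl config get-contexts

# Point kubectl at the cluster you are operating on before running each step.
kubectl config use-context source
kubectl config use-context destination
```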

1. Create a demo namespace on both clusters:

   ```
   kubectl create ns kep-3335
   ```

2. Deploy a Redis cluster with six replicas in the source cluster:

   ```
   helm repo add bitnami https://charts.bitnami.com/bitnami
   helm install redis --namespace kep-3335 \
     bitnami/redis-cluster \
     --set persistence.size=1Gi \
     --set cluster.nodes=6
   ```

3. Check the replication status in the source cluster:

   ```
   kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \
     "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;"
   ```

   ```
   2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 myself,master - 0 1669764411000 3 connected 10923-16383
   7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669764410000 3 connected
   961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669764411000 1 connected
   7136e37d8864db983f334b85d2b094be47c830e5 10.104.0.15:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669764412595 2 connected
   a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669764411592 1 connected 0-5460
   2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669764410000 2 connected 5461-10922
   ```

4. Deploy a Redis cluster with zero replicas in the destination cluster:

   ```
   helm install redis --namespace kep-3335 \
     bitnami/redis-cluster \
     --set persistence.size=1Gi \
     --set cluster.nodes=0 \
     --set redis.extraEnvVars\[0\].name=REDIS_NODES,redis.extraEnvVars\[0\].value="redis-redis-cluster-headless.kep-3335.svc.cluster.local" \
     --set existingSecret=redis-redis-cluster
   ```

5. Scale down the `redis-redis-cluster` StatefulSet in the source cluster by 1,
   to remove the replica `redis-redis-cluster-5`:

   ```
   kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}'
   ```

6. Migrate dependencies from the source cluster to the destination cluster:

   The following commands copy resources from `source` to `destination`. Details
   that are not relevant in the `destination` cluster are removed (e.g. `uid`,
   `resourceVersion`, `status`).

   **Steps for the source cluster**

   Note: If using a `StorageClass` with `reclaimPolicy: Delete` configured, you
   should patch the PVs in `source` with `reclaimPolicy: Retain` prior to
   deletion to retain the underlying storage used in `destination`. See
   [Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/)
   for more details.
for more details.
156+
157+
```
158+
kubectl get pvc redis-data-redis-redis-cluster-5 -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .status)' > /tmp/pvc-redis-data-redis-redis-cluster-5.yaml
159+
kubectl get pv $(yq '.spec.volumeName' /tmp/pvc-redis-data-redis-redis-cluster-5.yaml) -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .spec.claimRef, .status)' > /tmp/pv-redis-data-redis-redis-cluster-5.yaml
160+
kubectl get secret redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml
161+
```
162+
163+
**Steps for the destination cluster**
164+
165+
Note: For the PV/PVC, this procedure only works if the underlying storage system
166+
that your PVs use can support being copied into `destination`. Storage
167+
that is associated with a specific node or topology may not be supported.
168+
Additionally, some storage systems may store addtional metadata about
169+
volumes outside of a PV object, and may require a more specialized
170+
sequence to import a volume.
171+
172+
```
173+
kubectl create -f /tmp/pv-redis-data-redis-redis-cluster-5.yaml
174+
kubectl create -f /tmp/pvc-redis-data-redis-redis-cluster-5.yaml
175+
kubectl create -f /tmp/secret-redis-redis-cluster.yaml
176+
```
177+
178+
7. Scale up the `redis-redis-cluster` StatefulSet in the destination cluster by
179+
1, with a start ordinal of 5:
180+
181+
```
182+
kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}'
183+
```
184+
185+
8. Check the replication status in the destination cluster:
186+
187+
```
188+
kubectl exec -it redis-redis-cluster-5 -- /bin/bash -c \
189+
"redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;"
190+
```
191+
192+
I should see that the new replica (labeled `myself`) has joined the Redis
193+
cluster (the IP address belongs to a different CIDR block than the
194+
replicas in the source cluster).
195+
196+
```
197+
2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669766684000 2 connected 5461-10922
198+
7136e37d8864db983f334b85d2b094be47c830e5 10.108.0.22:6379@16379 myself,slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669766685609 2 connected
199+
2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 master - 0 1669766684000 3 connected 10923-16383
200+
961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669766683600 1 connected
201+
a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669766685000 1 connected 0-5460
202+
7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669766686613 3 connected
203+
```
204+
205+
9. Repeat steps #5 to #7 for the remainder of the replicas, until the
206+
Redis StatefulSet in the source cluster is scaled to 0, and the Redis
207+
StatefulSet in the destination cluster is healthy with 6 total replicas.
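
   For example, the next iteration migrates `redis-redis-cluster-4`; a sketch of
   the two patches (repeat the PV/PVC copy from step 6 for that replica as needed):

   ```
   # In the source cluster: drop to ordinals {0..3}.
   kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 4}}'

   # In the destination cluster: take over ordinals {4..5}.
   kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 4}, "replicas": 2}}'
   ```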

## What's Next?

This feature provides a building block for a StatefulSet to be split up across
clusters, but does not prescribe the mechanism as to how the StatefulSet should
be migrated. Migration requires coordination of StatefulSet replicas, along with
orchestration of the storage and network layer. This is dependent on the storage
and connectivity requirements of the application installed by the StatefulSet.
Additionally, many StatefulSets are managed by
[operators](/docs/concepts/extend-kubernetes/operator/), which adds another
layer of complexity to migration.

If you're interested in building enhancements to make these processes easier,
get involved with
[SIG Multicluster](https://github.com/kubernetes/community/blob/master/sig-multicluster)
to contribute!
