# K8s: node selection page edits #1968

170 changes: 60 additions & 110 deletions in content/operate/kubernetes/recommendations/node-selection.md
linkTitle: Node selection
weight: 80
---

Kubernetes clusters often include nodes with different CPU and memory profiles. You control where Redis Enterprise cluster (REC) pods run by setting fields in the REC custom resource.

The REC runs as a StatefulSet. The Kubernetes scheduler places REC pods on nodes when you create or resize the cluster, or when a pod restarts.

Use these options to control pod placement:

## Use node selectors

The [`nodeSelector`]({{<relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec">}}) field matches the Kubernetes [`nodeSelector`](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector) syntax.
Label the nodes you want to target. For example, if nodes 'n1' and 'n2' are labeled with `memory=high`:

```sh
kubectl label nodes n1 memory=high
kubectl label nodes n2 memory=high
```

Then request those nodes in the cluster specification:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  nodeSelector:
    memory: high
```

The operator copies [`nodeSelector`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec" >}}) into the pod spec. The scheduler places pods only on nodes that match the selector.

## Use node pools

Node pools group similar nodes. Providers label nodes by pool.

Use [`nodeSelector`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec" >}}) to target a pool by label. For example, on GKE:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  nodeSelector:
    cloud.google.com/gke-nodepool: 'high-memory'
```

### Provider resources

Cloud providers label nodes by pool. See the links below for provider-specific documentation; a sketch after the list shows an AKS example.

- GKE:
- [Create and manage cluster and node pool labels](https://cloud.google.com/kubernetes-engine/docs/how-to/creating-managing-labels)
- [Update node labels and taints for existing node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/update-existing-nodepools)
- AKS:
- [Use labels in an AKS cluster](https://learn.microsoft.com/en-us/azure/aks/use-labels)
- [Manage node pools in AKS](https://learn.microsoft.com/en-us/azure/aks/manage-node-pools)
- EKS:
- [Create a managed node group with labels (AWS CLI)](https://docs.aws.amazon.com/cli/latest/reference/eks/create-nodegroup.html)
- [Update a managed node group to add labels (AWS CLI)](https://docs.aws.amazon.com/cli/latest/reference/eks/update-nodegroup-config.html)
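
The same pattern works on other providers. A minimal sketch for AKS, assuming a user node pool named `highmem` (AKS adds the `kubernetes.azure.com/agentpool` label to its pool nodes; the pool name here is hypothetical):

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  nodeSelector:
    # AKS labels each node with its pool name; "highmem" is an assumed pool.
    kubernetes.azure.com/agentpool: 'highmem'
```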


## Use node taints

Use node taints and pod tolerations to control REC pod scheduling. Set tolerations with [`spec.podTolerations`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#specpodtolerations" >}}) (standard [Kubernetes tolerations](https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/#concepts)).

Example: on a single node pool, reserve nodes n1–n3 for REC by adding taints:

```sh
kubectl taint nodes n1 db=rec:NoSchedule
kubectl taint nodes n2 db=rec:NoSchedule
kubectl taint nodes n3 db=rec:NoSchedule
```

This blocks pods unless they tolerate the `db=rec` taint.

Then add a matching toleration to the REC:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  podTolerations:
    - key: db
      operator: Equal
      value: rec
      effect: NoSchedule
```

A set of taints can also handle more complex use cases. For example, you can use a `role=test` or `role=dev` taint to dedicate nodes to testing or development workloads through pod tolerations, as in the sketch below.
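
A minimal sketch, assuming nodes tainted with `role=test:NoSchedule` (the taint and cluster name are hypothetical):

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: test-redisenterprisecluster
spec:
  nodes: 3
  podTolerations:
    # Tolerate the role=test taint so these pods can run on test nodes.
    - key: role
      operator: Equal
      value: test
      effect: NoSchedule
```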

## Use pod anti-affinity

By default, REC node pods are not scheduled on the same node within the same cluster:

```yaml
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: redis-enterprise
          redis.io/cluster: example-redisenterprisecluster
          redis.io/role: node
      topologyKey: kubernetes.io/hostname
```

Each pod has these labels. `redis.io/cluster` is your cluster name.

Modify this rule to widen or narrow placement. For example, remove the `redis.io/cluster` label to prevent pods from different clusters from sharing a node:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: redis-enterprise
            redis.io/role: node
        topologyKey: kubernetes.io/hostname
```

To avoid co-locating with other database workloads, label those pods `local/role: database` and add anti-affinity to keep one database per node:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  extraLabels:
    local/role: database
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            local/role: database
        topologyKey: kubernetes.io/hostname
```

Kubernetes will not schedule two pods with label `local/role: database` on the same node.

## Enable rack awareness

Enable rack-zone awareness to improve availability during rack or zone failures.

{{%note%}}When creating your rack-zone ID, there are some constraints to consider; see [rack-zone awareness]({{< relref "/operate/rs/clusters/configure/rack-zone-awareness#rack-zone-id-rules" >}}) for more info. {{%/note%}}

Configure it with [`spec.rackAwarenessNodeLabel`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec" >}}) in the REC.

### Choose a node label

The most common label for rack-zone awareness is `topology.kubernetes.io/zone`, a standard Kubernetes label that shows the zone a node runs in. Many Kubernetes platforms add this label to nodes by default, as noted in the [Kubernetes documentation](https://kubernetes.io/docs/setup/best-practices/multiple-zones/#nodes-are-labeled).

If your platform doesn't set this label automatically, you can use any custom label that describes the node's topology (such as rack, zone, or region), as in the sketch below.
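
A minimal sketch, assuming a hypothetical custom label `example.com/rack` applied to every eligible node:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  # Any node label works, as long as every eligible node carries it.
  rackAwarenessNodeLabel: example.com/rack
```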

### Label all eligible nodes

{{< warning >}}

All eligible nodes **must** have the label for rack awareness to work. The operator requires every node that might run Redis Enterprise pods to be labeled. If any nodes are missing the label, reconciliation fails.
{{< /warning >}}

Eligible nodes are nodes where REC pods can run. By default, this means all worker nodes. You can limit eligibility with [`spec.nodeSelector`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec" >}}).

Give each eligible node a label value that reflects its rack, zone, or region.

Check node label values:

```sh
kubectl get nodes -o custom-columns="name:metadata.name","rack\\zone:metadata.labels.topology\.kubernetes\.io/zone"

name                                        rack\zone
ip-10-0-x-a.eu-central-1.compute.internal   eu-central-1a
ip-10-0-x-b.eu-central-1.compute.internal   eu-central-1a
ip-10-0-x-c.eu-central-1.compute.internal   eu-central-1b
ip-10-0-x-d.eu-central-1.compute.internal   eu-central-1b
```

### Enable the cluster role

Grant the operator read access to node labels with a ClusterRole and ClusterRoleBinding.

ClusterRole:

{{<embed-yaml "k8s/rack_aware_cluster_role.md" "rack-aware-cluster-role.yaml">}}

Bind to the `redis-enterprise-operator` service account:

{{<embed-yaml "k8s/rack_aware_cluster_role_binding.md" "rack-aware-cluster-role-binding.yaml">}}

Apply these files with `kubectl apply`. For example:

```sh
kubectl apply -f rack-aware-cluster-role.yaml
kubectl apply -f rack-aware-cluster-role-binding.yaml
```

After you apply the role and binding, you can configure rack awareness.

### Configure rack awareness

Set [`spec.rackAwarenessNodeLabel`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec" >}}) to the node label to use:

{{<embed-yaml "k8s/rack_aware_rec.md" "rack-aware-cluster.yaml">}}

{{< note >}}
When you set [`spec.rackAwarenessNodeLabel`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#spec" >}}), the operator sets the anti-affinity `topologyKey` to that label unless you define [`spec.podAntiAffinity`]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api#specpodantiaffinity" >}}). If you define both, make sure `topologyKey` matches your node label.
{{< /note >}}
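
A minimal sketch of setting both fields together, assuming the default REC pod labels shown earlier; the cluster name is illustrative:

```yaml
apiVersion: app.redislabs.com/v1
kind: RedisEnterpriseCluster
metadata:
  name: example-redisenterprisecluster
spec:
  nodes: 3
  rackAwarenessNodeLabel: topology.kubernetes.io/zone
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: redis-enterprise
            redis.io/cluster: example-redisenterprisecluster
            redis.io/role: node
        # Must match rackAwarenessNodeLabel when both fields are set.
        topologyKey: topology.kubernetes.io/zone
```
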
9 changes: 7 additions & 2 deletions in content/operate/kubernetes/reference/yaml/rack-awareness.md
This page provides YAML examples for deploying Redis Enterprise with rack awareness.

- Label [Kubernetes nodes](https://kubernetes.io/docs/concepts/architecture/nodes/) with zone information
- Typically uses the standard label `topology.kubernetes.io/zone`
- Verify node labels: `kubectl get nodes -o custom-columns="name:metadata.name","rack\\zone:metadata.labels.topology\.kubernetes\.io/zone"`
- Install the [Redis Enterprise operator]({{< relref "/operate/kubernetes/deployment" >}})

For complete deployment instructions, see [Deploy on Kubernetes]({{< relref "/operate/kubernetes/deployment" >}}).
Rack awareness requires additional permissions to read node labels.
{{<embed-yaml "k8s/rack_aware_cluster_role.md" "rack-aware-cluster-role.yaml">}}

Cluster role configuration:

- `name`: ClusterRole name for rack awareness permissions
- `rules`: Permissions to read nodes and their labels cluster-wide
- `resources`: Access to `nodes` resource for zone label discovery

Key permissions:

- `nodes`: Read access to discover node zone labels
- `get, list, watch`: Monitor node changes and zone assignments

The [ClusterRoleBinding](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) grants these permissions to the operator's service account.
{{<embed-yaml "k8s/rack_aware_cluster_role_binding.md" "rack-aware-cluster-role-binding.yaml">}}

Cluster role binding configuration:

- `subjects.name`: Must match the service account name
- `subjects.namespace`: Namespace where the operator is deployed
- `roleRef.name`: Must match the cluster role name
The rack-aware [REC configuration]({{< relref "/operate/kubernetes/reference/api/redis_enterprise_cluster_api" >}}) sets `spec.rackAwarenessNodeLabel` to spread pods across zones.
{{<embed-yaml "k8s/rack_aware_rec.md" "rack-aware-cluster.yaml">}}

Rack-aware cluster configuration:

- `metadata.name`: Cluster name (cannot be changed after creation)
- `spec.rackAwarenessNodeLabel`: Node label used for zone identification
- `spec.nodes`: Minimum 3 nodes, ideally distributed across zones
Edit the values in the downloaded YAML file based on your environment.

### Common zone labels

Different Kubernetes distributions use different zone labels:

- `Standard`: `topology.kubernetes.io/zone`
- `Legacy`: `failure-domain.beta.kubernetes.io/zone`
- `Custom`: Your organization's specific labeling scheme

Verify the correct label on your nodes:

```bash
kubectl get nodes -o custom-columns="name:metadata.name","rack\\zone:metadata.labels.topology\.kubernetes\.io/zone"
```

## Redis Enterprise database