14 changes: 7 additions & 7 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

42 changes: 21 additions & 21 deletions Cargo.nix

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -39,4 +39,4 @@ walkdir = "2.5.0"

[patch."https://github.com/stackabletech/operator-rs.git"]
# stackable-operator = { path = "../operator-rs/crates/stackable-operator" }
# stackable-operator = { git = "https://github.com/stackabletech//operator-rs.git", branch = "main" }
stackable-operator = { git = "https://github.com/stackabletech//operator-rs.git", branch = "feat/listenerclass-stickiness" }
14 changes: 7 additions & 7 deletions crate-hashes.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 15 additions & 0 deletions deploy/helm/listener-operator/crds/crds.yaml
@@ -82,6 +82,21 @@ spec:
- LoadBalancer
- ClusterIP
type: string
stickyNodePorts:
default: false
description: |-
Whether a Pod exposed using a NodePort should be pinned to a specific Kubernetes node.

By pinning the Pod to a specific (stable) Kubernetes node, stable addresses can be
provided using NodePorts. The stickiness is achieved by listener-operator by setting the
`volume.kubernetes.io/selected-node` annotation on the Listener PVC.

However, this only works on setups with long-living nodes. If your nodes are rotated on
a regular basis, the Pods previously running on a removed node will be stuck in Pending
until you delete the PVC with the stickiness.

Because of this we don't enable stickiness by default to support all environments.
type: boolean
required:
- serviceType
type: object
3 changes: 3 additions & 0 deletions deploy/helm/listener-operator/templates/listener-classes.yaml
@@ -14,6 +14,7 @@ metadata:
name: external-unstable
spec:
serviceType: NodePort
stickyNodePorts: false
---
apiVersion: listeners.stackable.tech/v1alpha1
kind: ListenerClass
@@ -36,6 +37,7 @@ metadata:
name: external-unstable
spec:
serviceType: NodePort
stickyNodePorts: false
---
apiVersion: listeners.stackable.tech/v1alpha1
kind: ListenerClass
@@ -51,6 +53,7 @@ spec:
# or on-premise environments that don't support external LoadBalancer peering (such as Calico (https://docs.tigera.io/calico/latest/networking/configuring/advertise-service-ips)
# or MetalLB (https://metallb.org/)).
serviceType: NodePort
stickyNodePorts: true
{{ else }}
{{ fail "An invalid preset was configured" }}
{{ end }}
4 changes: 2 additions & 2 deletions deploy/helm/listener-operator/values.yaml
@@ -134,11 +134,11 @@ labels:
# Kubelet dir may vary in environments such as microk8s, see https://github.com/stackabletech/secret-operator/issues/229
kubeletDir: /var/lib/kubelet

# Options: none, stable-nodes, ephemeral-nodes
# Options: none, stable-nodes, ephemeral-nodes (default)
# none: No ListenerClasses are preinstalled, administrators must supply them themselves
# stable-nodes: ListenerClasses are preinstalled that are suitable for on-prem/"pet" environments, assuming long-running Nodes but not requiring a LoadBalancer controller
# ephemeral-nodes: ListenerClasses are preinstalled that are suitable for cloud/"cattle" environments with short-lived nodes, however this requires a LoadBalancer controller to be installed
preset: stable-nodes
preset: ephemeral-nodes

# See all available options and detailed explanations about the concept here:
# https://docs.stackable.tech/home/stable/concepts/telemetry/
29 changes: 17 additions & 12 deletions docs/modules/listener-operator/pages/listenerclass.adoc
@@ -48,7 +48,7 @@ The Stackable Data Platform expects these three ListenerClasses to exist:
== Presets

To help users get started, the Stackable Listener Operator ships different ListenerClass _presets_ for different environments.
These are configured using the `preset` Helm value, with `stable-nodes` being the default.
These are configured using the `preset` Helm value, with `ephemeral-nodes` being the default.

=== Installation Commands

@@ -83,21 +83,25 @@ Both `stable-nodes` and `ephemeral-nodes` create the same three ListenerClasses
|ClusterIP

|`external-unstable`
|NodePort
|NodePort
|NodePort (non-sticky)
|NodePort (non-sticky)

|`external-stable`
|NodePort
|NodePort (sticky)
|LoadBalancer
|===

A sticky NodePort pins the Pod to a particular Kubernetes node, so that the endpoint is stable across Pod restarts.
This is achieved via the `volume.kubernetes.io/selected-node` annotation on the Listener PVC.

==== Why the Difference?

* **stable-nodes**: Uses NodePort for external access and pins pods to specific nodes for address stability.
* **stable-nodes**: Uses NodePort for external access and pins external-stable pods to specific nodes for address stability.
+
[CAUTION]
====
This creates a dependency on specific nodes. If a pinned node becomes unavailable, the pod cannot start on other nodes until you either restore the node or manually delete the PVC to allow rescheduling.
This creates a dependency on specific nodes when external-stable is used.
If a pinned node becomes unavailable, the pod cannot start on other nodes until you either restore the node or manually delete the PVC to allow rescheduling.
====
+
.To recover from node failures:
@@ -131,23 +135,24 @@ The key is understanding your environment's requirements.
==== NodePort
* **Use for**: External access (from outside the Kubernetes cluster) in environments with stable nodes
* **Access**: From outside the cluster via `<NodeIP>:<NodePort>`
* **Behavior**: Pins pods to specific nodes for address stability
* **Behavior**: You can configure whether Pods are pinned to specific nodes for address stability

[WARNING]
====
NodePort services may expose your applications to the internet if your Kubernetes nodes have public IP addresses.
Ensure you understand your cluster's network topology and have appropriate firewall rules in place.
====

===== Node stickiness

Setting `.spec.stickyNodePorts` to `true` (it defaults to `false`) causes Pods to be xref:volume.adoc#pinning[pinned] to a specific Kubernetes node.

[CAUTION]
====
When using NodePort with pinned pods, service addresses depend on specific nodes. If a pinned node becomes unavailable, the service may become unreachable until the pod can be rescheduled to a new node, potentially changing the service address.
When using NodePort with pinned pods, service addresses depend on specific nodes.
If a pinned node becomes unavailable, the service may become unreachable until the pod can be rescheduled to a new node, potentially changing the service address.
====

Pods bound to `NodePort` listeners will be xref:volume.adoc#pinning[pinned] to a specific Node for address stability.
If this behavior is undesirable, consider using xref:#servicetype-loadbalancer[] instead.


[#servicetype-loadbalancer]
==== LoadBalancer
* **Use for**: External access in environments without stable nodes or other reasons for a LoadBalancer
35 changes: 21 additions & 14 deletions rust/operator-binary/src/csi_server/controller.rs
@@ -127,26 +127,33 @@ impl csi::v1::controller_server::Controller for ListenerOperatorController {
.within(&ns)
.erase(),
})?;

// We only configure a node stickiness in case it is enabled and the Service is of type
// NodePort. Load balancers and services of type ClusterIP have no relationship to any
// particular node, so don't try to be sticky.
let accessible_topology = if listener_class.spec.sticky_node_ports
&& listener_class.spec.service_type == listener::v1alpha1::ServiceType::NodePort
{
// Pick the top node (as selected by the CSI client) and "stick" to that
// Since we want clients to have a stable address to connect to
request
.accessibility_requirements
.unwrap_or_default()
.preferred
.into_iter()
.take(1)
.collect()
} else {
Vec::new()
};

Ok(Response::new(csi::v1::CreateVolumeResponse {
volume: Some(csi::v1::Volume {
capacity_bytes: 0,
volume_id: request.name,
volume_context: raw_volume_context.into_iter().collect(),
content_source: None,
accessible_topology: match listener_class.spec.service_type {
// Pick the top node (as selected by the CSI client) and "stick" to that
// Since we want clients to have a stable address to connect to
listener::v1alpha1::ServiceType::NodePort => request
.accessibility_requirements
.unwrap_or_default()
.preferred
.into_iter()
.take(1)
.collect(),
// Load balancers and services of type ClusterIP have no relationship to any particular node, so don't try to be sticky
listener::v1alpha1::ServiceType::LoadBalancer
| listener::v1alpha1::ServiceType::ClusterIP => Vec::new(),
},
accessible_topology,
}),
}))
}
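
The new stickiness decision in this hunk can be read in isolation as a small pure function. The following is a minimal sketch, not the operator's actual code: `ServiceType`, `ListenerClassSpec`, and `accessible_topology` are simplified, hypothetical stand-ins, and plain node-name strings replace the CSI topology type. It only illustrates that a volume is pinned when both `sticky_node_ports` is set and the service type is `NodePort`, and that in every other case no topology is returned.

```rust
// Hypothetical, simplified stand-ins for the operator's types. The real
// definitions live in stackable-operator and carry more fields.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ServiceType {
    NodePort,
    LoadBalancer,
    ClusterIp,
}

struct ListenerClassSpec {
    service_type: ServiceType,
    sticky_node_ports: bool,
}

/// Returns the topology the volume should be pinned to.
///
/// `preferred` stands in for the CSI request's preferred topologies, in the
/// order chosen by the CSI client (an assumption made for this sketch).
fn accessible_topology(spec: &ListenerClassSpec, preferred: Vec<String>) -> Vec<String> {
    if spec.sticky_node_ports && spec.service_type == ServiceType::NodePort {
        // Stick to the top-ranked node so that clients get a stable address.
        preferred.into_iter().take(1).collect()
    } else {
        // LoadBalancer and ClusterIP services are not tied to a node, and
        // non-sticky NodePorts must stay free to move between nodes.
        Vec::new()
    }
}

fn main() {
    let preferred = vec!["node-a".to_string(), "node-b".to_string()];
    let cases = [
        (ServiceType::NodePort, true),
        (ServiceType::NodePort, false),
        (ServiceType::LoadBalancer, true),
        (ServiceType::ClusterIp, true),
    ];
    for (service_type, sticky_node_ports) in cases {
        let spec = ListenerClassSpec { service_type, sticky_node_ports };
        let topology = accessible_topology(&spec, preferred.clone());
        println!("{service_type:?} (sticky = {sticky_node_ports}): {topology:?}");
    }
}
```

Only the first case yields a pinned node. Setting `sticky_node_ports` on a `LoadBalancer` or `ClusterIp` class has no effect in this sketch, which mirrors the combined condition in the hunk above.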