201 changes: 201 additions & 0 deletions docs/content/customization/aws/placement-group-nfd.md
+++
title = "AWS Placement Group Node Feature Discovery"
+++

The AWS placement group NFD (Node Feature Discovery) customization automatically discovers and labels nodes with their placement group information, enabling workload scheduling based on placement group characteristics.

This customization will be available when the
[provider-specific cluster configuration patch]({{< ref "..">}}) is included in the `ClusterClass`.

## What is Placement Group NFD?

Placement Group NFD automatically discovers the placement group information for each node and creates node labels that can be used for workload scheduling. This enables:

- **Workload Affinity**: Schedule pods on nodes within the same placement group for low latency
- **Fault Isolation**: Schedule critical workloads on nodes in different placement groups
- **Resource Optimization**: Use placement group labels for advanced scheduling strategies

## How it Works

The NFD customization:

1. **Deploys a Discovery Script**: Automatically installs a script on each node that queries AWS metadata (a sketch of such a script follows this list)
2. **Queries AWS Metadata**: Uses EC2 instance metadata to discover placement group information
3. **Creates Node Labels**: Generates Kubernetes node labels with placement group details
4. **Updates Continuously**: Refreshes labels as nodes are added or moved
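
As an illustration of steps 1 and 2, the following is a minimal sketch of what such a discovery script could look like, assuming a bash hook that queries the standard EC2 IMDSv2 placement endpoints and writes an NFD feature file; the script actually installed by the customization may differ.

```bash
#!/usr/bin/env bash
# Illustrative sketch only -- the script installed by the customization may differ.
set -euo pipefail

IMDS="http://169.254.169.254/latest"

# Request an IMDSv2 session token.
TOKEN=$(curl -sf -X PUT "${IMDS}/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")

# Query placement metadata; partition-number is only present for partition placement groups.
GROUP=$(curl -sf -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  "${IMDS}/meta-data/placement/group-name" || true)
PARTITION=$(curl -sf -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  "${IMDS}/meta-data/placement/partition-number" || true)

# Write NFD feature-file entries; NFD turns each "name=value" line into a
# feature.node.kubernetes.io/<name>=<value> node label.
FEATURE_FILE=/etc/kubernetes/node-feature-discovery/features.d/placementgroup
: > "${FEATURE_FILE}"
if [ -n "${GROUP}" ]; then
  echo "aws-placement-group=${GROUP}" >> "${FEATURE_FILE}"
fi
if [ -n "${PARTITION}" ]; then
  echo "partition=${PARTITION}" >> "${FEATURE_FILE}"
fi
```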

## Generated Node Labels

The NFD customization creates the following node labels:

| Label | Description | Example |
|-------|-------------|---------|
| `feature.node.kubernetes.io/aws-placement-group` | The name of the placement group | `my-cluster-pg` |
| `feature.node.kubernetes.io/partition` | The partition number (for partition placement groups) | `0`, `1`, `2` |
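
For instance, on a node that belongs to a partition placement group, the labels might look like the following (the node name and values are purely illustrative):

```yaml
apiVersion: v1
kind: Node
metadata:
  name: ip-10-0-1-23.us-west-2.compute.internal  # hypothetical node name
  labels:
    feature.node.kubernetes.io/aws-placement-group: my-cluster-pg
    feature.node.kubernetes.io/partition: "0"
```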

## Configuration

The placement group NFD customization is automatically enabled when a placement group is configured. No additional configuration is required.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: <NAME>
spec:
  topology:
    variables:
      - name: clusterConfig
        value:
          controlPlane:
            aws:
              placementGroup:
                name: "control-plane-pg"
      - name: workerConfig
        value:
          aws:
            placementGroup:
              name: "worker-pg"
```

## Usage Examples

### Workload Affinity

Schedule pods on nodes within the same placement group for low latency:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: high-performance-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: high-performance-app
  template:
    metadata:
      labels:
        app: high-performance-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: feature.node.kubernetes.io/aws-placement-group
                    operator: In
                    values: ["worker-pg"]
      containers:
        - name: app
          image: my-app:latest
```

### Fault Isolation

Distribute critical workloads across different placement groups:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: critical-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: critical-app
  template:
    metadata:
      labels:
        app: critical-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values: ["critical-app"]
              topologyKey: feature.node.kubernetes.io/aws-placement-group
      containers:
        - name: app
          image: critical-app:latest
```

### Partition-Aware Scheduling

For partition placement groups, schedule workloads on specific partitions:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: distributed-database
spec:
  serviceName: distributed-database  # assumes a matching headless Service exists
  replicas: 3
  selector:
    matchLabels:
      app: distributed-database
  template:
    metadata:
      labels:
        app: distributed-database
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: feature.node.kubernetes.io/partition
                    operator: In
                    values: ["0", "1", "2"]
      containers:
        - name: database
          image: my-database:latest
```

## Verification

You can verify that the NFD labels are working by checking the node labels:

```bash
# Check all nodes and their placement group labels
kubectl get nodes --show-labels | grep placement-group

# Check specific node labels
kubectl describe node <node-name> | grep placement-group

# Check partition labels
kubectl get nodes --show-labels | grep partition
```
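
The same labels can also be displayed as columns, which makes it easy to compare nodes at a glance:

```bash
kubectl get nodes \
  -L feature.node.kubernetes.io/aws-placement-group \
  -L feature.node.kubernetes.io/partition
```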

## Troubleshooting

### Check NFD Script Status

Verify that the discovery script is installed and producing output:

```bash
# Check if the script exists on nodes
kubectl debug node/<node-name> -it --image=busybox -- chroot /host ls -la /etc/kubernetes/node-feature-discovery/source.d/

# Check script execution
kubectl debug node/<node-name> -it --image=busybox -- chroot /host cat /etc/kubernetes/node-feature-discovery/features.d/placementgroup
```

## Integration with Other Features

Placement Group NFD works seamlessly with:

- **Pod Affinity/Anti-Affinity**: Use placement group labels for advanced scheduling
- **Topology Spread Constraints**: Distribute workloads across placement groups (see the example below)
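
As a sketch of the second point, a topology spread constraint can use the placement group label as its topology key to spread replicas across placement groups (the Deployment below is illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spread-app
spec:
  replicas: 4
  selector:
    matchLabels:
      app: spread-app
  template:
    metadata:
      labels:
        app: spread-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: feature.node.kubernetes.io/aws-placement-group
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: spread-app
      containers:
        - name: app
          image: spread-app:latest
```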

## Security Considerations

- The discovery script queries AWS instance metadata (IMDSv2)
- No additional IAM permissions are required beyond standard node permissions
- Labels are automatically managed and do not require manual intervention
- The script runs with appropriate permissions and security context
138 changes: 138 additions & 0 deletions docs/content/customization/aws/placement-group.md
+++
title = "AWS Placement Group"
+++

The AWS placement group customization allows the user to specify placement groups for control-plane
and worker machines to control their placement strategy within AWS.

This customization will be available when the
[provider-specific cluster configuration patch]({{< ref "..">}}) is included in the `ClusterClass`.

## What are Placement Groups?

AWS placement groups are logical groupings of instances within a single Availability Zone that influence how instances are placed on underlying hardware. They are useful for:

- **Cluster Placement Groups**: For applications that benefit from low network latency, high network throughput, or both
- **Partition Placement Groups**: For large distributed and replicated workloads, such as HDFS, HBase, and Cassandra
- **Spread Placement Groups**: For applications that have a small number of critical instances that should be kept separate
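
Placement groups themselves must already exist in AWS before the cluster references them (see the notes at the end of this page). As an illustration, they could be created with the AWS CLI as follows (group names and partition count are examples only):

```bash
# Cluster placement group for low-latency, high-throughput workloads
aws ec2 create-placement-group --group-name control-plane-pg --strategy cluster

# Partition placement group for large distributed workloads
aws ec2 create-placement-group --group-name worker-pg --strategy partition --partition-count 3

# Spread placement group for a small number of critical instances
aws ec2 create-placement-group --group-name critical-pg --strategy spread
```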

## Configuration

The placement group configuration supports the following field:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `name` | string | Yes | The name of the placement group (1-255 characters) |

## Examples

### Control Plane and Worker Placement Groups

To specify placement groups for both control plane and worker machines:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: <NAME>
spec:
  topology:
    variables:
      - name: clusterConfig
        value:
          controlPlane:
            aws:
              placementGroup:
                name: "control-plane-pg"
      - name: workerConfig
        value:
          aws:
            placementGroup:
              name: "worker-pg"
```
### Control Plane Only

To specify a placement group for control-plane machines only:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: <NAME>
spec:
  topology:
    variables:
      - name: clusterConfig
        value:
          controlPlane:
            aws:
              placementGroup:
                name: "control-plane-pg"
```
### MachineDeployment Overrides

You can customize individual MachineDeployments by using the `overrides` field:

```yaml
spec:
  topology:
    # ...
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          variables:
            overrides:
              - name: workerConfig
                value:
                  aws:
                    placementGroup:
                      name: "special-worker-pg"
```
## Resulting CAPA Configuration

Applying the placement group configuration will result in the following values being set:

- control-plane `AWSMachineTemplate`:

  ```yaml
  spec:
    template:
      spec:
        placementGroupName: control-plane-pg
  ```

- worker `AWSMachineTemplate`:

  ```yaml
  spec:
    template:
      spec:
        placementGroupName: worker-pg
  ```

## Best Practices

1. **Placement Group Types**: Choose the appropriate placement group type based on your workload:
- **Cluster**: For applications requiring low latency and high throughput
- **Partition**: For large distributed workloads that need fault isolation
- **Spread**: For critical instances that need maximum availability

2. **Naming Convention**: Use descriptive names that indicate the purpose and type of the placement group

3. **Availability Zone**: Placement groups are constrained to a single Availability Zone, so plan your cluster topology accordingly

4. **Instance Types**: Some instance types have restrictions on placement groups (e.g., some bare metal instances)

5. **Capacity Planning**: Consider the placement group capacity limits when designing your cluster

## Important Notes

- Placement groups must be created in AWS before they can be referenced
- Placement groups are constrained to a single Availability Zone
- You cannot move a running instance into a placement group; the instance must be stopped before its placement group can be changed
- Some instance types cannot be launched in placement groups
- Placement groups have capacity limits that vary by type and instance family