| 1 | +# DRA Extended Resources Scale Test |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This test measures the performance and scalability of the Kubernetes DRA Extended Resources feature (KEP-5004). It exercises the scheduler with extended resource requests that are backed by Dynamic Resource Allocation (DRA), so applications keep the familiar extended resource syntax while their devices are allocated through DRA. |
| 6 | + |
| 7 | +## What This Test Does |
| 8 | + |
| 9 | +This test scenario mirrors the structure of the regular DRA test (`testing/dra/config.yaml`) but uses **extended resources syntax** instead of explicit ResourceClaims: |
| 10 | + |
| 11 | +1. **Setup Phase**: Creates a DeviceClass with an `extendedResourceName` field that maps DRA devices to traditional extended resources |
| 12 | +2. **Fill Phase**: Fills the cluster to 90% utilization with long-running pods that request extended resources (e.g., `example.com/gpu: 1`) |
| 13 | +3. **Measurement Phase**: Measures performance while continuously scheduling short-lived pods at a steady rate using the same extended resource requests |
| 14 | +4. **Metrics Collection**: Collects detailed metrics on: |
| 15 | + - Pod startup latency |
| 16 | + - Job lifecycle latency |
| 17 | + - Scheduler performance metrics |
| 18 | + - DRA-specific metrics (PrepareResources/UnprepareResources latencies) |
| 19 | + - Extended resource allocation metrics |
| 20 | + |
| 21 | +## Key Differences from Regular DRA Test |
| 22 | + |
| 23 | +- **No ResourceClaimTemplates**: Uses DeviceClass with `extendedResourceName` instead |
| 24 | +- **Extended Resource Syntax**: Pods request `example.com/gpu: 1` in `resources.limits` instead of using `resourceClaims` |
| 25 | +- **Transparent DRA**: The scheduler automatically creates ResourceClaims behind the scenes |
| 26 | +- **Backward Compatibility**: Tests that existing extended resource workloads work with DRA |
| 27 | + |
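To make the syntax difference concrete, here is a minimal sketch of a pod that requests the device through `resources.limits` only, with no `resourceClaims` section. The function name `render_pod`, the pod name, and the pause image are illustrative assumptions, not taken from the test's templates.

```bash
# Hypothetical helper: print a pod manifest that requests one unit of an
# extended resource via resources.limits (no resourceClaims section).
render_pod() {
  local resource_name=${1:-example.com/gpu}
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: extended-resource-demo
spec:
  restartPolicy: Never
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    resources:
      limits:
        ${resource_name}: "1"
EOF
}
```

The output can be piped into `kubectl apply -f -`; with the feature enabled and a matching DeviceClass present, the scheduler is expected to create the backing ResourceClaim.
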
| 28 | +## Prerequisites |
| 29 | + |
| 30 | +1. **Feature Gate**: Ensure `DRAExtendedResource=true` is enabled on: |
| 31 | + - kube-apiserver |
| 32 | + - kube-scheduler |
| 33 | + - kubelet |
| 34 | + |
| 35 | +2. **DRA Driver**: A DRA driver must be running (installed automatically by the test) |
| 36 | + |
| 37 | +3. **Prometheus**: Required for metric-based measurements |
| 38 | + |
| 39 | +## Usage |
| 40 | + |
| 41 | +### Environment Variables |
| 42 | + |
| 43 | +```bash |
| 44 | +export CL2_MODE=Indexed # Job completion mode |
| 45 | +export CL2_NODES_PER_NAMESPACE=1 # Nodes per namespace |
| 46 | +export CL2_LOAD_TEST_THROUGHPUT=20 # Fast initial fill rate |
| 47 | +export CL2_STEADY_STATE_QPS=5 # Controlled rate for measurement |
| 48 | +export CL2_JOB_RUNNING_TIME=30s # Short-lived pods runtime |
| 49 | +export CL2_LONG_JOB_RUNNING_TIME=1h # Long-running pods runtime |
| 50 | +export CL2_GPUS_PER_NODE=8 # Extended resources per node |
| 51 | +export CL2_FILL_PERCENTAGE=90 # Cluster fill percentage |
| 52 | +export CL2_EXTENDED_RESOURCE_NAME="example.com/gpu" # Extended resource name |
| 53 | +``` |
| 54 | + |
| 55 | +### Run the Test |
| 56 | + |
| 57 | +```bash |
| 58 | +# Make sure a Prometheus stack is deployed |
| 59 | +./run-e2e.sh cluster-loader2 \ |
| 60 | +--provider=kind \ |
| 61 | +--kubeconfig=/root/.kube/config \ |
| 62 | +--report-dir=/tmp/clusterloader2-results \ |
| 63 | +--testconfig=testing/dra-extended-resources/config.yaml \ |
| 64 | +--enable-prometheus-server=true \ |
| 65 | +--nodes=5 |
| 66 | +``` |
| 67 | + |
| 68 | +## Test Flow |
| 69 | + |
| 70 | +### 1. DeviceClass Creation |
| 71 | +Creates a DeviceClass that maps DRA devices to extended resources: |
| 72 | +```yaml |
| 73 | +apiVersion: resource.k8s.io/v1beta2 |
| 74 | +kind: DeviceClass |
| 75 | +metadata: |
| 76 | + name: gpu-extended-resource |
| 77 | +spec: |
| 78 | + selectors: |
| 79 | + - cel: |
| 80 | + expression: device.driver == 'gpu.example.com' && device.attributes['gpu.example.com'].type == 'gpu' |
| 81 | + extendedResourceName: example.com/gpu |
| 82 | +``` |
| 83 | + |
| 84 | +### 2. Cluster Fill (90% utilization) |
| 85 | +- Creates long-running Jobs with pods requesting `example.com/gpu: 1` |
| 86 | +- Each pod gets a single extended resource unit |
| 87 | +- Scheduler automatically creates ResourceClaims behind the scenes |
| 88 | +- Fills cluster to specified percentage (default 90%) |
| 89 | + |
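As a rough sizing aid, the number of fill pods can be estimated from the node count, devices per node, and fill percentage. This is a sketch that assumes one device per pod and rounds down; the test's own config may compute the count differently, and `fill_pods` is a hypothetical helper name.

```bash
# Hypothetical helper: estimate how many single-device pods the fill phase
# needs. Assumes total devices = nodes * devices per node, rounded down.
fill_pods() {
  local nodes=$1 gpus_per_node=$2 fill_pct=$3
  echo $(( nodes * gpus_per_node * fill_pct / 100 ))
}

fill_pods 5 8 90   # prints 36 (5 nodes * 8 devices * 90%)
```

With the example values from this README (5 nodes, 8 devices per node, 90% fill), that leaves 4 of the 40 devices free for the churn phase.
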
| 90 | +### 3. Steady State Churn |
| 91 | +- Creates short-lived Jobs at a controlled rate |
| 92 | +- Uses the capacity left after the fill phase (10% at the default fill percentage) |
| 93 | +- Measures scheduler performance under steady load |
| 94 | +- Tests both pod creation and cleanup performance |
| 95 | + |
| 96 | +### 4. Metrics Collection |
| 97 | +Collects comprehensive metrics including: |
| 98 | +- **Standard Metrics**: Pod startup latency, scheduling throughput |
| 99 | +- **DRA Metrics**: PrepareResources/UnprepareResources latencies |
| 100 | +- **Extended Resource Metrics**: Claim creation and allocation rates |
| 101 | +- **Comparison Data**: Allows comparison with regular DRA and baseline tests |
| 102 | + |
| 103 | +## Key Metrics |
| 104 | + |
| 105 | +### Pod Startup Latency |
| 106 | +- **FastFillPodStartupLatency**: Startup time for initial fill pods |
| 107 | +- **ChurnPodStartupLatency**: Startup time for steady-state pods |
| 108 | +- Thresholds: p50 < 40s, p90 < 60s, p99 < 80s |
| 109 | + |
| 110 | +### DRA Operation Latencies |
| 111 | +- **p99_dra_prepare_resources**: 99th percentile PrepareResources latency |
| 112 | +- **p99_dra_unprepare_operations**: 99th percentile UnprepareResources latency |
| 113 | +- **p99_dra_grpc_node_prepare_resources**: gRPC call latencies |
| 114 | +- **p99_dra_grpc_node_unprepare_resources**: gRPC cleanup latencies |
| 115 | + |
| 116 | +### Extended Resource Metrics |
| 117 | +- **extended_resource_claims_created**: Number of auto-created ResourceClaims |
| 118 | +- **extended_resource_allocation_attempts**: Allocation attempt rate |
| 119 | + |
| 120 | +## Comparison with Other Tests |
| 121 | + |
| 122 | +| Test | Resource Type | Syntax | Purpose | |
| 123 | +|------|---------------|--------|---------| |
| 124 | +| `dra/` | ResourceClaims | `resourceClaims` section | Test explicit DRA usage | |
| 125 | +| `dra-baseline/` | CPU/Memory | `resources.requests` | Baseline without DRA | |
| 126 | +| `dra-extended-resources/` | Extended Resources | `resources.limits` | Test DRA extended resources | |
| 127 | + |
| 128 | +## Expected Behavior |
| 129 | + |
| 130 | +1. **Transparent Operation**: Applications work without modification |
| 131 | +2. **Automatic Claim Creation**: Scheduler creates ResourceClaims automatically |
| 132 | +3. **DRA Driver Integration**: Same DRA driver calls as explicit ResourceClaims |
| 133 | +4. **Performance**: Similar performance to explicit DRA with additional scheduler overhead for claim creation |
| 134 | + |
| 135 | +## Troubleshooting |
| 136 | + |
| 137 | +### Common Issues |
| 138 | + |
| 139 | +1. **Feature Gate Not Enabled** |
| 140 | + - Error: Extended resource requests not creating ResourceClaims |
| 141 | + - Solution: Enable `DRAExtendedResource=true` on kube-apiserver, kube-scheduler, and kubelet |
| 142 | + |
| 143 | +2. **DeviceClass Missing** |
| 144 | + - Error: Pods stuck in Pending state |
| 145 | + - Solution: Verify DeviceClass exists with correct `extendedResourceName` |
| 146 | + |
| 147 | +3. **Resource Conflicts** |
| 148 | + - Error: Both device plugin and DRA providing same extended resource |
| 149 | + - Solution: Use different extended resource names or migrate fully |
| 150 | + |
| 151 | +4. **Driver Issues** |
| 152 | + - Error: PrepareResources failures |
| 153 | + - Solution: Check DRA driver logs and CDI device creation |
| 154 | + |
| 155 | +### Debug Information |
| 156 | + |
| 157 | +- **Pod Status**: Check for `ExtendedResourceClaimStatus` showing claim mappings |
| 158 | +- **ResourceClaim Status**: Verify allocation results in auto-created claims |
| 159 | +- **Scheduler Logs**: Enable verbosity level 5 for extended resource processing |
| 160 | +- **Kubelet Logs**: Enable verbosity level 3 for DRA manager operations |
| 161 | + |
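The checks above can be scripted with standard `kubectl` queries. The helper names below are hypothetical, the DeviceClass name is the one from the example manifest in this README, and the JSON path for the pod status field is an assumption (lower camel case of `ExtendedResourceClaimStatus`).

```bash
# Hypothetical helpers wrapping standard kubectl queries.

# Print the extended resource name a DeviceClass maps to.
show_extended_resource_name() {
  kubectl get deviceclass "${1:-gpu-extended-resource}" \
    -o jsonpath='{.spec.extendedResourceName}'
}

# List every ResourceClaim, including those created on behalf of pods.
show_claims() {
  kubectl get resourceclaims --all-namespaces
}

# Assumption: the pod status field is serialized as extendedResourceClaimStatus.
show_pod_claim_status() {
  local pod=$1 namespace=${2:-default}
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{.status.extendedResourceClaimStatus}'
}
```

For example, `show_pod_claim_status my-pod my-namespace` prints the claim mapping for one pod, and an empty result suggests the scheduler did not create a claim for it.
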
| 162 | +## Performance Expectations |
| 163 | + |
| 164 | +### Compared to Regular DRA |
| 165 | +- **Similar**: DRA driver operation latencies |
| 166 | +- **Additional**: Scheduler overhead for automatic claim creation |
| 167 | +- **Benefit**: Application compatibility without code changes |
| 168 | + |
| 169 | +### Compared to Baseline |
| 170 | +- **Additional**: DRA allocation and preparation overhead |
| 171 | +- **Additional**: ResourceClaim lifecycle management |
| 172 | +- **Benefit**: Dynamic device allocation and advanced scheduling |
| 173 | + |
| 174 | +## Use Cases |
| 175 | + |
| 176 | +1. **Migration Testing**: Validate migration from device plugins to DRA |
| 177 | +2. **Performance Validation**: Ensure extended resources don't add excessive overhead |
| 178 | +3. **Scale Testing**: Test scheduler performance with mixed resource types |
| 179 | +4. **Compatibility Testing**: Verify existing applications work with DRA backend |
| 180 | + |
| 181 | +--- |
| 182 | + |
| 183 | +*This test validates the DRA Extended Resources feature introduced in Kubernetes 1.34 (KEP-5004) and measures its performance characteristics at scale.* |