Commit f548fc2

DRA API: implement CEL cost limit
The main purpose is to protect against denial-of-service attacks. Scheduling time depends a lot on unpredictable factors and expected scheduling time also varies, so no attempt is made to limit the overall time spent on evaluating CEL expressions per claim.
1 parent ff9ef07 commit f548fc2

11 files changed: 178 additions, 5 deletions

api/openapi-spec/swagger.json

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

api/openapi-spec/v3/apis__resource.k8s.io__v1alpha3_openapi.json

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@
       "properties": {
         "expression": {
           "default": "",
-          "description": "Expression is a CEL expression which evaluates a single device. It must evaluate to true when the device under consideration satisfies the desired criteria, and false when it does not. Any other result is an error and causes allocation of devices to abort.\n\nThe expression's input is an object named \"device\", which carries the following properties:\n - driver (string): the name of the driver which defines this device.\n - attributes (map[string]object): the device's attributes, grouped by prefix\n (e.g. device.attributes[\"dra.example.com\"] evaluates to an object with all\n of the attributes which were prefixed by \"dra.example.com\".\n - capacity (map[string]object): the device's capacities, grouped by prefix.\n\nExample: Consider a device with driver=\"dra.example.com\", which exposes two attributes named \"model\" and \"ext.example.com/family\" and which exposes one capacity named \"modules\". This input to this expression would have the following fields:\n\n device.driver\n device.attributes[\"dra.example.com\"].model\n device.attributes[\"ext.example.com\"].family\n device.capacity[\"dra.example.com\"].modules\n\nThe device.driver field can be used to check for a specific driver, either as a high-level precondition (i.e. you only want to consider devices from this driver) or as part of a multi-clause expression that is meant to consider devices from different drivers.\n\nThe value type of each attribute is defined by the device definition, and users who write these expressions must consult the documentation for their specific drivers. The value type of each capacity is Quantity.\n\nIf an unknown prefix is used as a lookup in either device.attributes or device.capacity, an empty map will be returned. Any reference to an unknown field will cause an evaluation error and allocation to abort.\n\nA robust expression should check for the existence of attributes before referencing them.\n\nFor ease of use, the cel.bind() function is enabled, and can be used to simplify expressions that access multiple attributes with the same domain. For example:\n\n cel.bind(dra, device.attributes[\"dra.example.com\"], dra.someBool && dra.anotherBool)",
+          "description": "Expression is a CEL expression which evaluates a single device. It must evaluate to true when the device under consideration satisfies the desired criteria, and false when it does not. Any other result is an error and causes allocation of devices to abort.\n\nThe expression's input is an object named \"device\", which carries the following properties:\n - driver (string): the name of the driver which defines this device.\n - attributes (map[string]object): the device's attributes, grouped by prefix\n (e.g. device.attributes[\"dra.example.com\"] evaluates to an object with all\n of the attributes which were prefixed by \"dra.example.com\".\n - capacity (map[string]object): the device's capacities, grouped by prefix.\n\nExample: Consider a device with driver=\"dra.example.com\", which exposes two attributes named \"model\" and \"ext.example.com/family\" and which exposes one capacity named \"modules\". This input to this expression would have the following fields:\n\n device.driver\n device.attributes[\"dra.example.com\"].model\n device.attributes[\"ext.example.com\"].family\n device.capacity[\"dra.example.com\"].modules\n\nThe device.driver field can be used to check for a specific driver, either as a high-level precondition (i.e. you only want to consider devices from this driver) or as part of a multi-clause expression that is meant to consider devices from different drivers.\n\nThe value type of each attribute is defined by the device definition, and users who write these expressions must consult the documentation for their specific drivers. The value type of each capacity is Quantity.\n\nIf an unknown prefix is used as a lookup in either device.attributes or device.capacity, an empty map will be returned. Any reference to an unknown field will cause an evaluation error and allocation to abort.\n\nA robust expression should check for the existence of attributes before referencing them.\n\nFor ease of use, the cel.bind() function is enabled, and can be used to simplify expressions that access multiple attributes with the same domain. For example:\n\n cel.bind(dra, device.attributes[\"dra.example.com\"], dra.someBool && dra.anotherBool)\n\nThe length of the expression must be smaller or equal to 10 Ki. The cost of evaluating it is also limited based on the estimated number of logical steps.",
           "type": "string"
         }
       },

pkg/apis/resource/types.go

Lines changed: 27 additions & 0 deletions
@@ -526,10 +526,37 @@ type CELDeviceSelector struct {
 	//
 	// cel.bind(dra, device.attributes["dra.example.com"], dra.someBool && dra.anotherBool)
 	//
+	// The length of the expression must be smaller or equal to 10 Ki. The
+	// cost of evaluating it is also limited based on the estimated number
+	// of logical steps.
+	//
 	// +required
 	Expression string
 }
 
+// CELSelectorExpressionMaxCost specifies the cost limit for a single CEL selector
+// evaluation.
+//
+// There is no overall budget for selecting a device, so the actual time
+// required for that is proportional to the number of CEL selectors and how
+// often they need to be evaluated, which can vary depending on several factors
+// (number of devices, cluster utilization, additional constraints).
+//
+// Validation against this limit and [CELSelectorExpressionMaxLength] happens
+// only when setting an expression for the first time or when changing it. If
+// the limits are changed in a future Kubernetes release, existing users are
+// guaranteed that existing expressions will continue to be valid and won't be
+// interrupted at runtime after an up- or downgrade.
+//
+// According to
+// https://github.com/kubernetes/kubernetes/blob/4aeaf1e99e82da8334c0d6dddd848a194cd44b4f/staging/src/k8s.io/apiserver/pkg/apis/cel/config.go#L20-L22,
+// this gives roughly 0.1 second for each expression evaluation.
+// However, this depends on how fast the machine is.
+const CELSelectorExpressionMaxCost = 1000000
+
+// CELSelectorExpressionMaxLength is the maximum length of a CEL selector expression string.
+const CELSelectorExpressionMaxLength = 10 * 1024
+
 // DeviceConstraint must have exactly one field set besides Requests.
 type DeviceConstraint struct {
 	// Requests is a list of the one or more requests in this claim which

pkg/apis/resource/validation/validation.go

Lines changed: 9 additions & 0 deletions
@@ -170,10 +170,19 @@ func validateCELSelector(celSelector resource.CELDeviceSelector, fldPath *field.
 	if stored {
 		envType = environment.StoredExpressions
 	}
+	if len(celSelector.Expression) > resource.CELSelectorExpressionMaxLength {
+		allErrs = append(allErrs, field.TooLongMaxLength(fldPath.Child("expression"), "<value omitted>", resource.CELSelectorExpressionMaxLength))
+		// Don't bother compiling too long expressions.
+		return allErrs
+	}
+
 	result := dracel.GetCompiler().CompileCELExpression(celSelector.Expression, envType)
 	if result.Error != nil {
 		allErrs = append(allErrs, convertCELErrorToValidationError(fldPath.Child("expression"), celSelector.Expression, result.Error))
+	} else if result.MaxCost > resource.CELSelectorExpressionMaxCost {
+		allErrs = append(allErrs, field.Forbidden(fldPath.Child("expression"), "too complex, exceeds cost limit"))
 	}
+
 	return allErrs
 }

pkg/apis/resource/validation/validation_resourceclaim_test.go

Lines changed: 43 additions & 0 deletions
@@ -18,6 +18,7 @@ package validation
 
 import (
 	"fmt"
+	"strings"
 	"testing"
 
 	"github.com/stretchr/testify/assert"
@@ -316,6 +317,48 @@ func TestValidateClaim(t *testing.T) {
 				return claim
 			}(),
 		},
+		"CEL-length": {
+			wantFailures: field.ErrorList{
+				field.TooLongMaxLength(field.NewPath("spec", "devices", "requests").Index(1).Child("selectors").Index(1).Child("cel", "expression"), "<value omitted>", resource.CELSelectorExpressionMaxLength),
+			},
+			claim: func() *resource.ResourceClaim {
+				claim := testClaim(goodName, goodNS, validClaimSpec)
+				claim.Spec.Devices.Requests = append(claim.Spec.Devices.Requests, claim.Spec.Devices.Requests[0])
+				claim.Spec.Devices.Requests[1].Name += "-2"
+				expression := `device.driver == ""`
+				claim.Spec.Devices.Requests[1].Selectors = []resource.DeviceSelector{
+					{
+						// Good selector.
+						CEL: &resource.CELDeviceSelector{
+							Expression: strings.ReplaceAll(expression, `""`, `"`+strings.Repeat("x", resource.CELSelectorExpressionMaxLength-len(expression))+`"`),
+						},
+					},
+					{
+						// Too long by one selector.
+						CEL: &resource.CELDeviceSelector{
+							Expression: strings.ReplaceAll(expression, `""`, `"`+strings.Repeat("x", resource.CELSelectorExpressionMaxLength-len(expression)+1)+`"`),
+						},
+					},
+				}
+				return claim
+			}(),
+		},
+		"CEL-cost": {
+			wantFailures: field.ErrorList{
+				field.Forbidden(field.NewPath("spec", "devices", "requests").Index(0).Child("selectors").Index(0).Child("cel", "expression"), "too complex, exceeds cost limit"),
+			},
+			claim: func() *resource.ResourceClaim {
+				claim := testClaim(goodName, goodNS, validClaimSpec)
+				claim.Spec.Devices.Requests[0].Selectors = []resource.DeviceSelector{
+					{
+						CEL: &resource.CELDeviceSelector{
+							Expression: `device.attributes["dra.example.com"].map(s, s.lowerAscii()).map(s, s.size()).sum() == 0`,
+						},
+					},
+				}
+				return claim
+			}(),
+		},
 	}
 
 	for name, scenario := range scenarios {

pkg/generated/openapi/zz_generated.openapi.go

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

staging/src/k8s.io/api/resource/v1alpha3/generated.proto

Lines changed: 4 additions & 0 deletions
Some generated files are not rendered by default.
