Skip to content

Commit 8b30549

Browse files
authored
Merge pull request #4862 from SergeyKanzhelev/device-failures
updates to device failures for alpha2
2 parents eed7d65 + 629af2a commit 8b30549

File tree

2 files changed

+140
-50
lines changed

2 files changed

+140
-50
lines changed

keps/sig-node/4680-add-resource-health-to-pod-status/README.md

Lines changed: 136 additions & 46 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@
2222
- [e2e tests](#e2e-tests)
2323
- [Graduation Criteria](#graduation-criteria)
2424
- [Alpha](#alpha)
25+
- [Alpha2](#alpha2)
2526
- [Beta](#beta)
2627
- [GA](#ga)
2728
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
@@ -44,9 +45,9 @@
4445
Items marked with (R) are required *prior to targeting to a milestone / release*.
4546

4647
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
47-
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
48-
- [ ] (R) Design details are appropriately documented
49-
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
48+
- [X] (R) KEP approvers have approved the KEP status as `implementable`
49+
- [X] (R) Design details are appropriately documented
50+
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
5051
- [ ] e2e Tests for all Beta API Operations (endpoints)
5152
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
5253
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
@@ -94,7 +95,10 @@ Exposing unhealthy devices in Pod Status will provide a generic way to understan
9495

9596
As part of the InPlacePodVerticalScaling KEP, the two new fields were introduced in Pod Status to reflect the currently allocated resources for the Pod:
9697

97-
```
98+
```go
99+
type ContainerStatus struct {
100+
...
101+
98102
// AllocatedResources represents the compute resources allocated for this container by the
99103
// node. Kubelet sets this value to Container.Resources.Requests upon successful pod admission
100104
// and after successfully admitting desired pod resize.
@@ -107,6 +111,9 @@ As part of the InPlacePodVerticalScaling KEP, the two new fields were introduced
107111
// +featureGate=InPlacePodVerticalScaling
108112
// +optional
109113
Resources *ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,11,opt,name=resources"`
114+
115+
...
116+
}
110117
```
111118

112119
One field reflects the resource requests and limits and the other actual allocated resources.
@@ -117,60 +124,106 @@ The proposal is to keep this structure as-is to simplify parsing of well-known R
117124

118125
The proposal is to introduce an additional field:
119126

127+
```go
128+
type ContainerStatus struct {
129+
...
130+
131+
// AllocatedResourcesStatus represents the status of various resources
132+
// allocated for this Container. In case of DRA, the same resource health
133+
// can be reported multiple times if it is associated with the multiple containers.
134+
// +featureGate=ResourceHealthStatus
135+
// +optional
136+
// +patchMergeKey=name
137+
// +patchStrategy=merge
138+
// +listType=map
139+
// +listMapKey=name
140+
AllocatedResourcesStatus []ResourceStatus `json:"allocatedResourcesStatus,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,14,rep,name=allocatedResourcesStatus"`
141+
142+
...
143+
}
120144
```
121-
// AllocatedResourcesStatus represents the status of various resources
122-
// allocated for this Pod.
123-
AllocatedResourcesStatus ResourcesStatus
124145

125-
type ResourcesStatus map[ResourceName]ResourceStatus
146+
The `ResourceStatus` is defined as:
126147

148+
```go
127149
type ResourceStatus struct {
128-
// map of unique Resource ID to its health.
129-
// At a minimum, ResourceID must uniquely identify the Resource
130-
// allocated to the Pod on the Node for the lifetime of a Pod.
131-
// See ResourceID type for it's definition.
132-
Resources map[ResourceID] ResourceHealth
133-
134-
// allow to extend this struct in future with the overall health fields or things like Device Plugin version
150+
// Name of the resource. Must be unique within the pod and in case of non-DRA resource, match one of the resources from the pod spec.
151+
// For DRA resources, the value must be claim:claim_name/request.
152+
// The claim_name must match one of the claims from resourceClaims field in the podSpec.
153+
// +required
154+
Name ResourceName `json:"name" protobuf:"bytes,1,opt,name=name"`
155+
// List of unique Resources health. Each element in the list contains an unique resource ID and resource health.
156+
// At a minimum, ResourceID must uniquely identify the Resource
157+
// allocated to the Pod on the Node for the lifetime of a Pod.
158+
// See ResourceID type for it's definition.
159+
// +listType=map
160+
// +listMapKey=resourceID
161+
Resources []ResourceHealth `json:"resources,omitempty" protobuf:"bytes,2,rep,name=resources"`
135162
}
136163

164+
type ResourceHealthStatus string
165+
166+
const (
167+
ResourceHealthStatusHealthy ResourceHealthStatus = "Healthy"
168+
ResourceHealthStatusUnhealthy ResourceHealthStatus = "Unhealthy"
169+
ResourceHealthStatusUnknown ResourceHealthStatus = "Unknown"
170+
)
171+
137172
// ResourceID is calculated based on the source of this resource health information.
138173
// For DevicePlugin:
139-
// deviceplugin:DeviceID, where DeviceID is from the Device structure of DevicePlugin's ListAndWatchResponse type: https://github.com/kubernetes/kubernetes/blob/eda1c780543a27c078450e2f17d674471e00f494/staging/src/k8s.io/kubelet/pkg/apis/deviceplugin/v1alpha/api.proto#L61-L73
174+
//
175+
// DeviceID, where DeviceID is from the Device structure of DevicePlugin's ListAndWatchResponse type: https://github.com/kubernetes/kubernetes/blob/eda1c780543a27c078450e2f17d674471e00f494/staging/src/k8s.io/kubelet/pkg/apis/deviceplugin/v1alpha/api.proto#L61-L73
176+
//
140177
// DevicePlugin ID is usually a constant for the lifetime of a Node and typically can be used to uniquely identify the device on the node.
141178
// For DRA:
142-
// dra:<driver name>/<pool name>/<device name>: such a device can be looked up in the information published by that DRA driver to learn more about it. It is designed to be globally unique in a cluster.
179+
//
180+
// <driver name>/<pool name>/<device name>: such a device can be looked up in the information published by that DRA driver to learn more about it. It is designed to be globally unique in a cluster.
143181
type ResourceID string
144182

183+
// ResourceHealth represents the health of a resource. It has the latest device health information.
184+
// This is a part of KEP https://kep.k8s.io/4680 and historical health changes are planned to be added in future iterations of a KEP.
145185
type ResourceHealth struct {
146-
// List of conditions with the transition times
147-
Conditions []ResourceHealthCondition
186+
// ResourceID is the unique identifier of the resource. See the ResourceID type for more information.
187+
ResourceID ResourceID `json:"resourceID" protobuf:"bytes,1,opt,name=resourceID"`
188+
// Health of the resource.
189+
// can be one of:
190+
// - Healthy: operates as normal
191+
// - Unhealthy: reported unhealthy. We consider this a temporary health issue
192+
// since we do not have a mechanism today to distinguish
193+
// temporary and permanent issues.
194+
// - Unknown: The status cannot be determined.
195+
// For example, Device Plugin got unregistered and hasn't been re-registered since.
196+
//
197+
// In future we may want to introduce the PermanentlyUnhealthy Status.
198+
Health ResourceHealthStatus `json:"health,omitempty" protobuf:"bytes,2,name=health"`
148199
}
200+
```
201+
202+
In alpha2 in order to support pod level DRA resources, the following field will be added to the PodStatus:
203+
204+
```go
205+
// PodStatus represents information about the status of a pod. Status may trail the actual
206+
// state of a system.
207+
type PodStatus struct {
208+
209+
...
210+
211+
// Status of resource claims.
212+
// +featureGate=DynamicResourceAllocation
213+
// +optional
214+
ResourceClaimStatuses []PodResourceClaimStatus
149215

150-
// This condition type is replicating other condition types exposed by various status APIs
151-
type ResourceHealthCondition struct {
152-
// can be one of:
153-
// - Healthy: operates as normal
154-
// - Unhealthy: reported unhealthy. We consider this a temporary health issue
155-
// since we do not have a mechanism today to distinguish
156-
// temporary and permanent issues.
157-
// - Unknown: The status cannot be determined.
158-
// For example, Device Plugin got unregistered and hasn't been re-registered since.
159-
//
160-
// In future we may want to introduce the PermanentlyUnhealthy Status.
161-
Type string
162-
163-
// Status of the condition, one of True, False, Unknown.
164-
Status ConditionStatus
165-
// The last time the condition transitioned from one status to another.
166-
// +optional
167-
LastTransitionTime metav1.Time
168-
// The reason for the condition's last transition.
169-
// +optional
170-
Reason string
171-
// A human readable message indicating details about the transition.
172-
// +optional
173-
Message string
216+
...
217+
218+
// AllocatedResourcesStatus represents the status of various resources
219+
// allocated for this Pod, but not associated with any of containers.
220+
// +featureGate=ResourceHealthStatus
221+
// +optional
222+
// +patchMergeKey=name
223+
// +patchStrategy=merge
224+
// +listType=map
225+
// +listMapKey=name
226+
AllocatedResourcesStatus []ResourceStatus `json:"allocatedResourcesStatus,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,14,rep,name=allocatedResourcesStatus"`
174227
}
175228
```
176229

@@ -233,7 +286,6 @@ We should consider introducing another field to the Status that will be a free f
233286
### DRA implementation details
234287

235288
Today DRA does not return the health of the device back to kubelet. The proposal is to extend the
236-
237289
type `NamedResourcesInstance` (from [pkg/apis/resource/namedresources.go](https://github.com/kubernetes/kubernetes/blob/790dfdbe386e4a115f41d38058c127d2dd0e6f44/pkg/apis/resource/namedresources.go#L29-L37)) to include the Health field the same way it is done in
238290
the Device Plugin as well as a device ID.
239291

@@ -245,7 +297,41 @@ The API will be limited to "prepared" devices and include the claim `name/namesp
245297

246298
Kubelet will react on this field the same way as we propose to do it for the Device Plugin.
247299

248-
Specific implementation details will be added for the beta.
300+
The new method will be added to the same gRPC server that serves the [Node service](https://github.com/kubernetes/kubernetes/blob/04bba3c222bb2c5b1b1565713de4bf334ee7fbe4/staging/src/k8s.io/kubelet/pkg/apis/dra/v1alpha4/api.proto#L34) interface (Node service exposes
301+
`NodePrepareResources` and `NodeUnprepareResources`). The new interface will have a `Device` structure similar to Node Service's device, with the added `health` field:
302+
303+
``` proto
304+
service NodeHealth {
305+
...
306+
307+
// WatchDevicesStatus returns a stream of List of Devices
308+
// Whenever a Device state change or a Device disappears, WatchDevicesStatus
309+
// returns the new list.
310+
// This method is optional and may not be implemented.
311+
rpc WatchDevicesStatus(Empty) returns (stream DevicesStatusResponse) {}
312+
}
313+
314+
// ListAndWatch returns a stream of List of Devices
315+
// Whenever a Device state change or a Device disappears, ListAndWatch
316+
// returns the new list
317+
message DevicesStatusResponse {
318+
repeated Device devices = 1;
319+
}
320+
321+
message Device {
322+
... existing fields ...
323+
// The device itself. Required.
324+
string device_name = 3;
325+
... existing fields ...
326+
327+
// Health of the device, can be Healthy or Unhealthy.
328+
string Health = 5;
329+
}
330+
```
331+
332+
Implementation will ignore the `Unimplemented` error when the DRA plugin doesn't have this interface implemented.
333+
334+
Note, the gRPC details are still a subject to change and will go thru API review during the implementation.
249335

250336
### Test Plan
251337

@@ -296,9 +382,13 @@ Test coverage will be listed once tests are implemented.
296382
- Feature implemented in Device Manager behind a feature flag
297383
- Initial e2e tests completed and enabled
298384

385+
#### Alpha2
386+
387+
- Feature implemented in DRA behind a feature flag
388+
- e2e tests completed and enabled for DRA
389+
299390
#### Beta
300391

301-
- Feature is implemented in DRA behind the feature flag
302392
- Complete e2e tests coverage
303393

304394
#### GA

keps/sig-node/4680-add-resource-health-to-pod-status/kep.yaml

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -28,13 +28,13 @@ stage: alpha #|beta|stable
2828
# The most recent milestone for which work toward delivery of this KEP has been
2929
# done. This can be the current (upcoming) milestone, if it is being actively
3030
# worked on.
31-
latest-milestone: "v1.31"
31+
latest-milestone: "v1.32"
3232

3333
# The milestone at which this feature was, or is targeted to be, at each stage.
3434
milestone:
35-
alpha: "v1.31"
36-
beta: "v1.32"
37-
stable: "v1.34"
35+
alpha: "v1.31" # 1.32 will contain an alpha2 with more features
36+
beta: "v1.33"
37+
stable: "v1.35"
3838

3939
# The following PRR answers are required at alpha release
4040
# List the feature gate name and the components for which it must be enabled

0 commit comments

Comments
 (0)