Commit 6bc66ed

Merge pull request #1845 from mfranczy/nfd-image-compatibility
NFD image compatibility proposal
2 parents 86d2809 + f435c1b commit 6bc66ed

1 file changed: +160 -0 lines
  • enhancements/1845-nfd-image-compatibility

@@ -0,0 +1,160 @@
# KEP-1845: Image Compatibility with NFD

## Summary

Currently, there is no standard way to describe a container image's requirements on hardware or the operating system.
Cloud-native technologies are being adopted by high-demand industries where container compatibility is critical for service performance and cluster preparation.
This proposal introduces the concept of NFD image compatibility metadata.
NFD features can be added to images via NodeFeatureRule CRs to specify requirements for a host or operating system.

The document has been prepared based on the experience and progress of the [OCI Image Compatibility working group](https://github.com/opencontainers/wg-image-compatibility/tree/main/docs/proposals).

## Motivation

Image compatibility metadata will help container image authors describe compatibility requirements in a standardized way.
This metadata will be uploaded with the image to the image registry.
As a result, container compatibility requirements will become discoverable and programmable, supporting various consumers and use cases where applications require a specific compatible environment.

### Goals

#### Phase 1

- Use existing NFD features via the NodeFeatureRule API to describe container image requirements.
- Create a new OCI artifact type for compatibility metadata.
- Allow verification of node compatibility, including nodes that are not yet part of the k8s cluster.
- Add or extend the sources with missing features.

#### Phase 2

Phase 2 is forward-looking and indicates the general direction.
After the completion of Phase 1, either this document should be updated or a new proposal should be created that considers the following points:

- Update or generate pods with appropriate node selectors via a mutating webhook or a scheduler plugin.

### Non-Goals

- Make image compatibility a hard requirement for the NFD installation/usage.
- Cover application ABI compatibility.

## Proposal

Build a new NFD client tool with the following initial scope:

- Create, read, update, and delete (CRUD) the OCI artifact.
- Validate nodes based on provided metadata.
- Run directly on a host that is not part of the Kubernetes cluster, or run as a Kubernetes job on a Kubernetes node.

### Design Details

#### OCI Artifact

[An OCI artifact](https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage) should be created to store image compatibility metadata on the image side.
The artifact can be connected with an image over [the subject field](https://github.com/opencontainers/distribution-spec/blob/11b8e3fba7d2d7329513d0cff53058243c334858/spec.md#pushing-manifests-with-subject).

##### Manifest

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.k8s.nfd.image-compatibility.v1",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.k8s.nfd.image-compatibility.spec.v1+yaml",
      "digest": "sha256:4a47f8ae4c713906618413cb9795824d09eeadf948729e213a1ba11a1e31d052",
      "size": 1710
    }
  ],
  "subject": {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "digest": "sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270",
    "size": 7682
  },
  "annotations": {
    "oci.opencontainers.image.created": "2024-03-27T08:08:08Z"
  }
}
```
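
As an illustration only, the manifest above could be assembled as in the following sketch. The helper names `descriptor` and `build_manifest` are hypothetical and not part of NFD; the media types and the annotation key are copied from the example manifest, and the empty config is assumed to be the two-byte JSON object `{}` whose digest appears in the `config` descriptor.

```python
import hashlib
import json

# Values copied from the example manifest in this proposal.
ARTIFACT_TYPE = "application/vnd.k8s.nfd.image-compatibility.v1"
SPEC_MEDIA_TYPE = "application/vnd.k8s.nfd.image-compatibility.spec.v1+yaml"
EMPTY_CONFIG = b"{}"  # assumed payload of the OCI empty descriptor


def descriptor(media_type: str, blob: bytes) -> dict:
    """Return an OCI content descriptor (mediaType, digest, size) for a blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def build_manifest(spec_yaml: bytes, subject: dict, created: str) -> dict:
    """Assemble a compatibility artifact manifest (hypothetical helper)."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": ARTIFACT_TYPE,
        "config": descriptor("application/vnd.oci.empty.v1+json", EMPTY_CONFIG),
        "layers": [descriptor(SPEC_MEDIA_TYPE, spec_yaml)],
        # The subject descriptor points at the image (or index) being described.
        "subject": subject,
        "annotations": {"oci.opencontainers.image.created": created},
    }


if __name__ == "__main__":
    subject = {
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "digest": "sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270",
        "size": 7682,
    }
    manifest = build_manifest(b"version: v1alpha1\n", subject, "2024-03-27T08:08:08Z")
    print(json.dumps(manifest, indent=2))
```

The sketch only builds the JSON document; pushing the blobs and the manifest to a registry is out of its scope.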

##### Artifact Payload (Schema)

- **version** - *string*
  This REQUIRED property specifies the version of the API being used.

- **compatibilities** - *array of object*
  This REQUIRED property is a list of compatibility sets.

  - **rules** - *object*
    This REQUIRED property is a reference to the spec of [NodeFeatureRule API](https://kubernetes-sigs.github.io/node-feature-discovery/v0.16/usage/custom-resources.html#nodefeaturerule).
    The spec makes it possible to describe image requirements using the discovered features from NFD sources.
    For further reading, please review [the documentation](https://kubernetes-sigs.github.io/node-feature-discovery/v0.16/usage/customization-guide.html#nodefeaturerule-custom-resource).

  - **weight** - *int*
    This OPTIONAL property specifies the [node affinity weight](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity-weight).

  - **tag** - *string*
    This OPTIONAL property allows grouping or dividing of compatibility sets.

  - **description** - *string*
    This OPTIONAL property is intended for a brief description of a compatibility set.

Example

```yaml
version: v1alpha1
compatibilities:
- tag: "preferred"
  weight: 10
  description: "Preferred node configuration"
  rules:
  - name: "kernel and cpu"
    matchFeatures:
    - feature: kernel.loadedmodule
      matchExpressions:
        vfio-pci: {op: Exists}
    - feature: cpu.model
      matchExpressions:
        vendor_id: {op: In, value: ["Intel", "Amd"]}
- tag: "fallback"
  weight: 1
  description: "Minimal required configuration"
  rules:
  - name: "cpu"
    matchFeatures:
    - feature: cpu.model
      matchExpressions:
        vendor_id: {op: In, value: ["Intel", "Amd"]}
```
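
A minimal sketch of how a client might check a parsed payload against the schema above. `validate_spec` is a hypothetical helper, not an NFD API; it assumes the YAML has already been loaded into a Python dict and, following the example, treats `rules` as a non-empty list. It checks only the presence and basic types of the listed properties, not the contents of each rule.

```python
def validate_spec(spec: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the payload looks valid."""
    errors = []
    if not isinstance(spec.get("version"), str):
        errors.append("version: REQUIRED string is missing")

    compatibilities = spec.get("compatibilities")
    if not isinstance(compatibilities, list):
        errors.append("compatibilities: REQUIRED array is missing")
        return errors

    for i, entry in enumerate(compatibilities):
        prefix = f"compatibilities[{i}]"
        if not isinstance(entry, dict):
            errors.append(f"{prefix}: must be an object")
            continue
        # Assumption: rules is a non-empty list, as in the example payload.
        rules = entry.get("rules")
        if not isinstance(rules, list) or not rules:
            errors.append(f"{prefix}.rules: REQUIRED property is missing or empty")
        # OPTIONAL properties are only type-checked when present.
        weight = entry.get("weight")
        if weight is not None and (isinstance(weight, bool) or not isinstance(weight, int)):
            errors.append(f"{prefix}.weight: must be an int")
        for key in ("tag", "description"):
            if key in entry and not isinstance(entry[key], str):
                errors.append(f"{prefix}.{key}: must be a string")
    return errors


if __name__ == "__main__":
    payload = {
        "version": "v1alpha1",
        "compatibilities": [
            {"tag": "fallback", "weight": 1, "rules": [{"name": "cpu"}]},
            {"tag": "broken", "weight": "high"},
        ],
    }
    for problem in validate_spec(payload):
        print(problem)
```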

##### Discovery

A compatibility artifact shall be associated with either an image index or a specific image via the subject field of the OCI Image Spec.
The Referrers API should be used to discover artifacts.
If an image has multiple artifacts, it is up to the client to choose the correct one.
By default, it is recommended to select the most recent artifact based on the 'created' timestamp.

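The default selection described above could look like the following sketch. It assumes the Referrers API response has already been fetched and parsed into a dict shaped like an OCI image index, whose `manifests` descriptors carry `artifactType` and `annotations`. The function name `select_latest` is hypothetical, and the annotation key is taken from the example manifest in this proposal.

```python
from datetime import datetime

ARTIFACT_TYPE = "application/vnd.k8s.nfd.image-compatibility.v1"
# Assumption: the 'created' key as written in the example manifest.
CREATED_KEY = "oci.opencontainers.image.created"


def parse_created(value: str) -> datetime:
    """Parse an RFC 3339 timestamp such as 2024-03-27T08:08:08Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_latest(referrers_index: dict, artifact_type: str = ARTIFACT_TYPE,
                  created_key: str = CREATED_KEY):
    """Return the newest matching descriptor, or None when there is none."""
    candidates = [
        d for d in referrers_index.get("manifests", [])
        if d.get("artifactType") == artifact_type
        and created_key in (d.get("annotations") or {})
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda d: parse_created(d["annotations"][created_key]))


if __name__ == "__main__":
    index = {
        "manifests": [
            {"digest": "sha256:aaa", "artifactType": ARTIFACT_TYPE,
             "annotations": {CREATED_KEY: "2024-03-27T08:08:08Z"}},
            {"digest": "sha256:bbb", "artifactType": ARTIFACT_TYPE,
             "annotations": {CREATED_KEY: "2024-05-01T00:00:00Z"}},
        ]
    }
    print(select_latest(index)["digest"])
```

Descriptors without a 'created' annotation are skipped in this sketch; a real client might instead fall back to another ordering.
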
#### NFD client

A new standalone command-line utility should be implemented for the NFD project that shares the same functionality as the [nfd kubectl plugin](https://nfd.sigs.k8s.io/usage/kubectl-plugin).
Both clients should implement the following commands:

- `validate` - validate a NodeFeatureRule object (implemented in kubectl plugin).
- `test` - test a NodeFeatureRule object against a node (implemented in kubectl plugin).
- `dryrun` - process a NodeFeatureRule file against a local NodeFeature file to dry run the rule against a node before applying it to a cluster (implemented in kubectl plugin).
- `compat` - compatibility command with the following subcommands:
  - `attach-spec` - create an artifact with the image compatibility specification and attach it to the image (initially, users have to create the spec by hand).
  - `remove-spec` - remove an artifact with the image compatibility specification from the image.
  - `validate-spec` - validate an artifact and its image compatibility specification.
  - `validate-node` - validate image compatibility against a node.

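To make the idea behind `validate-node` concrete, here is a simplified sketch of matching compatibility sets against a node's features. It is not the NFD implementation: the function names are hypothetical, the node's features are modelled as a plain dict of feature name to elements, only the `Exists` and `In` operators from the example payload are handled, and a set is assumed to be compatible only when all of its rules match.

```python
def match_expression(elements: dict, name: str, expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a feature's elements."""
    op = expr["op"]
    if op == "Exists":
        return name in elements
    if op == "In":
        return name in elements and elements[name] in expr.get("value", [])
    raise ValueError(f"unsupported operator in this sketch: {op}")


def rule_matches(rule: dict, features: dict) -> bool:
    """A rule matches when every matchFeatures term and expression matches."""
    for term in rule.get("matchFeatures", []):
        elements = features.get(term["feature"], {})
        for name, expr in term.get("matchExpressions", {}).items():
            if not match_expression(elements, name, expr):
                return False
    return True


def compatible_sets(spec: dict, features: dict) -> list[str]:
    """Return tags of matching compatibility sets, highest weight first."""
    matched = [
        c for c in spec["compatibilities"]
        if all(rule_matches(r, features) for r in c["rules"])
    ]
    matched.sort(key=lambda c: c.get("weight", 0), reverse=True)
    return [c.get("tag", "") for c in matched]


CPU_TERM = {"feature": "cpu.model",
            "matchExpressions": {"vendor_id": {"op": "In", "value": ["Intel", "Amd"]}}}
SPEC = {
    "version": "v1alpha1",
    "compatibilities": [
        {"tag": "preferred", "weight": 10, "rules": [{
            "name": "kernel and cpu",
            "matchFeatures": [
                {"feature": "kernel.loadedmodule",
                 "matchExpressions": {"vfio-pci": {"op": "Exists"}}},
                CPU_TERM,
            ]}]},
        {"tag": "fallback", "weight": 1,
         "rules": [{"name": "cpu", "matchFeatures": [CPU_TERM]}]},
    ],
}

if __name__ == "__main__":
    node = {"cpu.model": {"vendor_id": "Intel"}}  # no vfio-pci module loaded
    print(compatible_sets(SPEC, node))
```

Sorting by `weight` mirrors the node affinity semantics referenced in the schema, so a caller can treat the first returned tag as the best match.
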
### Test Plan

To ensure the proper functioning of the nfd client, the following test plan should be executed:

- **Unit Tests:** Write unit tests for the client.
- **Manual e2e Tests:** Run the nfd client with sample data to create, read, update, and delete the artifact and to validate a local host.
