|
| 1 | +--- |
| 2 | +title: Enabling Multiarch Tuning Operator on non-OKD K8S/CRI-O clusters |
| 3 | +authors: |
| 4 | + - "@lwan-wanglin" |
| 5 | +reviewers: |
| 6 | + - "@aleskandro" |
| 7 | + - "@AnnaZivkovic" |
| 8 | + - "@jeffdyoung" |
| 9 | + - "@Prashanth684" |
| 10 | +approvers: |
| 11 | + - "@aleskandro" |
| 12 | + - "@Prashanth684" |
| 13 | +creation-date: 2025-06-13 |
| 14 | +last-updated: 2025-07-02 |
| 15 | +tracking-link: |
| 16 | + - https://issues.redhat.com/browse/MULTIARCH-5324 |
| 17 | +see-also: |
| 18 | + - "/docs/enhancements/MTO-0001.md" |
| 19 | +--- |
| 20 | + |
| 21 | +# Supporting Multiarch Tuning Operator on non-OCP clusters |
| 22 | + |
| 23 | +## Summary |
| 24 | +Currently, several versions of the Multiarch Tuning Operator (MTO) have been successfully released on |
| 25 | +the OpenShift Container Platform (OCP). To expand its applicability beyond OCP, we plan to make the operator compatible |
| 26 | +with upstream Kubernetes clusters. |
| 27 | +Since the operator already supports deployment on CRI-O-based platforms like OCP, enabling it to run on |
| 28 | +non-OCP Kubernetes clusters that use CRI-O would be a logical and valuable first step. |
| 29 | + |
| 30 | +## OpenShift-Specific Dependencies |
| 31 | +To run the Multiarch Tuning Operator on other Kubernetes clusters, several OpenShift-specific dependencies need to be addressed: |
| 32 | +(a) `Global Pull Secret` – used to authenticate against image registries. In OpenShift, the global pull secret |
| 33 | +is stored as a `secret` named `pull-secret` in the `openshift-config` namespace and is configured |
| 34 | +by the cluster administrator. The Multiarch Tuning Operator watches this secret and uses it to authenticate |
| 35 | +when inspecting registry images. |
| 36 | +(b) `Global CA Bundle` – used to validate the TLS certificates of image registries during secure connections. |
| 37 | +In OpenShift, resources labeled with `config.openshift.io/inject-trusted-cabundle: "true"` are automatically injected with the cluster’s trusted CA bundle. |
| 38 | +The Multiarch Tuning Operator uses a `ConfigMap` named `trusted-ca`, which is labeled accordingly to receive the injected bundle. |
| 39 | +(c) TLS Certificates – used to secure communication for the Multiarch Tuning Operator. In OpenShift, resources labeled |
| 40 | +with `service.beta.openshift.io/inject-cabundle: "true"` are automatically injected with the cluster’s CA bundle, |
| 41 | +enabling clients to verify TLS certificates issued within the cluster. |
| 42 | +These resources are not automatically available in non-OCP clusters, and we should not expect users to manually create |
| 43 | +OpenShift-specific resources before installing the operator. |
| 44 | +In [MULTIARCH-5324](https://issues.redhat.com/browse/MULTIARCH-5324), we propose the following changes, including two new fields in the `ClusterPodPlacementConfig` CRD: |
| 45 | +(a) `GlobalPullSecretRef` – Allows users to reference their own pull secret, enabling the controller to inspect container images. |
| 46 | +(b) `CABundleConfigmapRef` – Allows users to reference a ConfigMap containing the CA bundle to verify registry TLS certificates. |
| 47 | +If not set, the operator will attempt to read CA certificates from cluster nodes and generate a default bundle automatically. |
| 48 | +(c) TLS Certificate Management – Automatically create and inject TLS certificates for the operator services, |
| 49 | +controller, and Webhook Configuration in non-OCP clusters. |
| 50 | +(d) Test Adjustments – Skip e2e test cases that rely on OpenShift-specific resources (e.g., `Build`, `DeploymentConfig`, or |
| 51 | +`image.config.openshift.io`) when running in non-OCP clusters. |
| 52 | + |
| 53 | +## Motivation |
| 54 | +- Supporting Kubernetes outside of OCP will make the Multiarch Tuning Operator more accessible and usable in diverse environments. |
| 55 | + |
| 56 | +### User Stories |
| 57 | +- As a cluster administrator, I want to install and use the Multiarch Tuning Operator on a standard Kubernetes cluster without requiring OpenShift-specific dependencies. |
| 58 | + |
| 59 | +### Goals |
| 60 | +- Provide a way for users to install and use the Multiarch Tuning Operator on their non-OCP Kubernetes clusters. |
| 61 | + |
| 62 | +### Non-Goals |
| 63 | +- Automatically creating the global pull secret on non-OCP Kubernetes clusters; the user must provide it. |
| 64 | +- Guaranteeing that the default CA bundle, which the operator attempts to generate when `CABundleConfigmapRef` is not set, can verify all registry TLS certificates. |
| 65 | + |
| 66 | +## Proposal |
| 67 | +We aim to make the operator compatible with standard Kubernetes clusters. |
| 68 | + |
| 69 | +Development is expected to proceed in the following phases: |
| 70 | +- Phase 1: Support for CRI-O Runtime-Based Kubernetes Clusters |
| 71 | + - Automatically generate TLS certificates for the required services and webhook configurations. We may consider depending on cert-manager for non-OCP deployments. |
| 72 | + - Provide a mechanism for users to reference their own global pull secrets. |
| 73 | + - Provide a mechanism for users to reference their own CA bundles. |
| 74 | + - Skip e2e test cases related to OpenShift-specific resources such as `Build`, `DeploymentConfig`, and `image.config.openshift.io` on non-OCP Kubernetes clusters. |
| 75 | +- Phase 2: Adding Containerd and Docker Support |
| 76 | + - Extend support beyond `CRI-O` to other container runtimes, e.g., `containerd` and `Docker`. |
| 77 | + - Implement additional validation and compatibility checks. |
| 78 | + - Ensure consistent behavior across different container runtimes. |
| 79 | + |
| 80 | +### Phase 1 |
| 81 | + |
| 82 | +#### Global pull secret for inspecting the images |
| 83 | +Regarding pull secret handling, we propose adding a `GlobalPullSecretRef` field in the `ClusterPodPlacementConfig` CRD for non-OCP clusters. |
| 84 | +The operator will read this reference and use the specified pull secret globally. |
| 85 | +For non-OCP clusters, users must create the pull secret before installing the operator. |
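
The proposal does not state which Secret type is expected. Assuming the referenced Secret is of type `kubernetes.io/dockerconfigjson`, as the OpenShift global pull secret is, looking up registry credentials from its payload might look like the sketch below. The names `dockerConfig` and `credentialsFor` are hypothetical and are not part of the operator's code.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// dockerConfig mirrors the layout of a .dockerconfigjson payload.
// Only the fields needed for this illustration are modelled.
type dockerConfig struct {
	Auths map[string]struct {
		Auth string `json:"auth"`
	} `json:"auths"`
}

// credentialsFor is a hypothetical helper that returns the username and
// password stored for the given registry in a .dockerconfigjson payload.
// It assumes the "auth" entry holds base64("username:password").
func credentialsFor(payload []byte, registry string) (string, string, error) {
	var cfg dockerConfig
	if err := json.Unmarshal(payload, &cfg); err != nil {
		return "", "", fmt.Errorf("parsing pull secret: %w", err)
	}
	entry, ok := cfg.Auths[registry]
	if !ok {
		return "", "", fmt.Errorf("no credentials for registry %q", registry)
	}
	raw, err := base64.StdEncoding.DecodeString(entry.Auth)
	if err != nil {
		return "", "", fmt.Errorf("decoding auth for %q: %w", registry, err)
	}
	user, pass, found := strings.Cut(string(raw), ":")
	if !found {
		return "", "", fmt.Errorf("malformed auth entry for %q", registry)
	}
	return user, pass, nil
}
```

In a real controller the payload would come from the `.dockerconfigjson` key of the Secret named by `GlobalPullSecretRef`; that wiring is omitted here.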
| 86 | + |
| 87 | +#### CA bundle to verify registry certificates |
| 88 | +Regarding CA Bundle handling, we propose adding a `CABundleConfigmapRef` field in the ClusterPodPlacementConfig CRD. |
| 89 | +This allows non-OCP users to reference a ConfigMap that contains the CA bundle used to verify registry TLS certificates. |
| 90 | +Users need to create this ConfigMap in the namespace where the operator will be installed. |
| 91 | +If this field is not specified, the operator will attempt to retrieve the CA bundle from the cluster nodes and |
| 92 | +automatically generate a default ConfigMap. |
| 93 | + |
| 94 | +#### TLS Certificate for the operator, the controllers and the webhook configuration |
| 95 | +For non-OCP clusters, we will implement a mechanism to generate and manage certificates for the webhook configurations |
| 96 | +and for the following services: |
| 97 | + - `multiarch-tuning-operator-controller-manager-service-cert` |
| 98 | + - `pod-placement-controller` |
| 99 | + - `pod-placement-web-hook` |
| 100 | +We are currently evaluating two options: |
| 101 | +- Option 1: Follow the official Kubernetes documentation [Manage TLS Certificates in a Cluster](https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster/) to implement certificate generation and rotation using native Kubernetes mechanisms. |
| 102 | +- Option 2 (preferred): Rely on an external `cert-manager` operator to automate the issuance and renewal of certificates. We prefer this option, as it simplifies certificate lifecycle management and is widely adopted in Kubernetes environments. |
| 103 | + |
| 104 | + |
| 105 | +#### Add Validation for Newly Added Fields |
| 106 | +For OpenShift clusters, users should not be allowed to configure the two newly added fields. Validation logic will be added in [clusterpodplacementconfig_webhook.go](https://github.com/openshift/multiarch-tuning-operator/blob/main/apis/multiarch/v1beta1/clusterpodplacementconfig_webhook.go) to restrict users from setting or modifying these fields when the operator is running on an OpenShift cluster. |
| 107 | +For non-OpenShift clusters, specifying a reference to the global pull secret in the `ClusterPodPlacementConfig` custom resource will be mandatory. This requirement will also be validated through logic implemented in the same webhook. |
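
The two validation rules above could be expressed roughly as follows. This is a minimal Go sketch, not the operator's webhook code: `objectRef`, `specRefs`, and `validateRefs` are hypothetical names, the reference type is a simplified stand-in for the `corev1` types, and how the operator detects that it runs on OpenShift is assumed to be decided elsewhere.

```go
package main

import "errors"

// objectRef is a simplified, hypothetical stand-in for the corev1
// reference types used by the proposed CRD fields.
type objectRef struct {
	Name      string
	Namespace string
}

// specRefs holds only the two proposed fields of the spec.
type specRefs struct {
	CABundleConfigmapRef *objectRef
	GlobalPullSecretRef  *objectRef
}

// validateRefs applies the rules described above.
// isOpenShift is assumed to be computed by the caller; the proposal
// does not specify how the platform is detected.
func validateRefs(isOpenShift bool, s specRefs) error {
	if isOpenShift {
		// On OpenShift neither field may be set.
		if s.CABundleConfigmapRef != nil || s.GlobalPullSecretRef != nil {
			return errors.New("caBundleConfigmapRef and globalPullSecretRef must not be set on OpenShift clusters")
		}
		return nil
	}
	// On other clusters the pull secret reference is mandatory,
	// while the CA bundle reference stays optional.
	if s.GlobalPullSecretRef == nil || s.GlobalPullSecretRef.Name == "" {
		return errors.New("globalPullSecretRef is required on non-OpenShift clusters")
	}
	return nil
}
```

A validating webhook would call a function like this from both its create and update handlers, so that the same rules apply to new and modified resources.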
| 108 | + |
| 109 | +#### ClusterPodPlacementConfig CR |
| 110 | + |
| 111 | +```go |
| 112 | +type ClusterPodPlacementConfigSpec struct { |
| 113 | +	// CABundleConfigmapRef is a reference to the ConfigMap containing the CA bundle |
| 114 | +	// used to verify registry TLS certificates. |
| 115 | +	// +optional |
| 116 | +	CABundleConfigmapRef *corev1.LocalObjectReference `json:"caBundleConfigmapRef,omitempty"` |
| 117 | +	// GlobalPullSecretRef is a reference to the Secret used for authenticating image registry pulls. |
| 118 | +	// +optional |
| 119 | +	GlobalPullSecretRef *corev1.SecretReference `json:"globalPullSecretRef,omitempty"` |
| 120 | +} |
| 121 | +``` |
| 122 | + |
| 123 | +```yaml |
| 124 | +apiVersion: multiarch.openshift.io/v1beta1 |
| 125 | +kind: ClusterPodPlacementConfig |
| 126 | +metadata: |
| 127 | +  name: cluster |
| 128 | +spec: |
| 129 | +... |
| 130 | +  caBundleConfigmapRef: |
| 131 | +    name: trusted-ca |
| 132 | +  globalPullSecretRef: |
| 133 | +    name: pull-secret |
| 134 | +    namespace: pull-secret-ns |
| 135 | +... |
| 136 | +``` |
| 137 | +### Implementation Details/Notes/Constraints |
| 138 | + |
| 139 | +### Risks and Mitigations |
| 140 | + |
| 141 | +### Drawbacks |
| 142 | + |
| 143 | +### Open Questions |
| 144 | + |
| 145 | +### Test Plan |
| 146 | + |
| 147 | +#### Unit Testing and Integration Test Suites |
| 148 | + |
| 149 | +- Unit Testing: Test each new function, method, and feature in isolation to ensure correctness, reliability, and |
| 150 | + robustness. Verify that the new code paths are covered by the unit tests and that the code behaves as expected |
| 151 | + under different conditions. |
| 152 | +- Integration Test Suite: Run integration tests against a simulated control plane using the operator SDK's envtest |
| 153 | + facilities. We will add the necessary test cases to ensure that |
| 154 | + the newly added fields `GlobalPullSecretRef` and `CABundleConfigmapRef` |
| 155 | + work as expected. |
| 156 | + |
| 157 | +#### Functional Test Suite |
| 158 | + |
| 159 | +- The operator should reject configuration of `GlobalPullSecretRef` and `CABundleConfigmapRef` on OpenShift clusters. |
| 160 | +- The operator should not introduce any regressions on OpenShift clusters, and all e2e tests should pass. |
| 161 | +- The operator should allow setting `GlobalPullSecretRef` on standard Kubernetes clusters and use the referenced secret to inspect container images. |
| 162 | +- The operator should allow setting `CABundleConfigmapRef` on standard Kubernetes clusters and use the referenced CA bundle to verify TLS connections to registries. |
| 163 | +- The operator should reject the creation or update of a `ClusterPodPlacementConfig` that does not include `GlobalPullSecretRef` when running on a non-OpenShift cluster. |
| 164 | +- If the `CABundleConfigmapRef` field is not set on non-OpenShift clusters, the operator should automatically create a CA bundle ConfigMap using certificates read from cluster nodes. |
| 165 | +- Skip OpenShift-specific e2e test cases on non-OpenShift clusters to ensure the test suite passes. |
| 166 | + |
| 167 | +### Graduation Criteria |
| 168 | + |
| 169 | +### Upgrade / Downgrade Strategy |
| 170 | +- No special upgrade/downgrade strategy is required for this enhancement. The operator will be updated to support |
| 171 | +Kubernetes clusters that use the CRI-O runtime. |
| 172 | + |
| 173 | +### Version Skew Strategy |
| 174 | + |
| 175 | +### Operational Aspects of API Extensions |
| 176 | + |
| 177 | +#### Failure Modes |
| 178 | +- Webhook failure - The `ClusterPodPlacementConfig` validating webhook is |
| 179 | + configured with `failurePolicy: Fail`. If the webhook is unavailable or validation fails, the creation or update of a `ClusterPodPlacementConfig` resource will be blocked. |
| 180 | + |
| 181 | +## Documentation Plan |
| 182 | + |
| 183 | +Provide a detailed installation guide for Kubernetes users. |
| 184 | +Document how to manually configure global pull secrets and CA bundles, if needed. |
| 185 | + |
| 186 | +## Implementation History |
| 187 | + |
| 188 | +## Alternatives |
| 189 | + |
| 190 | +## Infrastructure Needed |
| 191 | + |
| 192 | +## Open Questions |