Conversation

bharath-b-rh
Contributor

PR has following changes:

  • Revisits the istiocsrs.operator.openshift.io API
  • Apart from API has major changes in below sections
    • Implementation Details/Notes/Constraints
    • Risks and Mitigations
    • Operational Aspects of API Extensions
    • Alternatives (Not Implemented)

@openshift-ci-robot

openshift-ci-robot commented Sep 2, 2025

@bharath-b-rh: This pull request references CM-706 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 2, 2025
@bharath-b-rh
Contributor Author

cc @TrilokGeer @mytreya-rh

@openshift-ci openshift-ci bot requested review from dgoodwin and jan--f September 2, 2025 12:08
Contributor

openshift-ci bot commented Sep 2, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign prashanth684 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member

@chiragkyal chiragkyal left a comment

First pass: I think we need to relook at the validations we are willing to add.

Comment on lines +265 to +257
// +kubebuilder:validation:MinLength:=0
// +kubebuilder:validation:MaxLength:=4096
Member

Is there any restriction in terms of the length of the label selector? https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

Contributor Author

@bharath-b-rh bharath-b-rh Sep 5, 2025

Member

I referred to the linked doc; however, it's still not clear to me how we arrive at 4096 as the max length here.

The validation says:

  • Key: Prefix (253) + "/" (1) + Name (63) = 317
  • Equality-based operator: 2
  • Value: 63

So a single key/value pair would have a max length of 317 + 2 + 63 = 382.

Per the following section

An upper limit of 4096 is fixed allowing to configure up to 10 namespaces considering the limitations on the label selectors and when equality operators are used.

We are limiting this to 10 namespaces, so it comes out as: 382*10 + 9 (commas) = 3829
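
For reference, the arithmetic above can be sketched in Go. This is only a check of the numbers quoted in this comment; the constant and function names are illustrative and are not part of the PR or of any Kubernetes API.

```go
package main

import "fmt"

// Per-part maximums, as quoted in the comment above (illustrative names).
const (
	maxPrefixLen   = 253 // optional prefix of a label key
	maxNameLen     = 63  // name segment of a label key
	maxOperatorLen = 2   // equality-based operator, e.g. "==" or "!="
	maxValueLen    = 63  // label value
)

// maxRequirementLen returns the longest possible "prefix/name==value" string.
func maxRequirementLen() int {
	keyLen := maxPrefixLen + 1 + maxNameLen // +1 for the "/" separator
	return keyLen + maxOperatorLen + maxValueLen
}

// maxSelectorLen returns the longest selector made of n comma-separated
// requirements, each at its maximum length.
func maxSelectorLen(n int) int {
	if n <= 0 {
		return 0
	}
	return n*maxRequirementLen() + (n - 1) // n-1 commas between requirements
}

func main() {
	fmt.Println("one requirement:", maxRequirementLen())
	fmt.Println("ten requirements:", maxSelectorLen(10))
}
```

Under these assumptions the sketch reproduces the 382 and 3829 figures used in this thread.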

Contributor Author

Yeah, the calculations are correct, and we are setting the max limit to 4096. The calculations use the max length of each part of the selector, so the number of selectors could be 10 or more, but this field value can hold at most 4096 chars.

Member

The calculations are for the maximum limits of each part in the selector

Yeah, in the above calculation I took the maximum limit for all the parts, and it turns out that 10 namespaces can be accommodated within a string of length 3829. Wondering why we need the extra space (4096 - 3829 = 267)?
Let me know if my understanding is wrong here.

Contributor Author

The upper limit for the field is 4096 characters, not a number of selectors. As suggested, up to 10 selectors can be added within that limit when all selector parts are of the allowed max length. The delta shown in your calculation can be larger if not all selectors are of max length.

Member

@chiragkyal chiragkyal Sep 18, 2025

As suggested, up to 10 selectors can be added within that limit when all selector parts are of the allowed max length.

Maybe I am missing something here, but to my understanding, it seems like even if a user uses maxlength for all 10 selectors, they will still be left with the delta shown above, which feels like extra space that can accommodate 10(MaxLength)+x selectors.
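
As a rough check on the question above, the following Go sketch computes how many comma-separated requirements of a given length fit in a field of a given size. The 382 and 4096 figures come from this thread; the function name is hypothetical and not part of the PR.

```go
package main

import "fmt"

// maxRequirementsThatFit returns the largest n such that n requirements of
// reqLen characters, joined by n-1 commas, fit within limit characters:
//   n*reqLen + (n-1) <= limit  =>  n <= (limit+1)/(reqLen+1)
func maxRequirementsThatFit(limit, reqLen int) int {
	if limit < 0 || reqLen <= 0 {
		return 0
	}
	return (limit + 1) / (reqLen + 1)
}

func main() {
	const fieldLimit = 4096 // MaxLength from the kubebuilder marker
	const reqLen = 382      // max length of one requirement, per this thread

	n := maxRequirementsThatFit(fieldLimit, reqLen)
	used := n*reqLen + (n - 1)
	spare := fieldLimit - used

	fmt.Printf("fits=%d used=%d spare=%d\n", n, used, spare)
	// An additional max-length requirement needs reqLen+1 characters
	// (including its leading comma).
	fmt.Println("room for one more max-length requirement:", spare >= reqLen+1)
}
```

With these inputs, the spare 267 characters are fewer than the 383 an eleventh max-length requirement (plus comma) would need, so more than 10 selectors fit only when some of them are shorter than the maximum.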

@bharath-b-rh bharath-b-rh force-pushed the cm-706 branch 2 times, most recently from 6cc8bd6 to 32329dc Compare September 5, 2025 10:26

#### Operator uninstallation or `istiocsrs.operator.openshift.io` instance deletion.

Operator will remove all the resources created for installing `istio-csr` agent when `spec.cleanupOnDeletion` is enabled.
Contributor

Which CR status provides information about the deletion action?

Contributor Author

Could you please elaborate on what the expectation is? The field determines the action taken when the CR is marked for deletion. We can't make use of the status subresource for this, but we can emit an event indicating whether or not the resources were deleted.

is active and should be able to update the certificate endpoint to `istio-csr` agent endpoint.
- As an OpenShift user, I want to have an option to dynamically enable monitoring for the `istio-csr` project and
to use the OpenShift monitoring solution when required.
- As an OpenShift user, I want to limit istio-csr functionality to specific namespaces for better security and control.
Contributor

Does the following list of use cases help to elaborate better?

  • As an OpenShift administrator, I want to have an option to deploy istio-csr agent, so that it can be enabled as a day2 operation.
  • As an OpenShift administrator, I want to be able to configure istio-csr agent, so that only required
    features can be enabled.
  • As an OpenShift administrator, I should be able to uninstall istio-csr agent when not required as a day2 operation without disrupting the cert-manager installation.
  • As an OpenShift administrator, I should be able to choose to keep secrets, and the related controller should clean up all resources created for the istio-csr agent deployment.
  • As an OpenShift security engineer, I want to be able to identify all artefacts created by istio-csr agent, for better auditability.
  • As an OpenShift SRE, I should be able to get detailed information as part of different status conditions and messages to identify the reasons for failures and carry out corrective actions successfully.
  • As an OpenShift service mesh administrator, I should be able to use istio-csr endpoint to automate certificate requests via istiod on the pre-installed service mesh clusters.
  • As an OpenShift SRE, I should be able to collect metrics for istio-csr for monitoring.
  • As an OpenShift administrator, I should be able to configure rbac permissions for configmap resources to be applicable to only select namespaces.
  • As an OpenShift security engineer, I want to restrict istio-csr operation to only member namespaces of the Service Mesh, so that tenants outside the mesh cannot request certs.
  • As an OpenShift administrator, I want to configure istio cluster id that can be verified against the csr to avoid misconfiguration or infer any default value

Non goals

  • As an OpenShift security engineer, I want an automatic deletion of istio-ca-root-cert configmap from a selected namespace.
  • As an OpenShift administrator, I want the namespaces chosen for istio-ca-root-cert configmap injection to be validated against service mesh configuration to avoid drift in the namespace selection.

Contributor Author

@TrilokGeer Could you please help me understand the user stories below?

As an OpenShift administrator, I should be able to configure rbac permissions for configmap resources to be applicable to only select namespaces.

IIUC, is this proposal for RBAC hardening, i.e. dynamically updating the namespaces in the istio-csr specific ClusterRole with those matching the selector configured by the user for IstioDataPlaneNamespaceSelector? Does it mean the operator will need a watch on the Namespace resource, check whether the labels exist on a particular namespace, and update the ClusterRole, and then update the ClusterRole again when the label is removed or the namespace is deleted? So this would mean having a new controller to support this functionality, correct?

As an OpenShift security engineer, I want to restrict istio-csr operation to only member namespaces of the Service Mesh, so that tenants outside the mesh cannot request certs.

I think this is functionality of Istio itself, because the proxies will send the CertificateRequests to istiod, and istiod will act as a proxy here for certificate requests and forward them to istio-csr as gRPC requests.
Ref: https://github.com/istio/istio/blob/master/architecture/security/istio-agent.md

@bharath-b-rh
Contributor Author

/label tide/merge-method-squash

@openshift-ci openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Sep 18, 2025
@TrilokGeer
Contributor

TrilokGeer commented Sep 25, 2025

A few of the user stories may not be relevant to handle as part of the GA. It's suggested to add them to risks/constraints or non-goals and possibly consider them for future iterations.

Contributor

openshift-ci bot commented Sep 26, 2025

@bharath-b-rh: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bharath-b-rh
Contributor Author

/label tide/merge-method-squash

@mytreya-rh

mytreya-rh commented Sep 26, 2025

/lgtm

cc: @TrilokGeer Thanks a lot for the in-call review yesterday. Summarizing the changes for your feedback.
The new API changes are captured.

Following use cases were moved to Non-Goals:

  1. Resource cleanup including configmap cleanup after the namespace goes out of selection
  2. RBAC for agent to restrict configmap creation in select namespaces

Risks & Mitigation chapter was updated:

  1. Steps to identify ConfigMaps created by istio-csr that are no longer part of the namespace selector
  2. About protecting the secret that holds the istiod certificate key pair.

@bharath-b-rh kindly add anything that I missed

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 26, 2025
Labels

  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.

5 participants