Add ARO HCP proposal for CAPZ support #6134

Open
marek-veber wants to merge 2 commits into kubernetes-sigs:main from marek-veber:aro-proposal
@marek-veber

Adds a comprehensive proposal document for implementing Azure Red Hat OpenShift Hosted Control Plane (ARO HCP) cluster provisioning using Cluster API Provider Azure (CAPZ).

The proposal covers:

  • Architecture and design for ARO HCP integration with CAPZ
  • ASO resource definitions and CAPI contract implementation
  • KeyVault and encryption key management
  • Security considerations and alternatives

This document serves as the technical foundation for ARO HCP CAPZ development and guides customers deploying ARO HCP clusters via CAPZ.

What type of PR is this?

/kind design

What this PR does / why we need it:
We want to design how to provision Azure Red Hat OpenShift Hosted Control Plane (ARO HCP) clusters using CAPZ.

@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the kind/design and do-not-merge/release-note-label-needed labels on Mar 3, 2026
@k8s-ci-robot
Contributor

Hi @marek-veber. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot added the size/XXL, needs-ok-to-test, and cncf-cla: yes labels on Mar 3, 2026
Adds comprehensive proposal document for implementing Azure Red Hat
OpenShift Hosted Control Plane (ARO HCP) cluster provisioning using
Cluster API Provider Azure (CAPZ).

The proposal covers:
- Architecture and design for ARO HCP integration with CAPZ
- ASO resource definitions and CAPI contract implementation
- KeyVault and encryption key management
- Security considerations and alternatives

This document serves as the technical foundation for ARO HCP CAPZ
development and provides guidance for customers deploying
ARO HCP clusters via CAPZ.
@willie-yao
Contributor

/ok-to-test
/assign

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Mar 10, 2026
@codecov

codecov bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.43%. Comparing base (530a8ab) to head (3949aeb).
⚠️ Report is 20 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6134   +/-   ##
=======================================
  Coverage   44.43%   44.43%           
=======================================
  Files         280      280           
  Lines       25367    25367           
=======================================
  Hits        11272    11272           
  Misses      13283    13283           
  Partials      812      812           

@willie-yao willie-yao left a comment

Thanks for your hard work on this proposal and your patience @marek-veber! I think the amount of detail here is great and gives us a good understanding of ARO HCP integration in CAPZ. I do have a few questions and concerns about the proposal that came up previously in our team discussions:

  1. Relationship to AzureASOManagedControlPlane: I like that the proposal uses an ASO-native architecture like AzureASOManagedControlPlane. However, it looks like these CRDs carry extra fields on top of the ASO resources. Are those fields required to make this work, meaning the existing AzureASOManagedControlPlane can't be used directly? If so, can you add an "Alternatives Considered" section explaining why? Understanding what's ARO-specific enough to require separate CRDs would help us evaluate the tradeoff of maintaining a parallel set of types.
  2. Custom Azure SDK usage: The keyvaults service using direct Azure SDK calls is the main area where this diverges from our target architecture. Is there an ASO issue tracking key version retrieval support? If so, could the proposal include a plan to remove this service once ASO covers it?
  3. Maintenance ownership: For new controller logic like this, we need an explicit statement about who owns long-term maintenance. Could you add a "Maintenance and Ownership" section clarifying that the ARO team will maintain these controllers and that the CAPZ core team is not expected to own ARO-specific logic?


### KeyVault Service and Encryption Key Management

While most resources are managed via ASO, the **keyvaults service** handles encryption key management when `identityRef` is set:

The direct Azure SDK usage for key management is the one area where this diverges from the ASO-native pattern. I understand ASO doesn't currently have a CRD for Key Vault keys (only the Vault itself), so this seems necessary for the auto-creation path. Could you note in the proposal that this is a gap in ASO's coverage and link to an upstream ASO issue (or file one) requesting key management CRDs? That way we have a clear path to removing this SDK dependency if/when ASO adds support.

- **Dependency Timing**: Proper condition checking prevents premature resource creation
- **Resource Cleanup**: Owner references ensure cascading deletion of related resources

## Graduation Criteria

Could we add criteria around CAPZ architectural alignment here? Something like:

  • Maintenance ownership documented and agreed upon
  • Plan to remove keyvaults SDK service when ASO supports key version retrieval
  • Acknowledgment that these APIs follow the deprecated-path pattern and CAPZ's long-term direction is toward pure ASO-native controllers

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from willie-yao. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Updates based on willie-yao's review comments:

- Update metadata: add reviewers, fix dates, change status to implementable
- Fix API versions: v1beta2 → v1beta1 throughout examples
- Add "Alternatives Considered" section explaining why separate CRDs are needed
- Add "Maintenance and Ownership" section clarifying ARO team ownership
- Document ASO migration plan with dependency chain
- Add references to ARO-HCP repository and API specifications
- Note API versions: 2024-06-10-preview (private) vs 2025-12-23-preview (public)
- Add ASO issue reference for key management gap
- Update graduation criteria with ownership and migration commitments

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>