Skip to content

Conversation

@Techassi
Copy link
Member

Part of stackabletech/secret-operator#634 and https://github.com/stackabletech/decisions/issues/62.

This PR adds a new CRD maintainer which reconciles the CRDs when the conversion webhook server certificate is rotated. It additionally provides a oneshot channel to indicate that the initial CRD rollout is done and default custom resources can be applied.

This is a crude first implementation of the CRD and default CR maintenance mechanism. This PR will be used as a test-balloon in stackabletech/secret-operator#634.

In the future, this mechanism should be implemented in a more generic way, similar to kube::Controller.

@Techassi Techassi self-assigned this Sep 23, 2025
@Techassi Techassi moved this to Development: In Progress in Stackable Engineering Sep 23, 2025
@Techassi Techassi changed the title feat: Add CRD maintainer feat!: Add CRD maintainer Sep 23, 2025
@Techassi Techassi marked this pull request as ready for review September 24, 2025 09:08
@Techassi Techassi marked this pull request as draft September 24, 2025 09:36
@Techassi Techassi force-pushed the feat/crd-maintenance branch from 6c305d8 to 20656a6 Compare September 25, 2025 10:12
@Techassi Techassi force-pushed the feat/crd-maintenance branch from 20656a6 to 2fc5c28 Compare September 25, 2025 10:14
@Techassi Techassi marked this pull request as ready for review September 25, 2025 10:18
@Techassi Techassi moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Sep 25, 2025
@sbernauer sbernauer moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Sep 30, 2025
Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any example code in secret-op I can look at?
At a first glance it looks like stackabletech/secret-operator#634 is not using it yet

@Techassi Techassi requested a review from sbernauer October 1, 2025 12:16
Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Techassi pushed his code, so we have a working example in secret-op.

What I don't like is that the webhook and the maintainer are not integrated into each other and we basically need to configure everything twice, which is annoying and error-prone:

pub async fn conversion_webhook(
    operator_environment: &OperatorEnvironmentOptions,
) -> anyhow::Result<(ConversionWebhookServer, mpsc::Receiver<Certificate>)> {
    let crds_and_handlers = [
        (
            SecretClass::merged_crd(SecretClassVersion::V1Alpha2)?,
            SecretClass::try_convert as fn(_) -> _,
        ),
        (
            TrustStore::merged_crd(TrustStoreVersion::V1Alpha1)?,
            TrustStore::try_convert as fn(_) -> _,
        ),
    ];

    let options = ConversionWebhookOptions {
        socket_addr: ConversionWebhookServer::DEFAULT_SOCKET_ADDRESS,
        namespace: operator_environment.operator_namespace.clone(),
        service_name: operator_environment.operator_service_name.clone(),
    };

    Ok(ConversionWebhookServer::new(crds_and_handlers, options).await?)
}

and further down

            let (maintainer, initial_reconcile_rx) = CustomResourceDefinitionMaintainer::new(
                client.as_kube_client(),
                certificate_rx,
                [
                    SecretClass::merged_crd(SecretClassVersion::V1Alpha2).unwrap(),
                    TrustStore::merged_crd(TrustStoreVersion::V1Alpha1).unwrap(),
                ],
                CustomResourceDefinitionMaintainerOptions {
                    operator_service_name: operator_environment.operator_service_name,
                    operator_namespace: operator_environment.operator_namespace,
                    field_manager: OPERATOR_NAME.to_owned(),
                    webhook_https_port: WebhookServer::DEFAULT_HTTPS_PORT,
                    disabled: maintenance.disable_crd_maintenance,
                },
            );

We duplicate:

  • The listen port
  • operator namespace
  • operator service name
  • The list of CRDS. What happens when I add a CRD to one but not hte other?
  • What stored version the CRDs should use. What happens if we use different onces?
  • We need to know that there is a channel of CA certs in between and need to wire it up

I'd say all of this can be gathered from the webhook.
I'm thinking of a API similar to this (one code place instead of two - keep main.rs clean!)

pub async fn conversion_webhook_and_crd_maintainer(
    client: Client,
    operator_environment: &OperatorEnvironmentOptions,
    maintenance: &MaintenanceOptions,
) -> anyhow::Result<(
    ConversionWebhookServer,
    CustomResourceDefinitionMaintainer,
    oneshot::Receiver<()>,
)> {
    let crds_and_handlers = [
        (
            SecretClass::merged_crd(SecretClassVersion::V1Alpha2)?,
            SecretClass::try_convert as fn(_) -> _,
        ),
        (
            TrustStore::merged_crd(TrustStoreVersion::V1Alpha1)?,
            TrustStore::try_convert as fn(_) -> _,
        ),
    ];

    let options = ConversionWebhookOptions {
        listen_addr: ConversionWebhookServer::DEFAULT_LISTEN_ADDRESS,
        listen_port: ConversionWebhookServer::DEFAULT_HTTPS_PORT,
        namespace: operator_environment.operator_namespace.clone(),
        service_name: operator_environment.operator_service_name.clone(),
    };

    let conversion_webhook = ConversionWebhookServer::new(crds_and_handlers, options).await?;
    let (crd_maintainer, initial_reconcile_rx) =
        CustomResourceDefinitionMaintainer::for_conversion_webhook(
            &conversion_webhook,
            client,
            OPERATOR_NAME.to_owned(),
            maintenance.disable_crd_maintenance,
        );

    Ok((conversion_webhook, crd_maintainer, initial_reconcile_rx))
}

We only specify everything once.

@Techassi Techassi force-pushed the feat/crd-maintenance branch from f5a1841 to 99eae00 Compare October 8, 2025 14:46
@Techassi
Copy link
Member Author

Techassi commented Oct 8, 2025

I had similar ideas to what is mentioned in #1099 (review). Commit 99eae00 implements a simplified API to construct a webhook server in combination with a CRD maintainer.

This follow-up commit uses the improved code: stackabletech/secret-operator@788dd08 (#634)

@Techassi Techassi requested a review from sbernauer October 8, 2025 14:49
Co-authored-by: Sebastian Bernauer <[email protected]>
@Techassi Techassi requested a review from sbernauer October 13, 2025 12:46
Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was it. Calling code in stackabletech/secret-operator#634 looks very nice and clean

Copy link
Member

@NickLarsenNZ NickLarsenNZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving on behalf of @sbernauer (given that the review comments are adequately resolved)

@Techassi Techassi enabled auto-merge October 14, 2025 10:19
@Techassi Techassi dismissed sbernauer’s stale review October 14, 2025 10:20

All requested changes have been implemented.

@Techassi Techassi added this pull request to the merge queue Oct 14, 2025
@Techassi Techassi moved this from Development: In Review to Development: Done in Stackable Engineering Oct 14, 2025
Merged via the queue into main with commit 5b7a4f6 Oct 14, 2025
8 checks passed
@Techassi Techassi deleted the feat/crd-maintenance branch October 14, 2025 10:25
@lfrancke lfrancke moved this from Development: Done to Done in Stackable Engineering Nov 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

4 participants