@nrb nrb commented Oct 29, 2024

This PR reverts the exclusion of ASO CRDs.

TODO:

@openshift-ci-robot

openshift-ci-robot commented Oct 29, 2024

@nrb: This pull request references OCPCLOUD-2642 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to this:

This PR reverts the exclusion of ASO CRDs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


openshift-ci bot commented Oct 29, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 29, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 29, 2024
@openshift-ci-robot

openshift-ci-robot commented Oct 29, 2024

@nrb: This pull request references OCPCLOUD-2642 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to this:

This PR reverts the exclusion of ASO CRDs.

TODO:

  • Include ASO CRDs in kustomization
  • Include ASO CRDs in component ConfigMap

Member

@damdo damdo left a comment

Looks reasonable up until now 👍

@openshift-ci-robot

openshift-ci-robot commented Oct 29, 2024

@nrb: This pull request references OCPCLOUD-2642 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to this:

This PR reverts the exclusion of ASO CRDs.

TODO:

  • Include ASO CRDs in kustomization
  • Include ASO CRDs in component ConfigMap
  • Disable CRD management in ASO Deployment.


@nrb
Author

nrb commented Oct 29, 2024

@damdo @RadekManak Since the diff's too big for GitHub to render, how do we go about editing the arguments on the ASO Deployment? Use a Kustomize file to layer over the defaults?

Currently, it is

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app.kubernetes.io/name: azure-service-operator
    app.kubernetes.io/version: v2.6.0
    cluster.x-k8s.io/provider: infrastructure-azure
    clusterctl.cluster.x-k8s.io: ""
    control-plane: controller-manager
  name: azureserviceoperator-controller-manager
  namespace: openshift-cluster-api
spec:
  replicas: 1
  selector:
    matchLabels:
      control-plane: controller-manager
  strategy: {}
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: manager
        target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
      creationTimestamp: null
      labels:
        aadpodidbinding: aso-manager-binding
        app.kubernetes.io/name: azure-service-operator
        app.kubernetes.io/version: v2.6.0
        control-plane: controller-manager
    spec:
      containers:
      - args:
        - --metrics-addr=:8080
        - --health-addr=:8081
        - --enable-leader-election
        - --v=2
        - --crd-pattern=${ADDITIONAL_ASO_CRDS:= }
        - --webhook-port=9443
        - --webhook-cert-dir=/tmp/k8s-webhook-server/serving-certs

From the help output of the ASO controller binary:

  -crd-management string
        Instructs the operator on how it should manage the Custom Resource Definitions. One of 'auto', 'none' (default "auto")
  -crd-pattern string
        Install these CRDs. CRDs already in the cluster will also always be upgraded.

So, ideally, we'd omit -crd-pattern entirely and add -crd-management=none.
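
One way to layer such a change over the defaults is a Kustomize JSON 6902 patch. The fragment below is only a sketch under stated assumptions, not necessarily the approach this PR ends up using: the `resources` path is hypothetical, and the patch assumes the manager is the first container and that `--crd-pattern` stays at index 4 of the args list shown above.

```yaml
# kustomization.yaml (hypothetical overlay; the file layout is an assumption)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  # Hypothetical path to the generated ASO Deployment manifest.
  - aso-deployment.yaml

patches:
  # Assumes containers[0] is the manager and args[4] is --crd-pattern,
  # matching the order in the Deployment excerpt above.
  - target:
      group: apps
      version: v1
      kind: Deployment
      name: azureserviceoperator-controller-manager
      namespace: openshift-cluster-api
    patch: |-
      - op: replace
        path: /spec/template/spec/containers/0/args/4
        value: "--crd-management=none"
```

Replacing the arg in place both drops `--crd-pattern` and adds `--crd-management=none` in one operation. Because the patch is index-based, it would need updating if the upstream arg order changes.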

@nrb
Author

nrb commented Oct 30, 2024

I figured out an approach to updating the container arguments; however, it relies on openshift/cluster-capi-operator#228 being present in manifests-gen before this can merge.

Member

@damdo damdo left a comment

Looks reasonable to me 👍

@nrb nrb marked this pull request as ready for review November 6, 2024 18:59
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 6, 2024
@openshift-ci openshift-ci bot requested review from damdo and JoelSpeed November 6, 2024 19:00
@damdo
Member

damdo commented Nov 9, 2024

/retest

2 similar comments
@damdo
Member

damdo commented Nov 11, 2024

/retest

@bryan-cox
Member

/retest

@bryan-cox
Member

/test e2e-azure-serial

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 18, 2024

openshift-ci bot commented Dec 18, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from nrb. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

openshift-ci-robot commented Dec 18, 2024

@nrb: This pull request references OCPCLOUD-2642 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

This PR reverts the exclusion of ASO CRDs.

TODO:


@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 18, 2024
@nrb nrb changed the title from "OCPCLOUD-2642: Generate ASO CRDs for CAPZ installation" to "OCPCLOUD-2642: Include ASO CRDs for CAPZ installation" Dec 18, 2024
@nrb
Author

nrb commented Dec 19, 2024

While the tests are passing here, the Deployment isn't able to actually create pods because the image isn't in the payload quite yet; that happens in openshift/cluster-capi-operator#235.

So we'll need that PR to merge before we can have a valid test for this behavior.

            "status": {
                "conditions": [
                    {
                        "lastTransitionTime": "2024-12-18T21:22:08Z",
                        "lastUpdateTime": "2024-12-18T21:22:08Z",
                        "message": "Deployment does not have minimum availability.",
                        "reason": "MinimumReplicasUnavailable",
                        "status": "False",
                        "type": "Available"
                    },
                    {
                        "lastTransitionTime": "2024-12-18T21:32:09Z",
                        "lastUpdateTime": "2024-12-18T21:32:09Z",
                        "message": "ReplicaSet \"azureserviceoperator-controller-manager-75678d9f88\" has timed out progressing.",
                        "reason": "ProgressDeadlineExceeded",
                        "status": "False",
                        "type": "Progressing"
                    }
                ],

@damdo
Member

damdo commented Jan 7, 2025

It looks like openshift/cluster-capi-operator#235 has merged, so we should now be able to see this succeed.

@damdo
Member

damdo commented Jan 7, 2025

/test e2e-azure-capi-techpreview

@damdo
Member

damdo commented Jan 7, 2025

/retest

@nrb
Author

nrb commented Jan 7, 2025

event happened 352 times, something is wrong: namespace/openshift-cluster-api node/ci-op-jxzpnrgk-40c0c-5mms6-worker-eastus3-j9tn8 pod/azureserviceoperator-controller-manager-b885ff67c-zjq4s hmsg/c8d2995934 - reason/Pulling Pulling image "registry.build10.ci.openshift.org/ci-op-jxzpnrgk/stable@sha256:a7aab3e27a07cc343b97c020c90985f249fa9d8adabf2cf3fa60a0f045ee0596" (12:26:31Z) result=reject }

This failure looks legitimate; it probably means that the CAPZ deployment fails to start, too. I'll check the logs and CI to see if I can get that image.

@nrb
Author

nrb commented Jan 7, 2025

The ASO log isn't in plaintext: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_cluster-api-provider-azure/322/pull-ci-openshift-cluster-api-provider-azure-master-e2e-azure-techpreview/1876569391852687360/artifacts/e2e-azure-techpreview/gather-extra/artifacts/pods/openshift-cluster-api_azureserviceoperator-controller-manager-b885ff67c-zjq4s_manager.log

Looking at the pods for ASO, I see this in the status:

      name: manager
      ready: false
      restartCount: 0
      started: false
      state:
        waiting:
          message: secret "aso-controller-settings" not found
          reason: CreateContainerConfigError

Double-checking the rest of the Deployment, I see that all the Azure environment variables are set to come from that secret, so this will need some more patching to make sure our settings are used instead.

@nrb
Author

nrb commented Jan 7, 2025

We'll need to either populate the aso-controller-settings secret with the information we already have for CAPZ/the Azure CCM, or update the ASO deployment to reference the same secret that those components use.
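
For the first option, a populated secret might look roughly like the sketch below. The key names are an assumption based on the upstream ASO settings documentation and should be checked against the `env` section of the generated Deployment; every value is a placeholder. In practice the values would come from the credentials already provided to CAPZ or the Azure CCM rather than being written by hand.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Name taken from the CreateContainerConfigError message above.
  name: aso-controller-settings
  namespace: openshift-cluster-api
type: Opaque
stringData:
  # Assumed key names; verify them against the Deployment's secretKeyRef entries.
  AZURE_SUBSCRIPTION_ID: "<subscription-id>"
  AZURE_TENANT_ID: "<tenant-id>"
  AZURE_CLIENT_ID: "<client-id>"
  AZURE_CLIENT_SECRET: "<client-secret>"
```

The second option avoids duplicating credentials, at the cost of patching each `secretKeyRef` in the ASO Deployment to point at the existing secret.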

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 17, 2025
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 17, 2025
@openshift-merge-robot

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 17, 2025
@nrb
Author

nrb commented May 19, 2025

/lifecycle frozen
/remove-lifecycle rotten


openshift-ci bot commented May 19, 2025

@nrb: The lifecycle/frozen label cannot be applied to Pull Requests.

In response to this:

/lifecycle frozen
/remove-lifecycle rotten


@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 19, 2025

openshift-ci bot commented Jul 1, 2025

@nrb: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-techpreview | da93f62 | link | false | /test e2e-azure-techpreview |
| ci/prow/e2e-azure-serial-2of2 | da93f62 | link | true | /test e2e-azure-serial-2of2 |
| ci/prow/e2e-azure-serial-1of2 | da93f62 | link | true | /test e2e-azure-serial-1of2 |


Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
6 participants