@triviajon triviajon commented Nov 24, 2025

What does this PR do?

This PR allows users to use a wildcard "*" in the Kind field under .spec.features.kubeStateMetricsCore.collectCrMetrics of the DatadogAgent CR. It is a follow-up to the bugfix in datadog-agent/pull/43315, and is the Operator equivalent of this helm-charts/datadog PR.
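
To illustrate the effect, here is a sketch of the kind of ClusterRole rule this change concerns. It assumes the operator grants the KSM check list and watch on the custom resources and maps a wildcard Kind to a wildcard resource; the exact rule generated by rbac.go may differ.

```yaml
# Hypothetical RBAC rule for a collectCrMetrics entry with
# group: stable.example.com and kind: "*" (illustrative only).
rules:
  - apiGroups:
      - stable.example.com
    resources:
      - "*"   # assumed: a wildcard Kind maps to all resources in the group
    verbs:
      - list
      - watch
```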

Motivation

What inspired you to submit this pull request?
https://datadoghq.atlassian.net/browse/CONTINT-4924

Additional Notes

Minimum Agent Versions

Are there minimum versions of the Datadog Agent and/or Cluster Agent required?

The KSM check fails with a wildcard Kind unless the Agent includes the bug fix in datadog-agent, which likely won't ship until 7.74.0. Otherwise, no.

Describe your test plan

Added a unit-test to feature/kubernetesstatecore/rbac_test.go.

Also QA'd manually:

  1. Created a minikube cluster and deployed a new build of the Datadog Operator with helm install datadog-operator /Users/jon.rosario/dd/helm-charts/charts/datadog-operator -f deploy-ddo.yaml --set clusterRole.allowReadAllResources=true, where deploy-ddo.yaml contains:
image:
  repository: docker.io/jonrosario774/operator
  tag: 1.21.0-test.2

Note: if using make deploy, the operator ClusterRole needs to be patched manually to add read access to all resources.
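
As a rough sketch of that manual patch, a rule along these lines could be appended to the operator's ClusterRole. The rule contents are an assumption about what read access to all resources means here; the Helm option may render something different.

```yaml
# Assumed rule granting read-only access to every resource;
# append it under `rules:` of the operator's ClusterRole.
- apiGroups:
    - "*"
  resources:
    - "*"
  verbs:
    - get
    - list
    - watch
```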

  2. Deployed the Datadog Agent using the operator, configured for KSM to collect CR metrics:
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
...
  features:
    kubeStateMetricsCore:
      enabled: true
      collectCrMetrics:
        - groupVersionKind:
            group: stable.example.com
            version: "*"
            kind: "*"
          labelsFromPath:
            name: ["metadata", "name"]
            namespace: ["metadata", "namespace"]
          metrics:
            - name: crontab_info
              help: "Basic info about CronTab CRs"
              each:
                type: Info
                info:
                  path: ["spec"]
                  labelsFromPath:
                    cronSpec: ["cronSpec"]
                    image: ["image"]
                    replicas: ["replicas"]
...

Note: for QA, this requires an image override to a 7.74+ build, either for the cluster checks runners or, if runners are not used, for the DCA.

  3. Applied the CronTab CRD and CR below, then saw the metrics being reported to Datadog:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string
                image:
                  type: string
                replicas:
                  type: integer
              required: ["image"]
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
    shortNames:
    - ct
---
apiVersion: stable.example.com/v1
kind: CronTab
metadata:
  name: my-cronjob
spec:
  cronSpec: "* * * * */5"
  image: busybox
  replicas: 2

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label

@triviajon triviajon requested review from a team as code owners November 24, 2025 15:54
@triviajon triviajon added the enhancement and qa/skip-qa labels Nov 24, 2025
@triviajon triviajon marked this pull request as draft November 24, 2025 15:59
codecov-commenter commented Nov 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 37.28%. Comparing base (d7277db) to head (ac54b50).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2350   +/-   ##
=======================================
  Coverage   37.27%   37.28%           
=======================================
  Files         290      290           
  Lines       24707    24710    +3     
=======================================
+ Hits         9210     9213    +3     
  Misses      14784    14784           
  Partials      713      713           
Flag Coverage Δ
unittests 37.28% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...r/datadogagent/feature/kubernetesstatecore/rbac.go 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@triviajon triviajon changed the title from "Support for wildcards in Kind field in KSM RBAC" to "[CONTINT-4924] Support for wildcards in Kind field in KSM RBAC" Nov 24, 2025
@triviajon triviajon marked this pull request as ready for review November 24, 2025 16:28
@tbavelier tbavelier modified the milestone: v1.22.0 Nov 28, 2025
@tbavelier tbavelier left a comment

As mentioned in the PR description, this requires 7.74+, so QA can be done with override.clusterChecksRunner.image.name: docker.io/datadog/agent-dev:nightly-main-py3 if using runners, or docker.io/datadog/cluster-agent-dev:master if not using runners.
Also for QA, this does require the clusterRole.allowReadAllResources option from the Helm chart, or directly patching the datadog-operator-manager-role ClusterRole if using make deploy.
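
For reference, the overrides mentioned above might look like this in the DatadogAgent spec. This is a minimal sketch: the image names come from this comment, only one of the two overrides is needed depending on whether cluster checks runners are used, and the rest of the spec is omitted.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    # If using cluster checks runners:
    clusterChecksRunner:
      image:
        name: docker.io/datadog/agent-dev:nightly-main-py3
    # If not using runners, override the Cluster Agent instead
    # (the clusterAgent key is assumed by analogy with the one above):
    clusterAgent:
      image:
        name: docker.io/datadog/cluster-agent-dev:master
```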


@triviajon we should probably wait to merge this until 7.74 is released, so possibly with operator 1.23, as the 1.22 release cycle will start in mid-December.

@fanny-jiang fanny-jiang modified the milestones: v1.22.0, v1.23.0 Dec 17, 2025
@tbavelier

/merge

dd-devflow-routing-codex bot commented Dec 18, 2025

View all feedbacks in Devflow UI.

2025-12-18 14:03:02 UTC ℹ️ Start processing command /merge


2025-12-18 14:03:08 UTC ℹ️ MergeQueue: waiting for PR to be ready

This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.


2025-12-18 14:33:23 UTC ℹ️ MergeQueue: merge request added to the queue

The expected merge time in main is approximately 2h (p90).


2025-12-18 15:37:11 UTC MergeQueue: The checks failed on this merge request

Tests failed on this commit bf43647:

What to do next?

  • Investigate the failures and when ready, re-add your pull request to the queue!
  • If your PR checks are green, try to rebase/merge. It might be because the CI run is a bit old.
  • Any question, go check the FAQ.

@tbavelier tbavelier merged commit 7780750 into main Dec 18, 2025
52 of 53 checks passed
@tbavelier tbavelier deleted the triviajon/CONTINT-4924/op-rbac-generation-wildcard-kind branch December 18, 2025 15:40