feat: introduce deletion timestamp metric for multiple resources #2678
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: IgorIgnatevBolt. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Force-pushed from 63191f9 to d5bb362.
All commits were squashed into one.
Hi, could you share more insights on use cases after these metrics are added? Is it used for monitoring Kubernetes resources that are stuck in a terminating state?
@CatherineF-dev Hi, yes. If the resource deletion process is stuck for some reason or blocked by a finalizer, the deletion_timestamp metric can help detect such a case and raise an alert for investigation.
/assign
@IgorIgnatevBolt How will we know which resource should be deleted?
Maybe I misunderstood the question, but this PR is exactly about detecting resources that were nominated by the controller manager for deletion but have not been deleted for some reason, e.g. blocked by finalizers.
Hi @CatherineF-dev, do you need any more information about the PR or anything else that can help you move forward?
/assign @CatherineF-dev
| kube_deployment_labels | Gauge | Kubernetes labels converted to Prometheus labels controlled via [--metric-labels-allowlist](../../developer/cli-arguments.md) | `deployment`=<deployment-name> <br> `namespace`=<deployment-namespace> <br> `label_DEPLOYMENT_LABEL`=<DEPLOYMENT_LABEL> | STABLE |
| kube_deployment_created | Gauge | | `deployment`=<deployment-name> <br> `namespace`=<deployment-namespace> | STABLE |
| kube_deployment_deletion_timestamp | Gauge | Unix deletion timestamp | `deployment`=<deployment-name> <br> `namespace`=<deployment-namespace> | EXPERIMENTAL |
Should we use kube_deployment_deleted to align with kube_deployment_created?
I'd like to keep the pattern the same as for other resources, like kube_node_deletion_timestamp or kube_pod_deletion_timestamp.
What this PR does / why we need it:
Some resources can be blocked from deletion by finalizers. To catch this and expose it via metrics, we can use the deletion timestamp metadata field. Introduce a deletion_timestamp metric for the following resources (a brief sketch of the value semantics follows the list):
kube_deployment_deletion_timestamp
kube_statefulset_deletion_timestamp
kube_daemonset_deletion_timestamp
kube_service_deletion_timestamp
kube_poddisruptionbudget_deletion_timestamp
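
For illustration, here is a minimal, self-contained Go sketch of the value semantics of these gauges, assuming the k8s.io/api and k8s.io/apimachinery modules; the `deletionTimestampValue` helper and the example object are hypothetical and are not the actual kube-state-metrics generator code, which follows the project's existing per-resource generator pattern:

```go
package main

import (
	"fmt"
	"time"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// deletionTimestampValue returns the value a kube_<resource>_deletion_timestamp
// gauge would expose: the Unix time of metadata.deletionTimestamp. The second
// return value is false when the object is not marked for deletion, in which
// case no sample should be emitted.
// (Illustrative helper; not the actual kube-state-metrics generator code.)
func deletionTimestampValue(meta metav1.ObjectMeta) (float64, bool) {
	if meta.DeletionTimestamp == nil || meta.DeletionTimestamp.IsZero() {
		return 0, false
	}
	return float64(meta.DeletionTimestamp.Unix()), true
}

func main() {
	// A Deployment that the API server has marked for deletion but that is
	// still present, e.g. because a finalizer has not been removed yet.
	d := appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:              "example",
			Namespace:         "default",
			DeletionTimestamp: &metav1.Time{Time: time.Now()},
			Finalizers:        []string{"example.com/block-deletion"},
		},
	}

	if v, ok := deletionTimestampValue(d.ObjectMeta); ok {
		fmt.Printf("kube_deployment_deletion_timestamp{deployment=%q,namespace=%q} %v\n",
			d.Name, d.Namespace, v)
	}
}
```

No sample is emitted until metadata.deletionTimestamp is set; once it is, the gauge carries the Unix deletion timestamp, so a sample that keeps persisting points at a deletion that is stuck, e.g. behind a finalizer.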
Also, this PR reformats the metric tables in the docs.
How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #