[Feature] Include CR UID in kuberay metrics #4003
base: master
cc @troychiu @win5923 @owenowenisme Would you mind reviewing this PR? Thanks.
LGTM, It looks pretty nice!
LGTM, Thanks!
Can you provide a screenshot showing the UID in the KubeRay metrics? This will help a lot!
Thanks! I will open a follow-up PR for the Ray docs and the Grafana dashboard.
Ping @troychiu for review.
Thank you!
Since #3923 has been merged, I think it would be better to include the UID when we reset metrics:
numCleanedUpMetrics := r.rayClusterProvisionedDurationSeconds.DeletePartialMatch(prometheus.Labels{"name": name, "namespace": namespace})
Hi! I've reviewed the metrics cleanup PR, and I'd like to discuss whether modifications are necessary. In the metrics cleanup PR, whenever a CR finishes, its metrics are cleaned up using the name and namespace. This seems sufficient: only one CR with a given name and namespace can exist at a time, so the cleanup does not need to differentiate by UID. My PR simply adds the UID as a label on each metric so that resources can be told apart, and it should stay decoupled from the metrics cleanup feature. Perhaps we could address this in a follow-up PR to keep the concerns separated.
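To make the argument above concrete, here is a minimal sketch of partial-match deletion. It is a toy model written for illustration, not the Prometheus client and not KubeRay code: `seriesStore`, `labels`, and `deletePartialMatch` are hypothetical names, and the model only assumes that a partial match removes every series whose labels contain all of the given key/value pairs.

```go
package main

import "fmt"

// labels is one series' label set, e.g. name, namespace and uid.
type labels map[string]string

// seriesStore is a toy stand-in for a metric vector: a list of label sets.
// It is an illustrative assumption, not a type from any real library.
type seriesStore struct {
	series []labels
}

// add registers a series with the given label set.
func (s *seriesStore) add(l labels) { s.series = append(s.series, l) }

// containsAll reports whether l has every key/value pair found in match.
func containsAll(l, match labels) bool {
	for k, v := range match {
		if got, ok := l[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// deletePartialMatch removes every series whose labels contain all pairs
// in match and returns the number of removed series.
func (s *seriesStore) deletePartialMatch(match labels) int {
	kept := s.series[:0]
	removed := 0
	for _, l := range s.series {
		if containsAll(l, match) {
			removed++
			continue
		}
		kept = append(kept, l)
	}
	s.series = kept
	return removed
}

func main() {
	store := &seriesStore{}
	store.add(labels{"name": "demo", "namespace": "default", "uid": "uid-a"})
	store.add(labels{"name": "other", "namespace": "default", "uid": "uid-b"})

	// Matching on name and namespace alone also removes the series that
	// carries a uid label, so the cleanup call needs no uid value.
	removed := store.deletePartialMatch(labels{"name": "demo", "namespace": "default"})
	fmt.Println(removed, len(store.series))
}
```

Under this model, adding a `uid` label to each series does not change what a name-and-namespace cleanup removes, which is why the two changes can be reviewed separately.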
I think what you are saying is correct, but I am not sure whether there is any difference in performance. Are you aware of any? IMO, it won't be too much work, but I am fine with a follow-up PR.
Hi! Thanks for your feedback! 👍 Here are my thoughts:
Why are these changes needed?
This PR adds the Custom Resource (CR) UID field to KubeRay metrics to distinguish between custom resources with the same name. Previously, metrics only included `name` and `namespace` labels, which could cause ambiguity when a RayCluster or RayJob is deleted and later recreated with the same name in the same namespace.
The key improvements include:
- `uid` label added to all KubeRay metrics (RayCluster, RayJob and RayService metrics)
This change enables users to:
Related issue number
Closes #3754
Testing Results
I have tested this feature with actual RayCluster and RayJob resources. Here are some examples from `curl http://0.0.0.0:8080/metrics | grep uid`:
RayCluster metrics with UID:
RayJob metrics with UID:
Checks