Add new grafana dashboards for general kubernetes metrics for trial #2834
I think this can be a
What's going on here? 😂
These should all be a part of the task and not next steps after this is merged. We should not merge something that hasn't been evaluated, and most importantly, tested.
The 'Cluster' dropdown doesn't seem to work on my kind cluster, does it work in Real Clusters™?
@Zash I'm going to take a deeper dive. :)
elastisys-staffan left a comment
Nice work! These dashboards look gorgeous and will be a fine addition to our monitoring. Two small suggestions:
- Add the source for the dashboards to helmfile.d/charts/grafana-dashboards/dashboards/README.md
- It's a little hard to find the new dashboards in the list in the Grafana GUI. Maybe add some common tag to them?
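One way to apply the common-tag suggestion is sketched below. It assumes each dashboard file is a Grafana dashboard JSON object with a top-level `tags` list; the tag name `k8s-views` and the helper names are hypothetical, since the thread does not say which tag was eventually chosen.

```python
import json
from pathlib import Path

# Hypothetical tag name; the PR thread does not state which tag was used.
COMMON_TAG = "k8s-views"


def add_tag(dashboard: dict, tag: str) -> dict:
    """Return a copy of the dashboard with `tag` present exactly once in its tags."""
    tags = list(dashboard.get("tags") or [])
    if tag not in tags:
        tags.append(tag)
    return {**dashboard, "tags": tags}


def tag_dashboards(directory: Path, tag: str = COMMON_TAG) -> list[str]:
    """Add `tag` to every *.json file under `directory`; return the names of changed files."""
    changed = []
    for path in sorted(directory.rglob("*.json")):
        dashboard = json.loads(path.read_text(encoding="utf-8"))
        updated = add_tag(dashboard, tag)
        if updated != dashboard:
            # Rewrites the file with 2-space indentation, which may reformat it.
            path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
            changed.append(path.name)
    return changed
```

Calling `tag_dashboards(Path("helmfile.d/charts/grafana-dashboards/dashboards"))` would touch every dashboard in that directory, so in practice the call would likely be restricted to the newly added files.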
@elastisys-staffan done and done! :) Thank you for the input and guidance, very much appreciated.
rarescosma left a comment
nit: English
elastisys-staffan left a comment
🚀
Are the @elastisys/goto-monitoring-stack folks happy with the changes? If so, I'll merge.
Xartos left a comment
Tested the dashboards and they look great! I really like the look and feel of them.
Have some questions though
Question: Do we still need the "Trivy Operator Dashboard" dashboard? Seems like this one is presenting the same information but in an arguably better way.
If they are truly equivalent but one is better, I guess the only harm in removing the older one would be breaking links, bookmarks, browser history search. Can Grafana do redirects? Or can we easily reuse the same ID and replace the old one?
> breaking links, bookmarks, browser history search.
...and most likely the E2E suite as well 🙃
Question: Same here, would we want to remove some of the other "Prometheus - ..." dashboards or do we see value in having both?
Co-authored-by: Rareș Cosma <[email protected]>
anders-elastisys left a comment
I think the new k8s-views dashboards look really nice and are a great addition. 🚀
But regarding the trivy-operator and prometheus ones, I am unsure whether we should add them or instead replace the old ones we have.
Personally, I think the new Prometheus dashboard is missing some things, e.g. time series per job, and it also includes CPU and memory usage for all pods, which I do not think fits in that dashboard.
The new Trivy dashboard looks a lot better than the old one. However, the two show different vulnerability counts per severity because the queries differ: our old dashboard sums over unique images, while the new one sums over each pod, so it shows a lot more vulnerabilities. I think I prefer calculating vulnerabilities per image, as in the old one, but I can see both cases being valid.
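The difference between the two counting approaches can be illustrated with a small sketch. The sample data and function names below are hypothetical and this is not the query used by either dashboard; it only shows why summing per pod yields larger totals than summing per unique image when several pods run the same image.

```python
# Hypothetical sample: (pod, image, critical vulnerabilities reported for that image).
REPORTS = [
    ("web-1", "nginx:1.25", 3),
    ("web-2", "nginx:1.25", 3),
    ("web-3", "nginx:1.25", 3),
    ("db-0", "postgres:16", 2),
]


def total_per_pod(reports):
    """Sum every row, so an image is counted once for each pod that runs it."""
    return sum(count for _pod, _image, count in reports)


def total_per_image(reports):
    """Count each distinct image once, regardless of how many pods run it."""
    per_image = {}
    for _pod, image, count in reports:
        per_image[image] = count
    return sum(per_image.values())


print(total_per_pod(REPORTS))    # 11
print(total_per_image(REPORTS))  # 5
```

With three replicas of the same image, the per-pod total counts that image's vulnerabilities three times, while the per-image total counts them once.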
@viktor-f @anders-elastisys, so shall we do a trial of these until next sprint to decide what to keep and what not to keep?
Warning
This is a public repository. Ensure not to disclose:
What kind of PR is this?
Required: Mark one of the following that is applicable:
What does this PR do / why do we need this PR?
This PR adds a curated selection of Kubernetes dashboards from the open-source project dotdc/grafana-dashboards-kubernetes to improve the operator experience in Welkin’s Grafana setup.
These dashboards provide a clear, drill-down structure for day-to-day cluster analysis (Global → Namespaces → Nodes → Pods) and introduce additional monitoring surfaces for Prometheus and Trivy Operator. They complement the existing dashboards from kube-prometheus-stack/kubernetes-mixin without duplicating their functionality.
Because the chart already auto-discovers dashboards placed under dashboards/**, no configuration changes were required. The dashboards are provisioned automatically via the existing ConfigMap template and Grafana sidecar.
Changes made
Added the following dashboards under helmfile.d/charts/grafana-dashboards/dashboards/:
- k8s-views-global-dashboard.json
- k8s-views-namespaces-dashboard.json
- k8s-views-nodes-dashboard.json
- k8s-views-pods-dashboard.json
- k8s-addons-prometheus-dashboard.json
- k8s-addons-trivy-operator-dashboard.json
Deliberate exclusions
The following dashboards were intentionally not added:
- k8s-system-api-server-dashboard.json
- k8s-system-coredns-dashboard.json
These duplicate the canonical dashboards already provided by kube-prometheus-stack and would provide no additional value.
Validation
All added dashboards have been:
helmfile template,
Existing dashboards (API server, CoreDNS, mixin dashboards) continue to function without duplication.
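As a complement to the validation described above, a quick local sanity check of the dashboard files could look like the sketch below. It is not part of this PR: the function name is hypothetical, and it assumes each file is a Grafana dashboard JSON object with top-level `title` and `uid` fields. A missing or duplicated `uid` is flagged because dashboard URLs are based on it.

```python
import json
from collections import defaultdict
from pathlib import Path


def check_dashboards(directory: Path) -> list[str]:
    """Return human-readable problems found in the *.json files under `directory`."""
    problems = []
    files_by_uid = defaultdict(list)
    for path in sorted(directory.rglob("*.json")):
        try:
            dashboard = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{path.name}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(dashboard, dict):
            problems.append(f"{path.name}: not a JSON object")
            continue
        if not dashboard.get("title"):
            problems.append(f"{path.name}: missing title")
        uid = dashboard.get("uid")
        if uid:
            files_by_uid[uid].append(path.name)
        else:
            problems.append(f"{path.name}: missing uid")
    # A uid shared by several files means one dashboard would shadow another.
    for uid, names in files_by_uid.items():
        if len(names) > 1:
            problems.append(f"uid {uid!r} used by: {', '.join(names)}")
    return problems
```

Running `check_dashboards(Path("helmfile.d/charts/grafana-dashboards/dashboards"))` from the repository root and printing the returned list would surface such problems before rendering the chart.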
Information for reviewers
To reproduce the verification steps:
Then check in Grafana:
Existing API server and CoreDNS dashboards should remain available.
Checklist