add "evicted_pods_total" metric #166
Conversation
maxlaverse
left a comment
Hi @pznamensky!
Thanks for your contribution. Having a metric is a good addition.
I left a few comments because I think we can really simplify the implementation.
pkg/metrics_exporter.go
Outdated
		return noopMetricsRecorder{}
	}

	if strings.TrimSpace(opts.MetricsBindAddress) == "" {
What about doing the value validation directly in urfave/cli in main.go, with something like:

	Before: func(ctx *cli.Context) error {
		if opts.EnableMetrics {
			bindAddr := strings.TrimSpace(opts.MetricsBindAddress)
			if bindAddr == "" {
				return fmt.Errorf("--metrics-bind-address cannot be empty when --enable-metrics is true")
			}
			if _, _, err := net.SplitHostPort(bindAddr); err != nil {
				return fmt.Errorf("invalid --metrics-bind-address: %w", err)
			}
		}
		return nil
	},
pkg/metrics_exporter.go
Outdated
	}

	func (r *prometheusMetricsRecorder) Start(ctx context.Context) {
		if r.server == nil {
r.server == nil will only happen if someone made a buggy change. Since this is all internal and a very small project, I think I would just not have this check. It's unlikely to be a problem, and it's probably better to crash hard right away.
pkg/metrics_exporter.go
Outdated
	shutdownCtx, cancel := context.WithTimeout(context.Background(), metricsShutdownGrace)
	defer cancel()

	if err := r.server.Shutdown(shutdownCtx); err != nil && !errors.Is(err, http.ErrServerClosed) {
Is it really possible for this method to return an http.ErrServerClosed error?
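(For reference, a minimal sketch of the usual split, with names assumed: in the standard library, http.ErrServerClosed is what ListenAndServe/Serve return once Shutdown or Close has been called, while Shutdown itself returns the context's error or an error from closing the listeners.)

	package pkg

	import (
		"context"
		"errors"
		"log"
		"net/http"
		"time"
	)

	func serveAndStop(srv *http.Server, stop <-chan struct{}) {
		go func() {
			// ErrServerClosed is the normal return value after Shutdown/Close.
			if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
				log.Printf("metrics server error: %v", err)
			}
		}()

		<-stop
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		// Shutdown returns ctx.Err() on timeout or a listener-close error,
		// not http.ErrServerClosed.
		if err := srv.Shutdown(ctx); err != nil {
			log.Printf("metrics server shutdown: %v", err)
		}
	}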
pkg/metrics_exporter.go
Outdated
	recorder.server = &http.Server{
		Addr:              opts.MetricsBindAddress,
		Handler:           mux,
		ReadHeaderTimeout: 5 * time.Second,
It could make sense to have this slightly lower than the shutdown timeout, I think, so the client gets a timeout error rather than us logging server shutdown errors.
Maybe worth pulling both values up as constants so they sit side by side.
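For illustration, a sketch of the two timeouts declared side by side as constants; the 4s/5s values and names mirror the later revision shown further down in this thread:

	import "time"

	const (
		// Slightly below metricsShutdownGrace so a slow client hits its own
		// timeout instead of racing the server shutdown.
		metricsReadHeaderTimeout = 4 * time.Second
		metricsShutdownGrace     = 5 * time.Second
	)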
pkg/metrics_exporter.go
Outdated
	}

	func (r *prometheusMetricsRecorder) RecordPodEviction(namespace, appName, appInstance string) {
		if r.counter == nil {
Same as for the server, I wouldn't guard here and would just keep the code simple. Same below.
pkg/metrics_exporter.go
Outdated
	r.counter.EnsureMetric(namespace, appName, appInstance)
	}

	func (r *prometheusMetricsRecorder) handleMetrics(w http.ResponseWriter, req *http.Request) {
Don't know if you know, but Prometheus provides promhttp.Handler(), which handles all of this already.
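For reference, a minimal sketch of wiring promhttp instead of a hand-written handler; the mux and its registration point are assumptions:

	import (
		"net/http"

		"github.com/prometheus/client_golang/prometheus/promhttp"
	)

	func newMetricsMux() *http.ServeMux {
		mux := http.NewServeMux()
		// promhttp.Handler() exposes the default registry in the Prometheus
		// text format, including content negotiation and error handling.
		mux.Handle("/metrics", promhttp.Handler())
		return mux
	}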
pkg/metrics_exporter.go
Outdated
		}
	}

	func (c *evictedPodCounter) Inc(namespace, appName, appInstance string) {
Unless there is a specific need for managing this by hand, have a look at this guide: https://prometheus.io/docs/guides/go-application/
For classic counters there is a much easier way to achieve this. You could even remove the noopRecorder and just start the metrics server conditionally.
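For illustration, a minimal sketch of the "classic counter" approach from that guide, using promauto; the metric name matches this PR, but the label names here are simplified assumptions:

	import (
		"github.com/prometheus/client_golang/prometheus"
		"github.com/prometheus/client_golang/prometheus/promauto"
	)

	// Registered once against the default registry; no hand-written counter
	// type or /metrics handler is needed.
	var evictedPodsTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "soft_pod_memory_evicter_evicted_pods_total",
			Help: "Total number of pods evicted by the controller.",
		},
		[]string{"namespace", "app_name", "app_instance"},
	)

	// In the eviction path:
	//   evictedPodsTotal.WithLabelValues(ns, appName, appInstance).Inc()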
pkg/controller.go
Outdated
	}

	func appNameLabel(pod *corev1.Pod) string {
		if pod == nil {
I wouldn't guard against pod being nil. If pod were nil, the code would already crash where this method is called. Since this is only used for metrics, maybe we could move the two helpers into metrics_exporter.go and just pass the pod to RecordPodEviction?
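For illustration, a sketch of the suggested signature change, with the label extraction moved into metrics_exporter.go; the recorder and counter types are the ones from this PR's diff, so treat this as an assumption rather than the merged code:

	import corev1 "k8s.io/api/core/v1"

	func (r *prometheusMetricsRecorder) RecordPodEviction(pod *corev1.Pod) {
		// Missing labels simply resolve to "", so no nil guard is needed here.
		r.counter.Inc(
			pod.Namespace,
			pod.Labels["app.kubernetes.io/name"],
			pod.Labels["app.kubernetes.io/instance"],
		)
	}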
pkg/metrics_exporter.go
Outdated
	r.counter.Inc(namespace, appName, appInstance)
	}

	func (r *prometheusMetricsRecorder) RecordObservedNamespace(namespace, appName, appInstance string) {
This is not really just about the namespace, right? It's to initialize the metric for a given Pod to 0?
If so, is it really needed (I haven't done Prometheus queries in a while)? In a cluster with thousands of Pods, you'd have thousands of time series even if only 1-2 Pods are ever evicted.
That's correct, but not exactly.
What we needed was indeed to initialize the metric to 0.
However, as you correctly pointed out, that would be too many series if we made it per pod.
Initially I made this per namespace, but then we realized that's not enough for us.
What I came up with is adding these common pod labels (see also: link): app.kubernetes.io/name and app.kubernetes.io/instance.
That's not an ideal solution, as these labels are not defined on all pods, but for most cases it seems to work.
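For reference, a sketch of how that pre-initialization typically looks with a CounterVec: touching the label set with Add(0) creates the series at 0, so increase()/rate() queries have a baseline before the first eviction (function and label names assumed):

	import (
		"github.com/prometheus/client_golang/prometheus"
		corev1 "k8s.io/api/core/v1"
	)

	func recordPodExistence(counter *prometheus.CounterVec, pod *corev1.Pod) {
		// WithLabelValues creates the child series if it does not exist yet;
		// Add(0) leaves the value untouched but makes the series visible.
		counter.WithLabelValues(
			pod.Namespace,
			pod.Labels["app.kubernetes.io/name"],
			pod.Labels["app.kubernetes.io/instance"],
		).Add(0)
	}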
What about renaming it to RecordPodExistence then?
@maxlaverse, thanks for your review.
maxlaverse
left a comment
Besides two remarks about naming, it looks good to me. Thanks for taking the time.
I'll try it out.
	metricsShutdownGrace     = 5 * time.Second
	metricsReadHeaderTimeout = 4 * time.Second

	metricLabelNamespace = "affected_namespace"
I think I would remove the "affected_" prefix from all those labels, because it's a bit weird when the metric's value is 0 (the Pod isn't really affected by the controller), and the metric will only ever record Pods that are affected. wdyt?
The reason I added the affected_ prefix is that in a default setup Prometheus already adds its own app_kubernetes_io_instance and app_kubernetes_io_name labels (for the scraped pod itself), so we need to distinguish between them somehow.
For example:
soft_pod_memory_evicter_evicted_pods_total{affected_app_kubernetes_io_instance="api", affected_app_kubernetes_io_name="api", affected_namespace="backend", app_kubernetes_io_instance="soft-pod-memory-evicter", app_kubernetes_io_name="soft-pod-memory-evicter", environment="production", instance="192.168.194.80:9288", job="kubernetes-pods", namespace="soft-pod-memory-evicter", node="ip-192-168-250-40.eu-west-1.compute.internal", pod="soft-pod-memory-evicter-5f85645ff7-q9wq6", pod_template_hash="5f85645ff7"}
It might make sense to rename the prefix to something like target_ to avoid confusion. How does that sound to you?
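For illustration, a sketch of what the label constants could look like under a target_ prefix (these names are assumptions, not merged code):

	const (
		metricLabelNamespace   = "target_namespace"
		metricLabelAppName     = "target_app_kubernetes_io_name"
		metricLabelAppInstance = "target_app_kubernetes_io_instance"
	)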
Thanks again for your contribution, @pznamensky! I'm going to merge the PR and eventually adjust it in subsequent commits.
Hi @maxlaverse,
We use soft-pod-memory-evicter and it works fine for us. Thank you for it!
The only thing we've found useful to add is an evicted_pods_total metric, so we can get alerts if something goes crazy with our apps. This is not something anybody has asked for (according to the issues in the repo).
The patchset has also been almost completely generated by an LLM (it has been reviewed by a human, though) and it has been working for a while in our production environment.
If you think it's worth merging, could you please take a look at the PR?