Commit 3858e94

markgoddard authored and mmalchuk committed
cadvisor: Set housekeeping interval to Prometheus scrape interval

The prometheus_cadvisor container has high CPU usage. On various production systems I checked it sits around 13-16% on controllers, averaged over the Prometheus 1m scrape interval. When viewed with top we can see it is a bit spiky and can jump over 100%.

There are various bugs about this, but I found google/cadvisor#2523 which suggests increasing the per-container housekeeping interval. This defaults to 1s, which provides far greater granularity than we need with the default Prometheus scrape interval of 60s.

Raising the housekeeping interval to 60s on a production controller reduced the CPU usage from 13% to 3.5% average. This still seems high, but is more reasonable.

Change-Id: I89c62a45b1f358aafadcc0317ce882f4609543e7
Closes-Bug: #2048223
(cherry picked from commit 97e5c0e)

1 parent 09ca4bd commit 3858e94

2 files changed: +8 -1 lines changed


ansible/roles/prometheus/defaults/main.yml

Lines changed: 1 addition & 1 deletion
@@ -318,7 +318,7 @@ prometheus_openstack_exporter_disabled_lb: "{{ '--disable-service.load-balancer
 prometheus_openstack_exporter_disabled_items: "{{ [prometheus_openstack_exporter_disabled_volume, prometheus_openstack_exporter_disabled_dns, prometheus_openstack_exporter_disabled_object, prometheus_openstack_exporter_disabled_lb | trim] | join(' ') | trim }}"
 
 prometheus_blackbox_exporter_cmdline_extras: ""
-prometheus_cadvisor_cmdline_extras: "--docker_only --store_container_labels=false --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process"
+prometheus_cadvisor_cmdline_extras: "--docker_only --store_container_labels=false --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process --housekeeping_interval={{ prometheus_scrape_interval }}"
 prometheus_elasticsearch_exporter_cmdline_extras: ""
 prometheus_haproxy_exporter_cmdline_extras: ""
 prometheus_memcached_exporter_cmdline_extras: ""

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+---
+fixes:
+  - |
+    Fixes an issue with high CPU usage of the cAdvisor container by setting the
+    per-container housekeeping interval to the same value as the Prometheus
+    scrape interval. `LP#2048223
+    <https://bugs.launchpad.net/kolla-ansible/+bug/2048223>`__
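
For operators who want a housekeeping interval different from the scrape interval, the change above can be overridden per deployment. The fragment below is a minimal sketch, not part of this commit. It assumes the override lives in `/etc/kolla/globals.yml`, that `prometheus_scrape_interval` is a duration string such as `60s` (the default named in the commit message), and the `30s` value is purely illustrative.

```yaml
# Hypothetical override in /etc/kolla/globals.yml (assumed location).
# With the scrape interval at 60s, the new role default renders the flag as
# --housekeeping_interval=60s.
prometheus_scrape_interval: "60s"

# Overriding the variable replaces the whole default string, so the existing
# flags from the role defaults are repeated here. The 30s value is an example,
# not a recommendation; shorter intervals cost more CPU.
prometheus_cadvisor_cmdline_extras: >-
  --docker_only
  --store_container_labels=false
  --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process
  --housekeeping_interval=30s
```

The folded scalar (`>-`) joins the lines with single spaces, so cAdvisor receives one flat argument string, as in the role default.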
