| 1 | +--- |
| 2 | +title: Kubernetes Component SLI Metrics |
| 3 | +linkTitle: Service Level Indicator Metrics |
| 4 | +content_type: reference |
| 5 | +weight: 20 |
| 6 | +--- |
| 7 | +<!-- |
| 8 | +reviewers: |
| 9 | +- logicalhan |
| 10 | +title: Kubernetes Component SLI Metrics |
| 11 | +linkTitle: Service Level Indicator Metrics |
| 12 | +content_type: reference |
| 13 | +weight: 20 |
| 14 | +--> |
| 15 | + |
| 16 | +<!-- overview --> |
| 17 | + |
| 18 | +{{< feature-state for_k8s_version="v1.26" state="alpha" >}} |
| 19 | + |
| 20 | +<!-- |
| 21 | +As an alpha feature, Kubernetes lets you configure Service Level Indicator (SLI) metrics |
| 22 | +for each Kubernetes component binary. This metric endpoint is exposed on the serving |
| 23 | +HTTPS port of each component, at the path `/metrics/slis`. You must enable the |
| 24 | +`ComponentSLIs` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) |
| 25 | +for every component from which you want to scrape SLI metrics. |
| 26 | +--> |
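| 27 | +As an alpha feature, Kubernetes lets you configure Service Level Indicator (SLI) metrics for each Kubernetes component binary. |
| 28 | +This metrics endpoint is exposed on the serving HTTPS port of each component, at the path `/metrics/slis`. |
| 29 | +You must enable the `ComponentSLIs` |
| 30 | +[feature gate](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/) for every component from which you want to scrape SLI metrics. |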
| 31 | + |
| 32 | +<!-- body --> |
| 33 | + |
| 34 | +<!-- |
| 35 | +## SLI Metrics |
| 36 | + |
| 37 | +With SLI metrics enabled, each Kubernetes component exposes two metrics, |
| 38 | +labeled per healthcheck: |
| 39 | + |
| 40 | +- a gauge (which represents the current state of the healthcheck) |
| 41 | +- a counter (which records the cumulative counts observed for each healthcheck state) |
| 42 | +--> |
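| 43 | +## SLI Metrics {#sli-metrics} |
| 44 | + |
| 45 | +With SLI metrics enabled, each Kubernetes component exposes two metrics, labeled per healthcheck: |
| 46 | + |
| 47 | +- a gauge (which represents the current state of the healthcheck) |
| 48 | +- a counter (which records the cumulative counts observed for each healthcheck state) |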
| 49 | + |
| 50 | +<!-- |
| 51 | +You can use the metric information to calculate per-component availability statistics. |
| 52 | +For example, the API server checks the health of etcd. You can work out and report how |
| 53 | +available or unavailable etcd has been - as reported by its client, the API server. |
| 54 | + |
| 55 | +The prometheus gauge data looks like this: |
| 56 | +--> |
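| 57 | +You can use the metric information to calculate per-component availability statistics. For example, the API server checks the health of etcd. |
| 58 | +You can work out and report how available or unavailable etcd has been, as reported by its client, the API server. |
| 59 | + |
| 60 | +The Prometheus gauge data looks like this: |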
| 61 | + |
| 62 | +``` |
| 63 | +# HELP kubernetes_healthcheck [ALPHA] This metric records the result of a single healthcheck. |
| 64 | +# TYPE kubernetes_healthcheck gauge |
| 65 | +kubernetes_healthcheck{name="autoregister-completion",type="healthz"} 1 |
| 66 | +kubernetes_healthcheck{name="autoregister-completion",type="readyz"} 1 |
| 67 | +kubernetes_healthcheck{name="etcd",type="healthz"} 1 |
| 68 | +kubernetes_healthcheck{name="etcd",type="readyz"} 1 |
| 69 | +kubernetes_healthcheck{name="etcd-readiness",type="readyz"} 1 |
| 70 | +kubernetes_healthcheck{name="informer-sync",type="readyz"} 1 |
| 71 | +kubernetes_healthcheck{name="log",type="healthz"} 1 |
| 72 | +kubernetes_healthcheck{name="log",type="readyz"} 1 |
| 73 | +kubernetes_healthcheck{name="ping",type="healthz"} 1 |
| 74 | +kubernetes_healthcheck{name="ping",type="readyz"} 1 |
| 75 | +``` |
| 76 | + |
| 77 | +<!-- |
| 78 | +While the counter data looks like this: |
| 79 | +--> |
| 80 | +While the counter data looks like this: |
| 81 | + |
| 82 | +``` |
| 83 | +# HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthcheck. |
| 84 | +# TYPE kubernetes_healthchecks_total counter |
| 85 | +kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1 |
| 86 | +kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="healthz"} 15 |
| 87 | +kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14 |
| 88 | +kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15 |
| 89 | +kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15 |
| 90 | +kubernetes_healthchecks_total{name="etcd-readiness",status="success",type="readyz"} 15 |
| 91 | +kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1 |
| 92 | +kubernetes_healthchecks_total{name="informer-sync",status="success",type="readyz"} 14 |
| 93 | +kubernetes_healthchecks_total{name="log",status="success",type="healthz"} 15 |
| 94 | +kubernetes_healthchecks_total{name="log",status="success",type="readyz"} 15 |
| 95 | +kubernetes_healthchecks_total{name="ping",status="success",type="healthz"} 15 |
| 96 | +kubernetes_healthchecks_total{name="ping",status="success",type="readyz"} 15 |
| 97 | +``` |
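
As a rough illustration of how the counter data above could be turned into a per-healthcheck success ratio, the following Python sketch parses a few lines of the text output and divides the `success` count by the total count for each `name`/`type` pair. The helper name `success_ratios` and the `SAMPLE` string are hypothetical and exist only for this example; a real setup would more likely compute this with a monitoring system's query language over scraped data.

```python
import re
from collections import defaultdict

# A few lines copied from the counter output shown above (illustrative sample).
SAMPLE = """\
# TYPE kubernetes_healthchecks_total counter
kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14
kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
"""

# Matches one sample line of the counter; comment lines (# HELP, # TYPE) do not match.
LINE_RE = re.compile(r'^kubernetes_healthchecks_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def success_ratios(text):
    """Return {(name, type): success_count / total_count} for each healthcheck."""
    counts = defaultdict(lambda: defaultdict(float))
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (labels["name"], labels["type"])
        counts[key][labels["status"]] += float(match.group("value"))

    ratios = {}
    for key, by_status in counts.items():
        total = sum(by_status.values())
        if total > 0:
            ratios[key] = by_status.get("success", 0.0) / total
    return ratios


ratios = success_ratios(SAMPLE)
for (name, check_type), ratio in sorted(ratios.items()):
    print(f"{name} ({check_type}): {ratio:.3f}")
```

Because the counters are cumulative, a ratio computed this way covers the whole lifetime of the process; comparing two scrapes taken at different times would give the ratio for the interval between them.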
| 98 | + |
| 99 | +<!-- |
| 100 | +## Using this data |
| 101 | +
|
| 102 | +The component SLIs metrics endpoint is intended to be scraped at a high frequency. Scraping |
| 103 | +at a high frequency means that you end up with greater granularity of the gauge's signal, which |
| 104 | +can be then used to calculate SLOs. The `/metrics/slis` endpoint provides the raw data necessary |
| 105 | +to calculate an availability SLO for the respective Kubernetes component. |
| 106 | +--> |
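| 107 | +## Using this data {#using-this-data} |
| 108 | + |
| 109 | +The component SLI metrics endpoint is intended to be scraped at a high frequency. |
| 110 | +Scraping at a high frequency gives you finer granularity in the gauge's signal, which you can then use to calculate SLOs. |
| 111 | +The `/metrics/slis` endpoint provides the raw data necessary to calculate an availability SLO for the respective Kubernetes component. |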
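
To make the link between scraped gauge samples and an availability figure more concrete, here is a minimal Python sketch. It assumes that a gauge value of `1` means the healthcheck passed and any other value means it failed, and that each sample's state holds until the next scrape; both are assumptions made for illustration, as are the function name `time_weighted_availability` and the sample data.

```python
def time_weighted_availability(samples):
    """Estimate availability from (timestamp_seconds, gauge_value) samples.

    Assumption: each sample's state is treated as holding until the next
    sample, so the last sample only marks the end of the observed window.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to form an interval")

    ordered = sorted(samples)
    up_seconds = 0.0
    total_seconds = 0.0
    for (t0, value), (t1, _) in zip(ordered, ordered[1:]):
        interval = t1 - t0
        total_seconds += interval
        if value == 1:  # assumed: 1 means the healthcheck succeeded
            up_seconds += interval

    if total_seconds == 0:
        raise ValueError("samples span no time")
    return up_seconds / total_seconds


# Hypothetical samples scraped every 10 seconds, with one failed check at t=20.
scraped = [(0, 1), (10, 1), (20, 0), (30, 1), (40, 1)]
availability = time_weighted_availability(scraped)
print(f"availability over the window: {availability:.2%}")
```

With a shorter scrape interval, a single failed sample accounts for a smaller slice of the window, which is one way to see why a higher scrape frequency yields a finer-grained availability estimate.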