|
2 | 2 |
|
3 | 3 | ## master / unreleased
|
4 | 4 |
|
| 5 | +## 1.11.0 / 2021-12-30 |
| 6 | + |
5 | 7 | * [CHANGE] Store gateway: set `-blocks-storage.bucket-store.index-cache.memcached.max-get-multi-concurrency`,
|
6 | 8 | `-blocks-storage.bucket-store.chunks-cache.memcached.max-get-multi-concurrency`,
|
7 | 9 | `-blocks-storage.bucket-store.metadata-cache.memcached.max-get-multi-concurrency`,
|
8 | 10 | `-blocks-storage.bucket-store.index-cache.memcached.max-idle-connections`,
|
9 | 11 | `-blocks-storage.bucket-store.chunks-cache.memcached.max-idle-connections`,
|
10 | 12 | `-blocks-storage.bucket-store.metadata-cache.memcached.max-idle-connections` to 100 #414
|
11 | 13 | * [CHANGE] Update grafana-builder dependency: use $__rate_interval in qpsPanel and latencyPanel. #372
|
12 |
| -* [CHANGE] `namespace` template variable in dashboards now only selects namespaces for selected clusters. #311 |
13 |
| -* [CHANGE] Alertmanager: mounted overrides configmap to alertmanager too. #315 |
14 |
| -* [CHANGE] Memcached: upgraded memcached from `1.5.17` to `1.6.9`. #316 |
15 |
| -* [CHANGE] `CortexIngesterRestarts` alert severity changed from `critical` to `warning`. #321 |
16 |
| -* [CHANGE] Store-gateway: increased memory request and limit respectively from 6GB / 6GB to 12GB / 18GB. #322 |
17 |
| -* [CHANGE] Store-gateway: increased `-blocks-storage.bucket-store.max-chunk-pool-bytes` from 2GB (default) to 12GB. #322 |
18 |
| -* [CHANGE] Dashboards: added overridable `job_labels` and `cluster_labels` to the configuration object as label lists to uniquely identify jobs and clusters in the metric names and group-by lists in dashboards. #319 |
19 |
| -* [CHANGE] Dashboards: `alert_aggregation_labels` has been removed from the configuration and overriding this value has been deprecated. Instead the labels are now defined by the `cluster_labels` list, and should be overridden accordingly through that list. #319 |
20 |
| -* [CHANGE] Ingester/Ruler: set `-server.grpc-max-send-msg-size-bytes` and `-server.grpc-max-send-msg-size-bytes` to sensible default values (10MB). #326 |
21 |
| -* [CHANGE] Renamed `CortexCompactorHasNotUploadedBlocksSinceStart` to `CortexCompactorHasNotUploadedBlocks`. #334 |
22 |
| -* [CHANGE] Renamed `CortexCompactorRunFailed` to `CortexCompactorHasNotSuccessfullyRunCompaction`. #334 |
23 |
| -* [CHANGE] Renamed `CortexInconsistentConfig` alert to `CortexInconsistentRuntimeConfig` and increased severity to `critical`. #335 |
24 |
| -* [CHANGE] Increased `CortexBadRuntimeConfig` alert severity to `critical` and removed support for `cortex_overrides_last_reload_successful` metric (was removed in Cortex 1.3.0). #335 |
25 |
| -* [CHANGE] Grafana 'min step' changed to 15s so dashboards show better detail. #340 |
26 |
| -* [CHANGE] Replace `CortexRulerFailedEvaluations` with two new alerts: `CortexRulerTooManyFailedPushes` and `CortexRulerTooManyFailedQueries`. #347 |
27 |
| -* [CHANGE] Removed `CortexCacheRequestErrors` alert. This alert was not working because the legacy Cortex cache client instrumentation doesn't track errors. #346 |
28 |
| -* [CHANGE] Removed `CortexQuerierCapacityFull` alert. #342 |
29 |
| -* [CHANGE] Changes blocks storage alerts to group metrics by the configured `cluster_labels` (supporting the deprecated `alert_aggregation_labels`). #351 |
30 |
| -* [CHANGE] Increased `CortexIngesterReachingSeriesLimit` critical alert threshold from 80% to 85%. #363 |
31 | 14 | * [CHANGE] Decreased `-server.grpc-max-concurrent-streams` from 100k to 10k. #369
|
32 | 15 | * [CHANGE] Decreased blocks storage ingesters graceful termination period from 80m to 20m. #369
|
33 | 16 | * [CHANGE] Changed default `job_names` for query-frontend, query-scheduler and querier to match custom deployments too. #376
|
|
45 | 28 | * [CHANGE] Disabled step alignment in query-frontend to be compliant with PromQL. #420
|
46 | 29 | * [CHANGE] Do not limit compactor CPU and request a number of cores equal to the configured concurrency. #420
|
47 | 30 | * [ENHANCEMENT] Add overrides config to compactor. This allows setting retention configs per user. #386
|
48 |
| -* [ENHANCEMENT] cortex-mixin: Make `cluster_namespace_deployment:kube_pod_container_resource_requests_{cpu_cores,memory_bytes}:sum` backwards compatible with `kube-state-metrics` v2.0.0. #317 |
49 |
| -* [ENHANCEMENT] Cortex-mixin: Include `cortex-gw-internal` naming variation in default `gateway` job names. #328 |
50 |
| -* [ENHANCEMENT] Ruler dashboard: added object storage metrics. #354 |
51 |
| -* [ENHANCEMENT] Alertmanager dashboard: added object storage metrics. #354 |
52 |
| -* [ENHANCEMENT] Added documentation text panels and descriptions to reads and writes dashboards. #324 |
53 |
| -* [ENHANCEMENT] Dashboards: defined container functions for common resources panels: containerDiskWritesPanel, containerDiskReadsPanel, containerDiskSpaceUtilization. #331 |
54 |
| -* [ENHANCEMENT] cortex-mixin: Added `alert_excluded_routes` config to exclude specific routes from alerts. #338 |
55 |
| -* [ENHANCEMENT] Added `CortexMemcachedRequestErrors` alert. #346 |
56 |
| -* [ENHANCEMENT] Ruler dashboard: added "Per route p99 latency" panel in the "Configuration API" row. #353 |
57 |
| -* [ENHANCEMENT] Increased the `for` duration of the `CortexIngesterReachingSeriesLimit` warning alert to 3h. #362 |
58 |
| -* [ENHANCEMENT] Added a new tier (`medium_small_user`) so we have another tier between 100K and 1Mil active series. #364 |
59 |
| -* [ENHANCEMENT] Extend Alertmanager dashboard: #313 |
60 |
| - * "Tenants" stat panel - shows number of discovered tenant configurations. |
61 |
| - * "Replication" row - information about the replication of tenants/alerts/silences over instances. |
62 |
| - * "Tenant Configuration Sync" row - information about the configuration sync procedure. |
63 |
| - * "Sharding Initial State Sync" row - information about the initial state sync procedure when sharding is enabled. |
64 |
| - * "Sharding Runtime State Sync" row - information about various state operations which occur when sharding is enabled (replication, fetch, merge, persist). |
65 | 31 | * [ENHANCEMENT] Added 256MB memory ballast to querier. #369
|
66 | 32 | * [ENHANCEMENT] Update gsutil command for `not healthy index found` playbook #370
|
67 | 33 | * [ENHANCEMENT] Update `etcd-operator` to latest version (see https://github.com/grafana/jsonnet-libs/pull/480). #263
|
|
88 | 54 | * `cortex_ruler_allow_multiple_replicas_on_same_node`
|
89 | 55 | * `cortex_querier_allow_multiple_replicas_on_same_node`
|
90 | 56 | * `cortex_query_frontend_allow_multiple_replicas_on_same_node`
|
91 |
| -* [BUGFIX] Fixed `CortexIngesterHasNotShippedBlocks` alert false positive in case an ingester instance had ingested samples in the past, then no traffic was received for a long period and then it started receiving samples again. #308 |
92 |
| -* [BUGFIX] Alertmanager: fixed `--alertmanager.cluster.peers` CLI flag passed to alertmanager when HA is enabled. #329 |
93 |
| -* [BUGFIX] Fixed `CortexInconsistentRuntimeConfig` metric. #335 |
94 |
| -* [BUGFIX] Fixed scaling dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #365 |
95 |
| -* [BUGFIX] Fixed rollout progress dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #366 |
96 | 57 | * [BUGFIX] Fixed rollout progress dashboard to include query-scheduler too. #376
|
97 | 58 | * [BUGFIX] Fixed `-distributor.extend-writes` setting on ruler when `unregister_ingesters_on_shutdown` is disabled. #369
|
98 | 59 | * [BUGFIX] Upstream recording rule `node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate` renamed. #379
|
|
101 | 62 | * [BUGFIX] Span the annotation.message in alerts as YAML multiline strings. #412
|
102 | 63 | * [BUGFIX] Pass `-ruler-storage.s3.endpoint` to ruler when using S3. #421
|
103 | 64 |
|
| 65 | +## 1.10.0 / 2021-12-30 |
| 66 | + |
| 67 | +* [CHANGE] `namespace` template variable in dashboards now only selects namespaces for selected clusters. #311 |
| 68 | +* [CHANGE] Alertmanager: mounted overrides configmap to alertmanager too. #315 |
| 69 | +* [CHANGE] Memcached: upgraded memcached from `1.5.17` to `1.6.9`. #316 |
| 70 | +* [CHANGE] `CortexIngesterRestarts` alert severity changed from `critical` to `warning`. #321 |
| 71 | +* [CHANGE] Store-gateway: increased memory request and limit respectively from 6GB / 6GB to 12GB / 18GB. #322 |
| 72 | +* [CHANGE] Store-gateway: increased `-blocks-storage.bucket-store.max-chunk-pool-bytes` from 2GB (default) to 12GB. #322 |
| 73 | +* [CHANGE] Dashboards: added overridable `job_labels` and `cluster_labels` to the configuration object as label lists to uniquely identify jobs and clusters in the metric names and group-by lists in dashboards. #319 |
| 74 | +* [CHANGE] Dashboards: `alert_aggregation_labels` has been removed from the configuration and overriding this value has been deprecated. Instead the labels are now defined by the `cluster_labels` list, and should be overridden accordingly through that list. #319 |
| 75 | +* [CHANGE] Ingester/Ruler: set `-server.grpc-max-send-msg-size-bytes` and `-server.grpc-max-send-msg-size-bytes` to sensible default values (10MB). #326 |
| 76 | +* [CHANGE] Renamed `CortexCompactorHasNotUploadedBlocksSinceStart` to `CortexCompactorHasNotUploadedBlocks`. #334 |
| 77 | +* [CHANGE] Renamed `CortexCompactorRunFailed` to `CortexCompactorHasNotSuccessfullyRunCompaction`. #334 |
| 78 | +* [CHANGE] Renamed `CortexInconsistentConfig` alert to `CortexInconsistentRuntimeConfig` and increased severity to `critical`. #335 |
| 79 | +* [CHANGE] Increased `CortexBadRuntimeConfig` alert severity to `critical` and removed support for `cortex_overrides_last_reload_successful` metric (was removed in Cortex 1.3.0). #335 |
| 80 | +* [CHANGE] Grafana 'min step' changed to 15s so dashboards show better detail. #340 |
| 81 | +* [CHANGE] Replace `CortexRulerFailedEvaluations` with two new alerts: `CortexRulerTooManyFailedPushes` and `CortexRulerTooManyFailedQueries`. #347 |
| 82 | +* [CHANGE] Removed `CortexCacheRequestErrors` alert. This alert was not working because the legacy Cortex cache client instrumentation doesn't track errors. #346 |
| 83 | +* [CHANGE] Removed `CortexQuerierCapacityFull` alert. #342 |
| 84 | +* [CHANGE] Changes blocks storage alerts to group metrics by the configured `cluster_labels` (supporting the deprecated `alert_aggregation_labels`). #351 |
| 85 | +* [CHANGE] Increased `CortexIngesterReachingSeriesLimit` critical alert threshold from 80% to 85%. #363 |
| 86 | +* [ENHANCEMENT] cortex-mixin: Make `cluster_namespace_deployment:kube_pod_container_resource_requests_{cpu_cores,memory_bytes}:sum` backwards compatible with `kube-state-metrics` v2.0.0. #317 |
| 87 | +* [ENHANCEMENT] Cortex-mixin: Include `cortex-gw-internal` naming variation in default `gateway` job names. #328 |
| 88 | +* [ENHANCEMENT] Ruler dashboard: added object storage metrics. #354 |
| 89 | +* [ENHANCEMENT] Alertmanager dashboard: added object storage metrics. #354 |
| 90 | +* [ENHANCEMENT] Added documentation text panels and descriptions to reads and writes dashboards. #324 |
| 91 | +* [ENHANCEMENT] Dashboards: defined container functions for common resources panels: containerDiskWritesPanel, containerDiskReadsPanel, containerDiskSpaceUtilization. #331 |
| 92 | +* [ENHANCEMENT] cortex-mixin: Added `alert_excluded_routes` config to exclude specific routes from alerts. #338 |
| 93 | +* [ENHANCEMENT] Added `CortexMemcachedRequestErrors` alert. #346 |
| 94 | +* [ENHANCEMENT] Ruler dashboard: added "Per route p99 latency" panel in the "Configuration API" row. #353 |
| 95 | +* [ENHANCEMENT] Increased the `for` duration of the `CortexIngesterReachingSeriesLimit` warning alert to 3h. #362 |
| 96 | +* [ENHANCEMENT] Added a new tier (`medium_small_user`) so we have another tier between 100K and 1Mil active series. #364 |
| 97 | +* [ENHANCEMENT] Extend Alertmanager dashboard: #313 |
| 98 | + * "Tenants" stat panel - shows number of discovered tenant configurations. |
| 99 | + * "Replication" row - information about the replication of tenants/alerts/silences over instances. |
| 100 | + * "Tenant Configuration Sync" row - information about the configuration sync procedure. |
| 101 | + * "Sharding Initial State Sync" row - information about the initial state sync procedure when sharding is enabled. |
| 102 | + * "Sharding Runtime State Sync" row - information about various state operations which occur when sharding is enabled (replication, fetch, merge, persist). |
| 103 | +* [BUGFIX] Fixed `CortexIngesterHasNotShippedBlocks` alert false positive in case an ingester instance had ingested samples in the past, then no traffic was received for a long period and then it started receiving samples again. #308 |
| 104 | +* [BUGFIX] Alertmanager: fixed `--alertmanager.cluster.peers` CLI flag passed to alertmanager when HA is enabled. #329 |
| 105 | +* [BUGFIX] Fixed `CortexInconsistentRuntimeConfig` metric. #335 |
| 106 | +* [BUGFIX] Fixed scaling dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #365 |
| 107 | +* [BUGFIX] Fixed rollout progress dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #366 |
| 108 | + |
104 | 109 | ## 1.9.0 / 2021-05-18
|
105 | 110 |
|
106 | 111 | * [CHANGE] Replace use of removed Cortex CLI flag `-querier.compress-http-responses` for query frontend with `-api.response-compression-enabled`. #299
|
|