This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit f0ed263

Merge pull request #369 from grafana/config-changes
Improve config settings based on recent learnings
2 parents ee591ee + 7372c4c commit f0ed263

5 files changed, with 14 additions and 2 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -21,6 +21,8 @@
 * [CHANGE] Removed `CortexQuerierCapacityFull` alert. #342
 * [CHANGE] Changes blocks storage alerts to group metrics by the configured `cluster_labels` (supporting the deprecated `alert_aggregation_labels`). #351
 * [CHANGE] Increased `CortexIngesterReachingSeriesLimit` critical alert threshold from 80% to 85%. #363
+* [CHANGE] Decreased `-server.grpc-max-concurrent-streams` from 100k to 10k. #369
+* [CHANGE] Decreased blocks storage ingesters graceful termination period from 80m to 20m. #369
 * [ENHANCEMENT] cortex-mixin: Make `cluster_namespace_deployment:kube_pod_container_resource_requests_{cpu_cores,memory_bytes}:sum` backwards compatible with `kube-state-metrics` v2.0.0. #317
 * [ENHANCEMENT] Cortex-mixin: Include `cortex-gw-internal` naming variation in default `gateway` job names. #328
 * [ENHANCEMENT] Ruler dashboard: added object storage metrics. #354
@@ -38,11 +40,13 @@
 * "Tenant Configuration Sync" row - information about the configuration sync procedure.
 * "Sharding Initial State Sync" row - information about the initial state sync procedure when sharding is enabled.
 * "Sharding Runtime State Sync" row - information about various state operations which occur when sharding is enabled (replication, fetch, merge, persist).
+* [ENHANCEMENT] Added 256MB memory ballast to querier. #369
 * [BUGFIX] Fixed `CortexIngesterHasNotShippedBlocks` alert false positive in case an ingester instance had ingested samples in the past, then no traffic was received for a long period and then it started receiving samples again. #308
 * [BUGFIX] Alertmanager: fixed `--alertmanager.cluster.peers` CLI flag passed to alertmanager when HA is enabled. #329
 * [BUGFIX] Fixed `CortexInconsistentRuntimeConfig` metric. #335
 * [BUGFIX] Fixed scaling dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #365
 * [BUGFIX] Fixed rollout progress dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #366
+* [BUGFIX] Fixed `-distributor.extend-writes` setting on ruler when `unregister_ingesters_on_shutdown` is disabled. #369

 ## 1.9.0 / 2021-05-18

cortex/ingester.libsonnet

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@
 'ingester.max-series-per-query': $._config.limits.max_series_per_query,
 'ingester.max-samples-per-query': $._config.limits.max_samples_per_query,
 'runtime-config.file': '/etc/cortex/overrides.yaml',
-'server.grpc-max-concurrent-streams': 100000,
+'server.grpc-max-concurrent-streams': 10000,
 'server.grpc-max-send-msg-size-bytes': 10 * 1024 * 1024,
 'server.grpc-max-recv-msg-size-bytes': 10 * 1024 * 1024,
 } + (
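
Deployments that still need the previous stream limit could override the flag downstream rather than patching the library. The following is a minimal sketch, assuming the ingester flags live in a hidden field named `ingester_args` (the field name is not visible in this hunk, so treat it as an assumption); `lib` is a hypothetical stand-in for the imported library.

```jsonnet
// Hypothetical stand-in for the library: only the one flag touched by this change.
// The field name `ingester_args` is an assumption, not taken from the hunk.
local lib = {
  ingester_args:: {
    'server.grpc-max-concurrent-streams': 10000,
  },
};

// Downstream override: `+::` merges into the hidden field instead of replacing it.
local env = lib {
  ingester_args+:: {
    'server.grpc-max-concurrent-streams': 100000,
  },
};

// Expose the merged arguments so the result can be inspected.
{ ingester_args: env.ingester_args }
```

Using `+::` keeps any other flags defined by the library intact, so only the stream limit differs from the new default.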

cortex/querier.libsonnet

Lines changed: 4 additions & 0 deletions
@@ -26,6 +26,10 @@

 'querier.second-store-engine': $._config.querier_second_storage_engine,

+// We request high memory but the Go heap is typically very low (< 100MB) and this causes
+// the GC to trigger continuously. Setting a ballast of 256MB reduces GC.
+'mem-ballast-size-bytes': 1 << 28, // 256M
+
 'log.level': 'debug',
 },
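
The ballast size is written as a bit shift, which is easy to misread. As a small standalone check (the field name `ballast_mib` and the local `MiB` are illustrative only), the snippet below confirms that `1 << 28` bytes is 256 MiB.

```jsonnet
// Value used for 'mem-ballast-size-bytes' in the hunk above.
local ballast = 1 << 28;
// Illustrative helper constant: one mebibyte in bytes.
local MiB = 1024 * 1024;

{
  'mem-ballast-size-bytes': ballast,
  // Same value expressed in MiB, for readability.
  ballast_mib: ballast / MiB,
  // Evaluation fails if the shift does not equal 256 MiB.
  assert ballast == 256 * MiB : 'ballast should be 256 MiB',
}
```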

cortex/ruler.libsonnet

Lines changed: 4 additions & 0 deletions
@@ -29,6 +29,10 @@

 // Storage
 'querier.second-store-engine': $._config.querier_second_storage_engine,
+
+// Do not extend the replication set on unhealthy (or LEAVING) ingester when "unregister on shutdown"
+// is set to false.
+'distributor.extend-writes': $._config.unregister_ingesters_on_shutdown,
 },

 ruler_container::
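
To see how the ruler flag follows the shared setting, here is a minimal self-contained sketch. It assumes the flags are collected in a hidden field called `ruler_args` (a name not shown in this hunk) and renders them as CLI arguments through a hypothetical `flags` field.

```jsonnet
{
  _config:: {
    // Shared setting; `false` means ingesters stay in the ring on shutdown.
    unregister_ingesters_on_shutdown: false,
  },

  // Assumed field name for the ruler flags; only the flag from this change is shown.
  ruler_args:: {
    'distributor.extend-writes': $._config.unregister_ingesters_on_shutdown,
  },

  // Hypothetical rendering of the flags as CLI arguments, for inspection only.
  flags: [
    '-%s=%s' % [name, $.ruler_args[name]]
    for name in std.objectFields($.ruler_args)
  ],
}
```

Because the flag reads from `_config`, flipping `unregister_ingesters_on_shutdown` changes the ruler's `-distributor.extend-writes` value with it, so the two cannot drift apart.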

cortex/tsdb.libsonnet

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@
 statefulSet.mixin.spec.template.spec.securityContext.withRunAsUser(0) +
 // When the ingester needs to flush blocks to the storage, it may take quite a lot of time.
 // For this reason, we grant a high termination period (20 minutes).
-statefulSet.mixin.spec.template.spec.withTerminationGracePeriodSeconds(4800) +
+statefulSet.mixin.spec.template.spec.withTerminationGracePeriodSeconds(1200) +
 statefulSet.mixin.spec.updateStrategy.withType('RollingUpdate') +
 $.util.configVolumeMount($._config.overrides_configmap, '/etc/cortex') +
 $.util.podPriority('high') +
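
The grace period is passed in seconds, while the changelog entry quotes minutes. A small sketch (the helper `minutes` and the field names are illustrative, not part of the library) makes the conversion explicit.

```jsonnet
// Illustrative helper: convert minutes to seconds.
local minutes(n) = n * 60;

{
  previous_grace_period_seconds: minutes(80),
  new_grace_period_seconds: minutes(20),
  // Both values match the arguments passed to withTerminationGracePeriodSeconds.
  assert self.previous_grace_period_seconds == 4800,
  assert self.new_grace_period_seconds == 1200,
}
```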
