Commit 8b5bcfc

[tempo-distributed] added more autoscaling configurations to Tempo components
1 parent 8ba262e commit 8b5bcfc

7 files changed, 207 additions and 4 deletions

charts/tempo-distributed/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ apiVersion: v2
 name: tempo-distributed
 description: Grafana Tempo in MicroService mode
 type: application
-version: 1.47.4
+version: 1.48.0
 appVersion: 2.8.2
 engine: gotpl
 home: https://grafana.com/docs/tempo/latest/

charts/tempo-distributed/README.md

Lines changed: 12 additions & 0 deletions
@@ -361,6 +361,8 @@ The memcached default args are removed and should be provided manually. The sett
 | distributor.appProtocol.grpc | string | `nil` | Set the optional grpc service protocol. Ex: "grpc", "http2" or "https" |
 | distributor.autoscaling.behavior | object | `{}` | Autoscaling behavior configuration for the distributor |
 | distributor.autoscaling.enabled | bool | `false` | Enable autoscaling for the distributor |
+| distributor.autoscaling.keda | object | `{"enabled":false,"triggers":[]}` | Autoscaling via keda/ScaledObject |
+| distributor.autoscaling.keda.triggers | list | `[]` | List of autoscaling triggers for the distributor |
 | distributor.autoscaling.maxReplicas | int | `3` | Maximum autoscaling replicas for the distributor |
 | distributor.autoscaling.minReplicas | int | `1` | Minimum autoscaling replicas for the distributor |
 | distributor.autoscaling.targetCPUUtilizationPercentage | int | `60` | Target CPU utilisation percentage for the distributor |
@@ -573,6 +575,8 @@ The memcached default args are removed and should be provided manually. The sett
 | ingester.appProtocol.grpc | string | `nil` | Set the optional grpc service protocol. Ex: "grpc", "http2" or "https" |
 | ingester.autoscaling.behavior | object | `{}` | Autoscaling behavior configuration for the ingester |
 | ingester.autoscaling.enabled | bool | `false` | Enable autoscaling for the ingester. WARNING: Autoscaling ingesters can result in lost data. Only do this if you know what you're doing. |
+| ingester.autoscaling.keda | object | `{"enabled":false,"triggers":[]}` | Autoscaling via keda/ScaledObject |
+| ingester.autoscaling.keda.triggers | list | `[]` | List of autoscaling triggers for the ingester |
 | ingester.autoscaling.maxReplicas | int | `3` | Maximum autoscaling replicas for the ingester |
 | ingester.autoscaling.minReplicas | int | `2` | Minimum autoscaling replicas for the ingester |
 | ingester.autoscaling.targetCPUUtilizationPercentage | int | `60` | Target CPU utilisation percentage for the ingester |
@@ -723,6 +727,14 @@ The memcached default args are removed and should be provided manually. The sett
 | metricsGenerator.annotations | object | `{}` | Annotations for the metrics-generator StatefulSet |
 | metricsGenerator.appProtocol | object | `{"grpc":null}` | Adds the appProtocol field to the metricsGenerator service. This allows metricsGenerator to work with istio protocol selection. |
 | metricsGenerator.appProtocol.grpc | string | `nil` | Set the optional grpc service protocol. Ex: "grpc", "http2" or "https" |
+| metricsGenerator.autoscaling.behavior | object | `{}` | Autoscaling behavior configuration for the metrics-generator |
+| metricsGenerator.autoscaling.enabled | bool | `false` | Enable autoscaling for the metrics-generator. WARNING: Autoscaling metrics-generators can result in lost data. Only do this if you know what you're doing. Scaling down metrics-generators can cause backpressure on the distributor. |
+| metricsGenerator.autoscaling.keda | object | `{"enabled":false,"triggers":[]}` | Autoscaling via keda/ScaledObject |
+| metricsGenerator.autoscaling.keda.triggers | list | `[]` | List of autoscaling triggers for the metrics-generator |
+| metricsGenerator.autoscaling.maxReplicas | int | `3` | Maximum autoscaling replicas for the metrics-generator |
+| metricsGenerator.autoscaling.minReplicas | int | `2` | Minimum autoscaling replicas for the metrics-generator |
+| metricsGenerator.autoscaling.targetCPUUtilizationPercentage | int | `60` | Target CPU utilisation percentage for the metrics-generator |
+| metricsGenerator.autoscaling.targetMemoryUtilizationPercentage | string | `nil` | Target memory utilisation percentage for the metrics-generator |
 | metricsGenerator.config | object | `{"metrics_ingestion_time_range_slack":"30s","processor":{"service_graphs":{"dimensions":[],"histogram_buckets":[0.1,0.2,0.4,0.8,1.6,3.2,6.4,12.8],"max_items":10000,"wait":"10s","workers":10},"span_metrics":{"dimensions":[],"histogram_buckets":[0.002,0.004,0.008,0.016,0.032,0.064,0.128,0.256,0.512,1.02,2.05,4.1]}},"registry":{"collection_interval":"15s","external_labels":{},"stale_duration":"15m"},"storage":{"path":"/var/tempo/wal","remote_write":[],"remote_write_add_org_id_header":true,"remote_write_flush_deadline":"1m","wal":null},"traces_storage":{"path":"/var/tempo/traces"}}` | More information on configuration: https://grafana.com/docs/tempo/latest/configuration/#metrics-generator |
 | metricsGenerator.config.processor.service_graphs | object | `{"dimensions":[],"histogram_buckets":[0.1,0.2,0.4,0.8,1.6,3.2,6.4,12.8],"max_items":10000,"wait":"10s","workers":10}` | For processors to be enabled and generate metrics, pass the names of the processors to `overrides.defaults.metrics_generator.processors` value like `[service-graphs, span-metrics]`. |
 | metricsGenerator.config.processor.service_graphs.dimensions | list | `[]` | The resource and span attributes to be added to the service graph metrics, if present. |
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+{{- if and .Values.distributor.autoscaling.enabled .Values.distributor.autoscaling.keda.enabled }}
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: {{ include "tempo.resourceName" (dict "ctx" . "component" "distributor") }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "tempo.labels" (dict "ctx" . "component" "distributor") | nindent 4 }}
+spec:
+  minReplicaCount: {{ .Values.distributor.autoscaling.minReplicas }}
+  maxReplicaCount: {{ .Values.distributor.autoscaling.maxReplicas }}
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: {{ include "tempo.resourceName" (dict "ctx" . "component" "distributor") }}
+  triggers:
+  {{- range .Values.distributor.autoscaling.keda.triggers }}
+  - type: {{ .type | quote }}
+    metadata:
+      serverAddress: {{ .metadata.serverAddress }}
+      threshold: {{ .metadata.threshold | quote }}
+      query: |
+        {{- .metadata.query | nindent 8 }}
+      {{- if .metadata.customHeaders }}
+      customHeaders: {{ .metadata.customHeaders }}
+      {{- end }}
+  {{- end }}
+{{- end }}
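As a rough illustration of what the distributor template above produces, here is a hand-rendered sketch for a single Prometheus trigger. The release name `tempo`, the namespace `observability`, the Prometheus address, the threshold, and the resulting resource name are all assumptions made for this example. The replica counts are the chart defaults listed in the README. The labels emitted by `tempo.labels` are left out because the helper is not part of this diff.

```yaml
# Hypothetical render; names and addresses are illustrative, not taken from the chart.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: tempo-distributor            # assumed output of tempo.resourceName
  namespace: observability           # assumed release namespace
  labels: {}                         # placeholder; the real labels come from tempo.labels
spec:
  minReplicaCount: 1                 # distributor.autoscaling.minReplicas (default)
  maxReplicaCount: 3                 # distributor.autoscaling.maxReplicas (default)
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tempo-distributor
  triggers:
  - type: "prometheus"
    metadata:
      serverAddress: http://prometheus.example:9090   # rendered unquoted by the template
      threshold: "1000"                                # always rendered as a quoted string
      query: |
        sum(rate(tempo_distributor_spans_received_total[5m]))
```

Only `serverAddress`, `threshold`, `query`, and the optional `customHeaders` are copied from each trigger's `metadata`, so any other scaler metadata key set in values would not appear in the output.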
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+{{- if and .Values.ingester.autoscaling.enabled .Values.ingester.autoscaling.keda.enabled }}
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: {{ include "tempo.resourceName" (dict "ctx" . "component" "ingester") }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "tempo.labels" (dict "ctx" . "component" "ingester") | nindent 4 }}
+spec:
+  minReplicaCount: {{ .Values.ingester.autoscaling.minReplicas }}
+  maxReplicaCount: {{ .Values.ingester.autoscaling.maxReplicas }}
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: StatefulSet
+    name: {{ include "tempo.resourceName" (dict "ctx" . "component" "ingester") }}
+  triggers:
+  {{- range .Values.ingester.autoscaling.keda.triggers }}
+  - type: {{ .type | quote }}
+    metadata:
+      serverAddress: {{ .metadata.serverAddress }}
+      threshold: {{ .metadata.threshold | quote }}
+      query: |
+        {{- .metadata.query | nindent 8 }}
+      {{- if .metadata.customHeaders }}
+      customHeaders: {{ .metadata.customHeaders }}
+      {{- end }}
+  {{- end }}
+{{- end }}
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+{{- if .Values.metricsGenerator.autoscaling.enabled }}
+{{- $apiVersion := include "tempo.hpa.apiVersion" . -}}
+apiVersion: {{ $apiVersion }}
+kind: HorizontalPodAutoscaler
+metadata:
+  name: {{ include "tempo.resourceName" (dict "ctx" . "component" "metricsGenerator") }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "tempo.labels" (dict "ctx" . "component" "metricsGenerator") | nindent 4 }}
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: StatefulSet
+    name: {{ include "tempo.resourceName" (dict "ctx" . "component" "metricsGenerator") }}
+  minReplicas: {{ .Values.metricsGenerator.autoscaling.minReplicas }}
+  maxReplicas: {{ .Values.metricsGenerator.autoscaling.maxReplicas }}
+  {{- with .Values.metricsGenerator.autoscaling.behavior }}
+  behavior:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+  metrics:
+  {{- with .Values.metricsGenerator.autoscaling.targetMemoryUtilizationPercentage }}
+    - type: Resource
+      resource:
+        name: memory
+        {{- if (eq $apiVersion "autoscaling/v2") }}
+        target:
+          type: Utilization
+          averageUtilization: {{ . }}
+        {{- else }}
+        targetAverageUtilization: {{ . }}
+        {{- end }}
+  {{- end }}
+  {{- with .Values.metricsGenerator.autoscaling.targetCPUUtilizationPercentage }}
+    - type: Resource
+      resource:
+        name: cpu
+        {{- if (eq $apiVersion "autoscaling/v2") }}
+        target:
+          type: Utilization
+          averageUtilization: {{ . }}
+        {{- else }}
+        targetAverageUtilization: {{ . }}
+        {{- end }}
+  {{- end }}
+{{- end }}
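A minimal values sketch for exercising the metrics-generator HPA template above. The replica counts, the memory target, and the `behavior` settings are illustrative assumptions, not recommendations from the chart. The `behavior` map is passed through `toYaml` unchanged, so it should follow the standard `autoscaling/v2` HorizontalPodAutoscaler schema.

```yaml
# Hypothetical override file; all numbers are placeholders to tune per cluster.
metricsGenerator:
  autoscaling:
    enabled: true                          # the HPA template is gated on this flag alone
    minReplicas: 2
    maxReplicas: 4
    targetCPUUtilizationPercentage: 60
    targetMemoryUtilizationPercentage: 80  # optional; the memory metric is skipped when unset
    behavior:
      scaleDown:
        # Scale down slowly, since removing metrics-generators can cause
        # backpressure on the distributor (per the chart's own warning).
        stabilizationWindowSeconds: 600
        policies:
          - type: Pods
            value: 1
            periodSeconds: 300
```

As written in this commit, the HPA renders whenever `metricsGenerator.autoscaling.enabled` is true, while the ScaledObject additionally requires `keda.enabled`. Setting both flags would therefore appear to create both objects for the same workload.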
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+{{- if and .Values.metricsGenerator.autoscaling.enabled .Values.metricsGenerator.autoscaling.keda.enabled }}
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: {{ include "tempo.resourceName" (dict "ctx" . "component" "metricsGenerator") }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "tempo.labels" (dict "ctx" . "component" "metricsGenerator") | nindent 4 }}
+spec:
+  minReplicaCount: {{ .Values.metricsGenerator.autoscaling.minReplicas }}
+  maxReplicaCount: {{ .Values.metricsGenerator.autoscaling.maxReplicas }}
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: {{ include "tempo.resourceName" (dict "ctx" . "component" "metricsGenerator") }}
+  triggers:
+  {{- range .Values.metricsGenerator.autoscaling.keda.triggers }}
+  - type: {{ .type | quote }}
+    metadata:
+      serverAddress: {{ .metadata.serverAddress }}
+      threshold: {{ .metadata.threshold | quote }}
+      query: |
+        {{- .metadata.query | nindent 8 }}
+      {{- if .metadata.customHeaders }}
+      customHeaders: {{ .metadata.customHeaders }}
+      {{- end }}
+  {{- end }}
+{{- end }}

charts/tempo-distributed/values.yaml

Lines changed: 64 additions & 3 deletions
@@ -150,6 +150,22 @@ ingester:
     targetCPUUtilizationPercentage: 60
     # -- Target memory utilisation percentage for the ingester
     targetMemoryUtilizationPercentage:
+    # -- Autoscaling via keda/ScaledObject
+    keda:
+      # requires https://keda.sh/
+      enabled: false
+      # -- List of autoscaling triggers for the ingester
+      triggers: []
+      # - type: prometheus
+      #   metadata:
+      #     serverAddress: "http://<prometheus-host>:9090"
+      #     threshold: "<set to a value below your rate limit>"
+      # -- KEDA averages the query result across the current replicas, so set the threshold as a per-replica value.
+      #     query: |-
+      #       sum(
+      #         tempo_ingester_traces_created_total{cluster="$cluster", namespace="$namespace"}
+      #       ) by (name)
+      #     customHeaders: X-Scope-OrgID=<tenant-id>
   image:
     # -- The Docker registry for the ingester image. Overrides `tempo.image.registry`
     registry: null
@@ -337,6 +353,37 @@ metricsGenerator:
   #      - domain.tld
   # -- Init containers for the metrics generator pod
   initContainers: []
+  autoscaling:
+    # -- Enable autoscaling for the metrics-generator. WARNING: Autoscaling metrics-generators can result in lost data. Only do this if you know what you're doing.
+    # Scaling down metrics-generators can cause backpressure on the distributor.
+    enabled: false
+    # -- Minimum autoscaling replicas for the metrics-generator
+    minReplicas: 2
+    # -- Maximum autoscaling replicas for the metrics-generator
+    maxReplicas: 3
+    # -- Autoscaling behavior configuration for the metrics-generator
+    behavior: {}
+    # -- Target CPU utilisation percentage for the metrics-generator
+    targetCPUUtilizationPercentage: 60
+    # -- Target memory utilisation percentage for the metrics-generator
+    targetMemoryUtilizationPercentage:
+    # -- Autoscaling via keda/ScaledObject
+    keda:
+      # requires https://keda.sh/
+      enabled: false
+      # -- List of autoscaling triggers for the metrics-generator
+      triggers: []
+      # - type: prometheus
+      #   metadata:
+      #     serverAddress: "http://<prometheus-host>:9090"
+      #     threshold: "<set to a value below your rate limit>"
+      # -- KEDA averages the query result across the current replicas, so set the threshold as a per-replica value.
+      # -- This example scales on distributor queue length to alleviate backpressure.
+      #     query: |-
+      #       sum(
+      #         tempo_distributor_queue_length{namespace=~".*"}
+      #       ) by (name)
+      #     customHeaders: X-Scope-OrgID=<tenant-id>
   image:
     # -- The Docker registry for the metrics-generator image. Overrides `tempo.image.registry`
     registry: null
@@ -506,6 +553,22 @@ distributor:
     targetCPUUtilizationPercentage: 60
     # -- Target memory utilisation percentage for the distributor
     targetMemoryUtilizationPercentage:
+    # -- Autoscaling via keda/ScaledObject
+    keda:
+      # requires https://keda.sh/
+      enabled: false
+      # -- List of autoscaling triggers for the distributor
+      triggers: []
+      # - type: prometheus
+      #   metadata:
+      #     serverAddress: "http://<prometheus-host>:9090"
+      #     threshold: "<set to a value below your rate limit>"
+      # -- KEDA averages the query result across the current replicas, so set the threshold as a per-replica value.
+      #     query: |-
+      #       sum by(cluster) (
+      #         rate(tempo_distributor_spans_received_total{namespace=~".*"}[5m])
+      #       )
+      #     customHeaders: X-Scope-OrgID=<tenant-id>
   image:
     # -- The Docker registry for the distributor image. Overrides `tempo.image.registry`
     registry: null
@@ -665,12 +728,10 @@ compactor:
       #   metadata:
       #     serverAddress: "http://<prometheus-host>:9090"
       #     threshold: "250"
+      # -- KEDA averages the query result across the current replicas, so set the threshold as a per-replica value.
       #     query: |-
       #       sum by (cluster, namespace, tenant) (
       #         tempodb_compaction_outstanding_blocks{container="compactor", namespace=~".*"}
-      #       ) /
-      #       ignoring(tenant) group_left count by (cluster, namespace)(
-      #         tempo_build_info{container="compactor", namespace=~".*"}
       #       )
       #     customHeaders: X-Scope-OrgID=<tenant-id>
 
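To show how the new `keda` values fit together, here is a hedged sketch of an override file that enables a KEDA trigger for the distributor. The Prometheus address, namespace selector, rate window, threshold, and replica counts are assumptions chosen for illustration. Only the key names and the metric name come from the diff above.

```yaml
# Hypothetical values override; adjust every value to your environment.
distributor:
  autoscaling:
    enabled: true            # required: the ScaledObject template checks this flag as well
    minReplicas: 2
    maxReplicas: 6
    keda:
      enabled: true          # requires KEDA (https://keda.sh/) to be installed in the cluster
      triggers:
        - type: prometheus
          metadata:
            serverAddress: "http://prometheus.monitoring.svc:9090"  # assumed address
            threshold: "50000"                                      # assumed spans/s per replica
            query: |-
              sum(
                rate(tempo_distributor_spans_received_total{namespace="tempo"}[5m])
              )
```

The template quotes `threshold` itself, so a bare number in values should also work. `customHeaders` is optional and is only rendered when set.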
