Commit 55597cf

[AWS Master] Kubernetes: add logging stack (#1063)
* Add dir for grafana loki chart
* Switch from loki to ELK stack
* Longhorn Readme. Fix typo
* Further configuration
* Final draft ELK configuration
* add victoria logs
* Introduce victoria logs and auth
* vl ha configuration with 2 charts + 1 vmauth chart
* Converge on a single replicated victoria logs chart
* Remove elastic stack chart
* Remove vector chart (already included in victoria logs chart)
* Fix gui trailing slash issue with a href
1 parent a96bb9e commit 55597cf

7 files changed, 86 additions and 4 deletions

charts/aws-ebs-csi-driver/values.yaml.gotmpl

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ image:
   tag: "v1.38.1"
 
 storageClasses:
-  - name: "ebs-sc"
+  - name: "{{ .Values.ebsStorageClassName }}"
     parameters:
       type: "gp3"
     allowVolumeExpansion: true

charts/longhorn/README.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 ### Can LH be used for critical services (e.g., Databases)?
 
-No (as of now). , we should not use it for volumes of critical services.
+No. We should not use it for volumes of critical services.
 
 As of now, we should avoid using LH for critical services. Instead, we should rely on easier-to-maintain solutions (e.g., application-level replication [Postgres Operators], S3, etc.). Once we get hands-on experience, extensive monitoring and ability to scale LH, we can consider using it for critical services.

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 persistence:
   enabled: true
   size: "1Gi" # minimal size for gp3 is 1Gi
-  storageClass: "ebs-sc"
+  storageClass: "{{ .Values.ebsStorageClassName }}"
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 persistence:
   enabled: true
   size: "300Mi" # cannot be lower https://github.com/longhorn/longhorn/issues/8488
-  storageClass: "{{.Values.longhornStorageClassName}}"
+  storageClass: "{{ .Values.longhornStorageClassName }}"

charts/traefik/values.secure.yaml.gotmpl

Lines changed: 21 additions & 0 deletions
@@ -60,6 +60,27 @@ extraObjects:
         prefixes:
           - /longhorn
 
+  # a (href) links do not work properly without trailing slash
+  - apiVersion: traefik.io/v1alpha1
+    kind: Middleware
+    metadata:
+      name: logs-append-slash
+      namespace: {{ .Release.Namespace }}
+    spec:
+      redirectRegex:
+        regex: "^(https?://[^/]+/logs)$"
+        replacement: "${1}/"
+
+  - apiVersion: traefik.io/v1alpha1
+    kind: Middleware
+    metadata:
+      name: logs-strip-prefix
+      namespace: {{.Release.Namespace}}
+    spec:
+      stripPrefix:
+        prefixes:
+          - /logs
+
   - apiVersion: traefik.io/v1alpha1
     kind: Middleware
     metadata:
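
Review note: the two middlewares added above can be sanity-checked outside the cluster. The following is a minimal Go sketch, not Traefik's actual implementation. It assumes that `redirectRegex` matches against the full request URL and that `stripPrefix` removes a leading path prefix and keeps a leading slash; the host `example.com` and the function names `appendSlash` and `stripLogsPrefix` are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Same pattern as the logs-append-slash middleware in the diff above.
var logsWithoutSlash = regexp.MustCompile(`^(https?://[^/]+/logs)$`)

// appendSlash returns the redirect target and true when the URL is exactly
// <scheme>://<host>/logs, otherwise the unchanged URL and false.
func appendSlash(url string) (string, bool) {
	if !logsWithoutSlash.MatchString(url) {
		return url, false
	}
	// "${1}/" re-emits the first capture group followed by a slash.
	return logsWithoutSlash.ReplaceAllString(url, "${1}/"), true
}

// stripLogsPrefix is a simplified model of the logs-strip-prefix middleware:
// it drops a leading "/logs" and makes sure the result starts with "/".
func stripLogsPrefix(path string) string {
	const prefix = "/logs"
	if !strings.HasPrefix(path, prefix) {
		return path
	}
	rest := strings.TrimPrefix(path, prefix)
	if !strings.HasPrefix(rest, "/") {
		rest = "/" + rest
	}
	return rest
}

func main() {
	// example.com is a placeholder host, not the real K8S_MONITORING_FQDN.
	for _, u := range []string{
		"https://example.com/logs",
		"https://example.com/logs/",
		"https://example.com/logs/select",
	} {
		target, redirected := appendSlash(u)
		fmt.Printf("%s -> %s (redirected: %t)\n", u, target, redirected)
	}
	fmt.Println(stripLogsPrefix("/logs/select"))
}
```

Only the bare `/logs` URL is redirected; URLs that already continue past `/logs` fall through to the strip-prefix step, so the backend sees paths relative to its own root.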

charts/victoria-logs/README.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Highly Available Configuration with Helm:
+* https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9076
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+# https://github.com/VictoriaMetrics/helm-charts/blob/victoria-logs-single-0.11.2/charts/victoria-logs-single/values.yaml
+
+vector:
+  # by default it will generate sink per statefulset's pod
+  # each pod has a separate PV, so the data is replicated
+  enabled: true
+
+server:
+  # HA through multiple replicas
+  # https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9076
+  replicaCount: 2
+
+  retentionPeriod: 30d
+
+  ingress:
+    enabled: true
+    annotations:
+      namespace: "{{ .Release.Namespace }}"
+      cert-manager.io/cluster-issuer: "cert-issuer"
+      traefik.ingress.kubernetes.io/router.entrypoints: websecure
+      traefik.ingress.kubernetes.io/router.middlewares: traefik-logs-append-slash@kubernetescrd,traefik-logs-strip-prefix@kubernetescrd,traefik-traefik-basic-auth@kubernetescrd # namespace + middleware name
+    tls:
+      - hosts:
+          - {{ requiredEnv "K8S_MONITORING_FQDN" }}
+        secretName: monitoring-tls
+    hosts:
+      - name: {{ requiredEnv "K8S_MONITORING_FQDN" }}
+        path:
+          - /logs
+    pathType: Prefix
+
+  persistentVolume:
+    enabled: true
+    storageClassName: "{{ .Values.ebsStorageClassName }}"
+    size: 10Gi
+
+  nodeSelector:
+    ops: "true"
+
+  # Schedule pods on different nodes if possible (HA)
+  # https://stackoverflow.com/a/64958458/12124525
+  topologySpreadConstraints:
+    - maxSkew: 1
+      topologyKey: "kubernetes.io/hostname"
+      whenUnsatisfiable: DoNotSchedule
+      # hardcoded due to https://github.com/VictoriaMetrics/helm-charts/issues/2219
+      labelSelector:
+        matchLabels:
+          app: server
+          app.kubernetes.io/instance: victoria-logs
+          app.kubernetes.io/name: victoria-logs-single
+
+  resources:
+    limits:
+      cpu: 500m
+      memory: 512Mi
+    requests:
+      cpu: 500m
+      memory: 512Mi
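
Review note: the `topologySpreadConstraints` block above combines `maxSkew: 1` with `whenUnsatisfiable: DoNotSchedule`. The Go sketch below is a deliberately simplified model of that check, not the Kubernetes scheduler's code: it assumes every listed node is eligible (for example, carries the `ops: "true"` label) and ignores options such as `minDomains`. The function name `fits` and the node names are hypothetical.

```go
package main

import "fmt"

// fits reports whether one more matching pod may be placed on node without
// exceeding maxSkew. counts maps each eligible node (one topology domain per
// hostname) to the number of matching pods already running there; node must
// be a key of counts.
func fits(counts map[string]int, node string, maxSkew int) bool {
	lowest := counts[node]
	for _, c := range counts {
		if c < lowest {
			lowest = c
		}
	}
	// Skew after placement: pods on the candidate node (including the new
	// one) minus the smallest count across all eligible nodes.
	return counts[node]+1-lowest <= maxSkew
}

func main() {
	// Two eligible nodes, first replica already on node-a.
	twoNodes := map[string]int{"node-a": 1, "node-b": 0}
	fmt.Println("second replica on node-a:", fits(twoNodes, "node-a", 1))
	fmt.Println("second replica on node-b:", fits(twoNodes, "node-b", 1))

	// Only one eligible node: the second replica can still land on it.
	oneNode := map[string]int{"node-a": 1}
	fmt.Println("second replica on the only node:", fits(oneNode, "node-a", 1))
}
```

Under this model, two replicas are forced onto different nodes when at least two eligible nodes exist, while a single eligible node can still host both, which matches the "if possible" wording in the values comment.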
