Commit 34a86fd

Merge remote-tracking branch 'upstream/main'
2 parents: c0f393e + 04037f6

31 files changed: +435 −136 lines changed

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 9 additions & 1 deletion

```diff
@@ -18,8 +18,16 @@
 - [ ] Service has placement constraints or is global
 - [ ] Service is restartable
 - [ ] Service restart is zero-downtime
+- [ ] Service has >1 replicas in PROD
+- [ ] Service has docker healthcheck enabled
 - [ ] Service is monitored (via prometheus and grafana)
 - [ ] Service is not bound to one specific node (e.g. via files or volumes)
 - [ ] Relevant OPS E2E Test are added
+
+If exposed via traefik
 - [ ] Service's Public URL is included in maintenance mode
-- [ ] Service's Public URL is included in testing mode -->
+- [ ] Service's Public URL is included in testing mode
+- [ ] Service has Traefik (Service Loadbalancer) Healthcheck enabled
+- [ ] Credentials page is updated
+- [ ] URL added to e2e test services (e2e test checking that URL can be accessed)
+-->
```

.gitignore

Lines changed: 1 addition & 1 deletion

```diff
@@ -142,7 +142,7 @@ yq
 **/.env-devel
 **/.stack.*.yml
 **/.stack.*.yaml
-docker-compose.yml
+
 stack.yml
 stack_with_prefix.yml
 docker-compose.simcore.yml
```
Lines changed: 28 additions & 0 deletions

## How to delete volumes with `reclaimPolicy: retain`

1. Delete the PVC:

   ```
   kubectl delete pvc <pvc-name>
   ```

2. Verify the PV is `Released`:

   ```
   kubectl get pv <pv-name>
   ```

3. Manually remove the EBS Volume in AWS:

   1. Go to the AWS GUI and list EBS Volumes
   1. Filter by tag `ebs.csi.aws.com/cluster=true`
   1. Identify the volume associated with your PV (check the `kubernetes.io/created-for/pv/name` tag of the EBS Volume)
   1. Verify that the EBS Volume is `Available`
   1. Delete the EBS Volume

4. Delete the PV:

   ```
   kubectl delete pv <pv-name>
   ```

5. Remove finalizers (if necessary). If the PV remains in a `Terminating` state, remove its finalizers:

   ```
   kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}'
   ```
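The steps above can be sketched as a small dry-run script. This is a hedged sketch: the resource names are placeholders, and the EBS Volume itself (step 3) must still be deleted via the AWS console or CLI.

```shell
#!/usr/bin/env sh
# Dry-run sketch of the PVC/PV cleanup steps above. All names are
# placeholders; the EBS volume deletion (step 3) stays a manual AWS step.
PVC_NAME="example-pvc"   # placeholder
PV_NAME="example-pv"     # placeholder

# finalizer-stripping patch from step 5
PATCH='{"metadata":{"finalizers":null}}'

run() { echo "+ $*"; }   # prints each command instead of executing it

run kubectl delete pvc "$PVC_NAME"
run kubectl get pv "$PV_NAME"                 # expect STATUS: Released
run kubectl delete pv "$PV_NAME"
run kubectl patch pv "$PV_NAME" -p "$PATCH"   # only if stuck Terminating
```

Replace the `run` wrapper with direct execution once the printed commands look right.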

charts/aws-ebs-csi-driver/values.yaml.gotmpl

Lines changed: 1 addition & 1 deletion

```diff
@@ -5,7 +5,7 @@ image:
   tag: "v1.38.1"
 
 storageClasses:
-  - name: "ebs-sc"
+  - name: "{{ .Values.ebsStorageClassName }}"
    parameters:
      type: "gp3"
    allowVolumeExpansion: true
```
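Several charts below consume this same `ebsStorageClassName` value. The substitution can be illustrated with a quick shell check; this is a hypothetical rendering (helmfile performs the real Go-template expansion, not `sed`), and `ebs-sc` is assumed to match the previously hardcoded name.

```shell
# Hypothetical illustration of the value substitution; the real rendering
# is done by helmfile's Go templating, not sed.
template='- name: "{{ .Values.ebsStorageClassName }}"'
ebsStorageClassName="ebs-sc"   # assumed value, matching the old hardcoded name
rendered=$(printf '%s' "$template" | sed "s/{{ .Values.ebsStorageClassName }}/${ebsStorageClassName}/")
echo "$rendered"   # - name: "ebs-sc"
```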

charts/longhorn/README.md

Lines changed: 7 additions & 1 deletion

```diff
@@ -2,7 +2,7 @@
 
 ### Can LH be used for critical services (e.g., Databases)?
 
-No (as of now). , we should not use it for volumes of critical services.
+No. We should not use it for volumes of critical services.
 
 As of now, we should avoid using LH for critical services. Instead, we should rely on easier-to-maintain solutions (e.g., application-level replication [Postgres Operators], S3, etc.). Once we get hands-on experience, extensive monitoring and ability to scale LH, we can consider using it for critical services.
 
@@ -25,6 +25,12 @@ Source:
 * https://longhorn.io/kb/tip-only-use-storage-on-a-set-of-nodes/
 * https://longhorn.io/docs/1.8.1/nodes-and-volumes/nodes/default-disk-and-node-config/#customizing-default-disks-for-new-nodes
 
+### How to configure disks for LH
+
+As of now, we follow the same approach we use for the `/docker` folder (via ansible playbook), but with the `/longhorn` folder name.
+
+Issue asking LH to clearly document requirements: https://github.com/longhorn/longhorn/issues/11125
+
 ### Can workloads be run on nodes where LH is not installed?
 
 Workloads can run on nodes without LH as long as LH is not restricted to specific nodes via the `nodeSelector` or `systemManagedComponentsNodeSelector` settings. If LH is configured to run on specific nodes, workloads can only run on those nodes.
```
Lines changed: 1 addition & 1 deletion

```diff
@@ -1,4 +1,4 @@
 persistence:
   enabled: true
   size: "1Gi" # minimal size for gp3 is 1Gi
-  storageClass: "ebs-sc"
+  storageClass: "{{ .Values.ebsStorageClassName }}"
```
Lines changed: 1 addition & 1 deletion

```diff
@@ -1,4 +1,4 @@
 persistence:
   enabled: true
   size: "300Mi" # cannot be lower https://github.com/longhorn/longhorn/issues/8488
-  storageClass: "{{.Values.longhornStorageClassName}}"
+  storageClass: "{{ .Values.longhornStorageClassName }}"
```

charts/traefik/values.secure.yaml.gotmpl

Lines changed: 21 additions & 0 deletions

```diff
@@ -60,6 +60,27 @@ extraObjects:
       prefixes:
         - /longhorn
 
+  # a (href) links do not work properly without trailing slash
+  - apiVersion: traefik.io/v1alpha1
+    kind: Middleware
+    metadata:
+      name: logs-append-slash
+      namespace: {{ .Release.Namespace }}
+    spec:
+      redirectRegex:
+        regex: "^(https?://[^/]+/logs)$"
+        replacement: "${1}/"
+
+  - apiVersion: traefik.io/v1alpha1
+    kind: Middleware
+    metadata:
+      name: logs-strip-prefix
+      namespace: {{.Release.Namespace}}
+    spec:
+      stripPrefix:
+        prefixes:
+          - /logs
+
   - apiVersion: traefik.io/v1alpha1
     kind: Middleware
     metadata:
```
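The effect of the `logs-append-slash` redirect regex can be checked outside Traefik, e.g. with `sed -E` (similar extended-regex semantics to Traefik's Go `regexp`; the hostname below is a made-up example):

```shell
# Demonstrates the redirectRegex above: a bare /logs URL gains a trailing
# slash, while deeper paths are left untouched (example host is made up).
append_slash() {
  printf '%s' "$1" | sed -E 's#^(https?://[^/]+/logs)$#\1/#'
}

append_slash "https://monitoring.example.com/logs"        # -> .../logs/
append_slash "https://monitoring.example.com/logs/query"  # unchanged
```

This is why relative `a (href)` links work after the redirect: the browser resolves them against `/logs/` instead of `/`.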
Lines changed: 59 additions & 0 deletions

```yaml
# https://github.com/VictoriaMetrics/helm-charts/blob/victoria-logs-single-0.11.2/charts/victoria-logs-single/values.yaml

vector:
  # by default it will generate sink per statefulset's pod
  # each pod has a separate PV, so the data is replicated
  enabled: true

server:
  # HA through multiple replicas
  # https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9076
  replicaCount: 2

  retentionPeriod: 30d

  ingress:
    enabled: true
    annotations:
      namespace: "{{ .Release.Namespace }}"
      cert-manager.io/cluster-issuer: "cert-issuer"
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
      traefik.ingress.kubernetes.io/router.middlewares: traefik-logs-append-slash@kubernetescrd,traefik-logs-strip-prefix@kubernetescrd,traefik-traefik-basic-auth@kubernetescrd # namespace + middleware name
    tls:
      - hosts:
          - {{ requiredEnv "K8S_MONITORING_FQDN" }}
        secretName: monitoring-tls
    hosts:
      - name: {{ requiredEnv "K8S_MONITORING_FQDN" }}
        path:
          - /logs
        pathType: Prefix

  persistentVolume:
    enabled: true
    storageClassName: "{{ .Values.ebsStorageClassName }}"
    size: 10Gi

  nodeSelector:
    ops: "true"

  # Schedule pods on different nodes if possible (HA)
  # https://stackoverflow.com/a/64958458/12124525
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: "kubernetes.io/hostname"
      whenUnsatisfiable: DoNotSchedule
      # hardcoded due to https://github.com/VictoriaMetrics/helm-charts/issues/2219
      labelSelector:
        matchLabels:
          app: server
          app.kubernetes.io/instance: victoria-logs
          app.kubernetes.io/name: victoria-logs-single

  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 500m
      memory: 512Mi
```

scripts/common-services.Makefile

Lines changed: 3 additions & 0 deletions

```makefile
STACK_NAME = $(notdir $(shell pwd))
TEMP_COMPOSE=.stack.${STACK_NAME}.yaml
REPO_BASE_DIR := $(shell git rev-parse --show-toplevel)
```
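What these Make variables expand to can be seen with plain-shell equivalents (`$(notdir $(shell pwd))` behaves like `basename "$(pwd)"`); the `monitoring` directory below is an arbitrary example, and `REPO_BASE_DIR` is omitted since it needs a real git checkout:

```shell
# Shell equivalents of the Make variables above (directory name is an
# arbitrary example; REPO_BASE_DIR would come from `git rev-parse --show-toplevel`).
cd "$(mktemp -d)" && mkdir -p monitoring && cd monitoring

STACK_NAME=$(basename "$(pwd)")            # ~ $(notdir $(shell pwd))
TEMP_COMPOSE=".stack.${STACK_NAME}.yaml"

echo "$STACK_NAME"     # monitoring
echo "$TEMP_COMPOSE"   # .stack.monitoring.yaml
```

So every stack directory that includes this Makefile gets a per-stack compose file name derived from its own directory name.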
