Commit cae2c20

Merge remote-tracking branch 'upstream/main' into 2025/add/fluentd
2 parents 88e4ed5 + 83a544a commit cae2c20

85 files changed

+1208
-638
lines changed

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,8 +18,16 @@
1818
- [ ] Service has placement constraints or is global
1919
- [ ] Service is restartable
2020
- [ ] Service restart is zero-downtime
21+
- [ ] Service has >1 replicas in PROD
22+
- [ ] Service has Docker healthcheck enabled
2123
- [ ] Service is monitored (via prometheus and grafana)
2224
- [ ] Service is not bound to one specific node (e.g. via files or volumes)
2325
- [ ] Relevant OPS E2E tests are added
26+
27+
If exposed via traefik
2428
- [ ] Service's Public URL is included in maintenance mode
25-
- [ ] Service's Public URL is included in testing mode -->
29+
- [ ] Service's Public URL is included in testing mode
30+
- [ ] Service has Traefik (Service Loadbalancer) healthcheck enabled
31+
- [ ] Credentials page is updated
32+
- [ ] URL is added to e2e test services (e2e test checking that the URL can be accessed)
33+
-->

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ yq
142142
**/.env-devel
143143
**/.stack.*.yml
144144
**/.stack.*.yaml
145-
docker-compose.yml
145+
146146
stack.yml
147147
stack_with_prefix.yml
148148
docker-compose.simcore.yml

.pre-commit-config.yaml

Lines changed: 7 additions & 0 deletions
@@ -122,3 +122,10 @@ repos:
122122
always_run: true
123123
language: script
124124
files: '^(.*\/Makefile.*)|(.*\.deploy_everything_locally.bash)|(.*\/services/.*\/.*\.((sh)|(bash)))$'
125+
- id: helm-update-dependencies
126+
name: Helm Dependency Update
127+
description: Make sure all Chart.lock files are up-to-date
128+
entry: bash -c 'find . -name Chart.yaml -exec dirname {} \; | xargs -t -I% helm dependency update %'
129+
language: system
130+
files: ^charts/
131+
pass_filenames: false
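
Reviewer note: the hook's `find ... | xargs` pipeline can be tried out safely before wiring it into pre-commit. The sketch below is a hedged variant, not the committed hook: `list_chart_dirs`, `update_chart_deps` and the `DRY_RUN` variable are hypothetical names introduced for illustration. It discovers the chart directories the same way (`find` plus `dirname`) and, in dry-run mode, prints the `helm dependency update` commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DRY_RUN are assumptions, not part of the commit.

# Print every directory under "$1" that contains a Chart.yaml, sorted.
list_chart_dirs() {
  find "$1" -name Chart.yaml -exec dirname {} \; | sort
}

# Run `helm dependency update` once per chart directory.
# With DRY_RUN=1 the commands are printed instead of executed.
update_chart_deps() {
  local root="$1" dir
  list_chart_dirs "$root" | while IFS= read -r dir; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "helm dependency update $dir"
    else
      helm dependency update "$dir"
    fi
  done
}
```

For example, `DRY_RUN=1 update_chart_deps .` run from the repository root lists the commands the hook would issue. Unlike the hook, it does not rely on `xargs -t` to echo each command.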

Makefile

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ certificates/domain.key:
2626
# Done: Creating docker secrets
2727

2828
.PHONY: up-local
29-
up-local: .init .venv .install-fqdn certificates/domain.crt certificates/domain.key .create-secrets ## deploy osparc ops stacks and simcore, use minio_disabled=1 if minio s3 should not be started (if you have custom S3 set up)
29+
up-local: .init venv .install-fqdn certificates/domain.crt certificates/domain.key .create-secrets ## deploy osparc ops stacks and simcore, use minio_disabled=1 if minio s3 should not be started (if you have custom S3 set up)
3030
@bash scripts/deployments/deploy_everything_locally.bash --stack_target=local --minio_enabled=0 --vcs_check=1
3131
@$(MAKE) info-local
3232

charts/SECURITY.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
1+
# Security
2+
3+
This file documents security measures and their configuration in the current code base.
4+
5+
## Application developer
6+
7+
Full list: https://kubernetes.io/docs/concepts/security/application-security-checklist/
8+
9+
#### Pod-level securityContext recommendations
10+
11+
Enable pod security standard on namespace level:
12+
* create namespace with labels (examples and explanations https://aro-labs.com/pod-security-standards/)
13+
* configure pod and container security context to satisfy security standards (read more https://medium.com/dynatrace-engineering/kubernetes-security-part-3-security-context-7d44862c4cfa)
14+
15+
## Cluster / OPS developers
16+
17+
Full list: https://kubernetes.io/docs/concepts/security/security-checklist/
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
## How to delete volumes with `reclaimPolicy: Retain`
2+
1. Delete pvc:
3+
```
4+
kubectl delete pvc <pvc-name>
5+
```
6+
7+
2. Verify that the PV is `Released`
8+
```
9+
kubectl get pv <pv-name>
10+
```
11+
12+
3. Manually remove the EBS volume in AWS
13+
1. Go to AWS GUI and List EBS Volumes
14+
1. Filter by tag `ebs.csi.aws.com/cluster=true`
15+
1. Identify the volume associated with your PV (check `kubernetes.io/created-for/pv/name` tag of the EBS Volume)
16+
1. Verify that EBS Volume is `Available`
17+
1. Delete EBS Volume
18+
19+
4. Delete the PV
20+
```
21+
kubectl delete pv <pv-name>
22+
```
23+
24+
5. Remove Finalizers (if necessary)
25+
If the PV remains in a Terminating state, remove its finalizers:
26+
```
27+
kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}'
28+
```
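
Reviewer note: steps 2 and 4 of this procedure can be combined into a guarded helper, so that a PV is never deleted while it is still bound. This is a minimal sketch under stated assumptions: the function names `pv_phase`, `is_released` and `delete_released_pv` are hypothetical, it assumes `kubectl` is configured for the target cluster, and it does not touch the EBS volume, which still has to be removed as in step 3.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are assumptions, not part of the commit.

# Succeeds only when the given phase string is "Released".
is_released() {
  [ "$1" = "Released" ]
}

# Read the phase of a PV from the cluster (needs kubectl and cluster access).
pv_phase() {
  kubectl get pv "$1" -o jsonpath='{.status.phase}'
}

# Delete the PV only if it is Released; otherwise report and return non-zero.
delete_released_pv() {
  local pv="$1" phase
  phase="$(pv_phase "$pv")"
  if is_released "$phase"; then
    kubectl delete pv "$pv"
  else
    echo "PV $pv is in phase '$phase', not Released; skipping" >&2
    return 1
  fi
}
```

Usage would be `delete_released_pv <pv-name>`. The finalizer patch from step 5 is deliberately left out, since it should stay a manual last resort.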

charts/aws-ebs-csi-driver/values.yaml.gotmpl

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ image:
55
tag: "v1.38.1"
66

77
storageClasses:
8-
- name: "ebs-sc"
8+
- name: "{{ .Values.ebsStorageClassName }}"
99
parameters:
1010
type: "gp3"
1111
allowVolumeExpansion: true

charts/longhorn/README.md

Lines changed: 34 additions & 1 deletion
@@ -2,7 +2,7 @@
22

33
### Can LH be used for critical services (e.g., Databases)?
44

5-
No (as of now). , we should not use it for volumes of critical services.
5+
No. We should not use it for volumes of critical services.
66

77
As of now, we should avoid using LH for critical services. Instead, we should rely on easier-to-maintain solutions (e.g., application-level replication [Postgres Operators], S3, etc.). Once we get hands-on experience, extensive monitoring and ability to scale LH, we can consider using it for critical services.
88

@@ -25,6 +25,20 @@ Source:
2525
* https://longhorn.io/kb/tip-only-use-storage-on-a-set-of-nodes/
2626
* https://longhorn.io/docs/1.8.1/nodes-and-volumes/nodes/default-disk-and-node-config/#customizing-default-disks-for-new-nodes
2727

28+
### How to configure disks for LH
29+
30+
Manual configuration performed (to be moved to ansible)
31+
1. Create partition on the disk
32+
* e.g. using `fdisk`: https://phoenixnap.com/kb/linux-create-partition
33+
2. Format partition as XFS
34+
* `sudo mkfs.xfs -f /dev/sda1`
35+
3. Mount partition `sudo mount -t xfs /dev/sda1 /longhorn`
36+
4. Persist mount in `/etc/fstab` by adding line
37+
* `UUID=<partition's uuid> /longhorn xfs pquota 0 0`
38+
* The UUID can be obtained from `lsblk -f`
39+
40+
Issue asking LH to clearly document requirements: https://github.com/longhorn/longhorn/issues/11125
41+
2842
### Can workloads be run on nodes where LH is not installed?
2943

3044
Workloads can run on nodes without LH as long as LH is not restricted to specific nodes via the `nodeSelector` or `systemManagedComponentsNodeSelector` settings. If LH is configured to run on specific nodes, workloads can only run on those nodes.
@@ -48,3 +62,22 @@ Insights into LH's performance:
4862

4963
Resource requirements:
5064
* https://github.com/longhorn/longhorn/issues/1691
65+
66+
### (Kubernetes) Node maintenance
67+
68+
https://longhorn.io/docs/1.8.1/maintenance/maintenance/
69+
70+
Note: you can use Longhorn GUI to perform some operations
71+
72+
### Zero-downtime update of Longhorn disks (procedure)
73+
Notes:
74+
* Update one node at a time so that other nodes can still serve data
75+
76+
1. Go to LH GUI and select a Node
77+
1. Disable scheduling
78+
2. Request eviction
79+
1. Remove disk from the node
80+
* If the remove icon is disabled, disable eviction on the disk to enable the remove button
81+
2. Perform disks updates on the node
82+
3. Make sure LH did not pick up a wrongly configured disk in the meantime; if it did, remove that disk
83+
4. Wait till LH automatically adds the disk to the Node
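
Reviewer note: until the disk setup moves to Ansible, the `/etc/fstab` step from "How to configure disks for LH" could be made idempotent, so that repeating it during a disk update does not add duplicate entries. A minimal sketch, assuming the `/longhorn` mount point and `pquota` option shown in the README; the function names `fstab_entry` and `add_fstab_entry` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are assumptions, not part of the commit.

# Build the fstab line from the README for a given partition UUID.
fstab_entry() {
  printf 'UUID=%s /longhorn xfs pquota 0 0\n' "$1"
}

# Append the entry to the fstab file (default /etc/fstab)
# unless a line for this UUID is already present.
add_fstab_entry() {
  local uuid="$1" fstab="${2:-/etc/fstab}"
  if grep -qF "UUID=$uuid " "$fstab" 2>/dev/null; then
    return 0
  fi
  fstab_entry "$uuid" >> "$fstab"
}
```

On a node this might be called as `add_fstab_entry "$(lsblk -no UUID /dev/sda1)"` from a root shell, since writing to `/etc/fstab` requires root. Passing a second argument points it at a scratch file for a trial run.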
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
persistence:
22
enabled: true
33
size: "1Gi" # minimal size for gp3 is 1Gi
4-
storageClass: "ebs-sc"
4+
storageClass: "{{ .Values.ebsStorageClassName }}"
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
persistence:
22
enabled: true
33
size: "300Mi" # cannot be lower https://github.com/longhorn/longhorn/issues/8488
4-
storageClass: "{{.Values.longhornStorageClassName}}"
4+
storageClass: "{{ .Values.longhornStorageClassName }}"
