Commit d791f1a

docs: when to update Elastic Agents

1 parent: 9426eb6

File tree

3 files changed: +17, -5 lines

docs/infrastructure/components/elastic.agent.md

Lines changed: 17 additions & 0 deletions
````diff
@@ -1,5 +1,16 @@
 # Elastic Agent
 
+## Table of contents
+
+- [Overview](#overview)
+- [Fleet](#fleet)
+- [Installation](#installation)
+- [Configuration](#configuration)
+- [When to update](#when-to-update)
+- [Investigating the metrics](#investigating-the-metrics)
+
+## Overview
+
 The agent runs as a DaemonSet and collects:
 
 - Kubernetes logs (not collected in this setup)
@@ -33,6 +44,12 @@ From the standard configuration, the following changes have been made:
 - Collect Kubernetes container logs has been de-activated. We already collect these logs using Fluent Bit and we want to avoid duplication.
 - Collect Kubernetes events from Kubernetes API Server has been de-activated. We already collect these events using Event exporter and we want to avoid duplication.
 
+## When to update
+
+- Update Elastic Agent whenever the Elastic Stack (e.g. Elastic Cloud) is upgraded to keep versions aligned.
+- Perform updates via Helm (code change), not directly in Elastic Fleet, to avoid configuration drift.
+- Also update for critical fixes or security advisories from Elastic.
+
 ## Investigating the metrics
 
 Index: `metrics-*`
````
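The "update via Helm, not Fleet" guidance added in this hunk could look something like the sketch below. The repository URL is Elastic's public Helm repo; the release name, namespace, and pinned version are assumptions to adapt to this setup:

```shell
# Sketch of a version-pinned Elastic Agent update via Helm.
# Release name, namespace, and version are assumptions; pin the
# version to match the upgraded Elastic Stack (e.g. Elastic Cloud).
helm repo add elastic https://helm.elastic.co
helm repo update
helm upgrade elastic-agent elastic/elastic-agent \
  --namespace=elastic-agent \
  --version=8.14.1 \
  --reuse-values
```

Committing the pinned version as a code change (rather than upgrading agents from the Fleet UI) is what keeps the Helm state and the cluster from drifting apart.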

docs/infrastructure/components/fluentbit.md

Lines changed: 0 additions & 1 deletion

````diff
@@ -63,7 +63,6 @@ The Fluent Bit application version is stored in `appVersion` but this is only he
 ```
 
 2. Verify the Fluent Bit pods logs
-
 - Get pod names
 
 ```shell
````
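The "Get pod names" step is followed in the file by a shell block that this diff cuts off at its opening fence. A hedged sketch of the likely kind of command; the namespace and label selector are assumptions, not taken from the diff:

```shell
# Hypothetical pod lookup for Fluent Bit; namespace and label
# selector are assumptions about this cluster.
kubectl get pods --namespace=logging \
  --selector=app.kubernetes.io/name=fluent-bit --output=name

# Then tail one pod's logs to verify it is shipping records
# (pod name here is a placeholder).
kubectl logs --namespace=logging fluent-bit-abc12 --tail=50
```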

docs/infrastructure/disaster.recovery.md

Lines changed: 0 additions & 4 deletions
````diff
@@ -54,7 +54,6 @@ If any of the cluster infrastructure exists but is not functional, see the above
 ```
 
 3. Deploy Kubernetes components:
-
 1. Connect AWS CLI to the new cluster: `aws eks update-kubeconfig --name=Workflows`.
 2. Create the Argo Workflows configuration files: `npx cdk8s synth`.
 3. (ONLY IF [RECREATING DATABASE](#rds-database)) Remove the `persistence` section of `dist/0005-argo-workflows.k8s.yaml` to disable workflow archiving to database. For example:
````
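The `persistence` section that step 3 removes is Argo Workflows' workflow-archive configuration. Its concrete values are not shown in this hunk; in Argo's workflow-controller config such a section typically has roughly this shape (every value below is a placeholder):

```yaml
# Rough sketch only; the real section lives in dist/0005-argo-workflows.k8s.yaml
# and its values come from the cdk8s code, not from this example.
persistence:
  archive: true
  postgresql:
    host: argo-db.example.ap-southeast-2.rds.amazonaws.com  # placeholder
    port: 5432
    database: argo
    tableName: argo_archived_workflows
    userNameSecret:
      name: argo-postgres-config  # placeholder secret
      key: username
    passwordSecret:
      name: argo-postgres-config  # placeholder secret
      key: password
```

Deleting this section (as the step instructs) makes the controller skip connecting to the database, so the cluster can come up before the RDS instance is restored.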
````diff
@@ -127,7 +126,6 @@ If there is any issue on the RDS instance that can't be recovered, we might have
 2. [Deploy the EKS cluster](#deployment-of-new-cluster)
 
 3. Create a temporary RDS database from [the manual snapshot created](#update-database-version-if-necessary):
-
 1. Get details of the new cluster database: `aws rds describe-db-instances --query="DBInstances[?DBName=='argo'].{EndpointAddress: Endpoint.Address, DBSubnetGroupName: DBSubnetGroup.DBSubnetGroupName, VpcSecurityGroupIds: VpcSecurityGroups[].VpcSecurityGroupId}"`.
 2. Go to https://ap-southeast-2.console.aws.amazon.com/rds/home?region=ap-southeast-2#db-snapshot:engine=postgres;id=ID, replacing "ID" with the `DBSnapshotIdentifier` of the manual snapshot.
 3. Click on _Actions_ → _Restore snapshot_.
````
````diff
@@ -140,7 +138,6 @@ If there is any issue on the RDS instance that can't be recovered, we might have
 10. Wait for the temporary DB to get to the "Available" state.
 
 4. Dump the temporary database to the new Argo database:
-
 1. Submit a ["sleep" workflow](../../workflows/test/sleep.yml) to the new Argo Workflows installation to spin up a pod:
 `argo submit --namespace=argo workflows/test/sleep.yml`. This will be used to connect to RDS to dump the database to a file.
 2. Connect to the sleep pod (it can take a while for the pod to spin up, so you might have to retry the second command):
````
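The commands for "connect to the sleep pod" are cut off by this hunk. A sketch of what such a connection usually looks like; the label selector and shell path are assumptions:

```shell
# Find the sleep workflow's pod (label selector is an assumption),
# then open a shell in it; retry if the pod is still starting.
POD="$(kubectl get pods --namespace=argo \
  --selector=workflows.argoproj.io/workflow --output=name | head --lines=1)"
kubectl exec --stdin --tty --namespace=argo "$POD" -- /bin/sh
```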
````diff
@@ -165,7 +162,6 @@ If there is any issue on the RDS instance that can't be recovered, we might have
 You will be prompted for a password, get the password from the [AWS Systems Manager Parameter Store](https://ap-southeast-2.console.aws.amazon.com/systems-manager/parameters/%252Feks%252Fargo%252Fpostgres%252Fpassword/description?region=ap-southeast-2&tab=Table).
 
 5. Redeploy the cluster configuration files to enable the connection to the database and turn on workflow archiving:
-
 1. Run `npx cdk8s synth` to recreate the `persistence` section in `dist/0005-argo-workflows.k8s.yaml`.
 2. Redeploy the Argo config file: `kubectl replace --filename=dist/0005-argo-workflows.k8s.yaml`.
 3. Restart the workflow controller and the server:
````
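The restart commands for step 3 are not shown in this hunk. A plausible sketch using `kubectl rollout restart`; the deployment names are assumptions:

```shell
# Deployment names are assumptions; confirm with
# `kubectl get deployments --namespace=argo` before running.
kubectl rollout restart --namespace=argo deployment/workflow-controller
kubectl rollout restart --namespace=argo deployment/argo-server
```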
