Commit f41db30

Merge remote-tracking branch 'upstream/main'
2 parents 4509be5 + 91fcc61 commit f41db30

19 files changed

+102
-106
lines changed


.bumpversion.cfg

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
[bumpversion]
2-
current_version = 5.98
2+
current_version = 5.99
33
commit = True
44
Tag = True
55
parse = v?(?P<major>\d+)\.(?P<minor>\d+)

README.md

Lines changed: 1 addition & 1 deletion
@@ -22,5 +22,5 @@ Unless required by applicable law or agreed to in writing, software distributed
2222
To get started, please proceed to [The Splunk Observability Cloud Workshops Homepage](https://splunk.github.io/observability-workshop/latest/).
2323

2424
Latest versions of the workshop are:
25+
- [v5.99](https://splunk.github.io/observability-workshop/v5.99/)
2526
- [v5.98](https://splunk.github.io/observability-workshop/v5.98/)
26-
- [v5.97](https://splunk.github.io/observability-workshop/v5.97/)

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
5.98
1+
5.99

content/en/conf/1-advanced-collector/1-agent-gateway/1-1-gateway.md

Lines changed: 16 additions & 14 deletions
@@ -15,8 +15,9 @@ This makes your observability pipeline easier to manage, scale, and analyze—es
1515
Open or create your second terminal window and name it **Gateway**. Navigate to the first exercise directory `[WORKSHOP]/1-agent-gateway`
1616
then check the contents of the `gateway.yaml` file.
1717

18-
This file outlines the core structure of the OpenTelemetry Collector as deployed in **Gateway** mode:
18+
This file outlines the core structure of the OpenTelemetry Collector as deployed in **Gateway** mode.
1919

20+
<!--
2021
```bash
2122
cat ./gateway.yaml
2223
```
@@ -105,7 +106,7 @@ service: # Service configuration
105106
- debug # Debug exporter
106107
- file/logs
107108
```
108-
109+
-->
109110
{{% /notice %}}
110111

111112
### Understanding the Gateway Configuration
@@ -132,19 +133,20 @@ Let’s explore the `gateway.yaml` file that defines how the OpenTelemetry Colle
132133
The **Gateway** uses three file exporters to output telemetry data to local files. These exporters are defined as:
133134

134135
```yaml
135-
exporters:
136-
file/traces:
137-
path: ./gateway-traces.out
138-
file/metrics:
139-
path: ./gateway-metrics.out
140-
file/logs:
141-
path: ./gateway-logs.out
136+
exporters: # List of exporters
137+
debug: # Debug exporter
138+
verbosity: detailed # Enable detailed debug output
139+
file/traces: # Exporter Type/Name
140+
path: "./gateway-traces.out" # Path for OTLP JSON output for traces
141+
append: false # Overwrite the file each time
142+
file/metrics: # Exporter Type/Name
143+
path: "./gateway-metrics.out" # Path for OTLP JSON output for metrics
144+
append: false # Overwrite the file each time
145+
file/logs: # Exporter Type/Name
146+
path: "./gateway-logs.out" # Path for OTLP JSON output for logs
147+
append: false # Overwrite the file each time
142148
```
143149

144-
Each exporter writes a specific signal type to its corresponding file:
145-
146-
* `gateway-traces.out`: stores span (trace) data
147-
* `gateway-metrics.out`: stores metric data
148-
* `gateway-logs.out`: stores log data
150+
Each exporter writes a specific signal type to its corresponding file.
149151

150152
These files are created once the gateway is started and will be populated with real telemetry as the agent sends data. You can monitor these files in real time to observe the flow of telemetry through your pipeline.

content/en/conf/1-advanced-collector/1-agent-gateway/1-2-send-metrics.md

Lines changed: 17 additions & 17 deletions
@@ -1,26 +1,22 @@
11
---
2-
title: 1.2 Send Test Metrics
3-
linkTitle: 1.2 Send Metrics
2+
title: 1.2 Validate & Test Configuration
3+
linkTitle: 1.2 Validate & Test Configuration
44
weight: 3
55
---
66

7-
Now, we can start the **Gateway** and the **Agent**, which is configured to automaticly send **Host Metrics** at startup. We do this to verify that data is properly routed from the **Agent** to the **Gateway**.
7+
Now, we can start the **Gateway** and the **Agent**, which is configured to automatically send **Host Metrics** at startup. We do this to verify that data is properly routed from the **Agent** to the **Gateway**.
88

99
{{% notice title="Exercise" style="green" icon="running" %}}
1010

11-
**Start the Gateway**: In the **Gateway terminal** window, run the following command to start the **Gateway**:
11+
**Gateway**: In the **Gateway terminal** window, run the following command to start the **Gateway**:
1212

1313
```bash {title="Start the Gateway"}
1414
../otelcol --config=gateway.yaml
1515
```
1616

17-
If everything is configured correctly, the first and last lines of the output should look like:
17+
If everything is configured correctly, the collector will start and state `Everything is ready. Begin running and processing data.` in the output, similar to the following:
1818

1919
```text
20-
2025/06/09 09:22:11 settings.go:478: Set config to [gateway.yaml]
21-
...
22-
<snip to the end>
23-
...
2420
2025-06-09T09:22:11.944+0100 info [email protected]/service.go:289 Everything is ready. Begin running and processing data. {"resource": {}}
2521
```
2622

@@ -43,22 +39,23 @@ Once the **Gateway** is running, it will listen for incoming data on port `5318`
4339

4440
```text
4541
<snip>
46-
NumberDataPoints #37
42+
NumberDataPoints #31
4743
Data point attributes:
48-
-> cpu: Str(cpu0)
49-
-> state: Str(system)
50-
StartTimestamp: 2024-12-09 14:18:28 +0000 UTC
51-
Timestamp: 2025-01-15 15:27:51.319526 +0000 UTC
52-
Value: 9637.660000
44+
-> cpu: Str(cpu3)
45+
-> state: Str(wait)
46+
StartTimestamp: 2025-07-07 16:49:42 +0000 UTC
47+
Timestamp: 2025-07-09 09:36:21.190226459 +0000 UTC
48+
Value: 77.380000
49+
{"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "exporter", "otelcol.signal": "metrics"}
5350
```
5451

55-
At this stage, the **Agent** continues to collect **CPU** metrics once per hour or upon each restart and sends them to the gateway. The **Gateway** processes these metrics and exports them to a file named `./gateway-metrics.out`. This file stores the exported metrics as part of the pipeline service.
52+
At this stage, the **Agent** continues to collect **CPU** metrics once per hour or upon each restart and sends them to the gateway. The **Gateway** processes these metrics and exports them to a file named `gateway-metrics.out`. This file stores the exported metrics as part of the pipeline service.
5653

5754
**Verify Data arrived at Gateway**: To confirm that CPU metrics, specifically for `cpu0`, have successfully reached the gateway, we’ll inspect the `gateway-metrics.out` file using the `jq` command.
5855

5956
The following command filters and extracts the `system.cpu.time` metric, focusing on `cpu0`. It displays the metric’s state (e.g., `user`, `system`, `idle`, `interrupt`) along with the corresponding values.
6057

61-
Run the command below in the **Tests terminal** to check the `system.cpu.time` metric:
58+
Open or create your third terminal window and name it **Tests**. Run the command below in the **Tests terminal** to check the `system.cpu.time` metric:
6259

6360
{{% tabs %}}
6461
{{% tab title="Check CPU Metrics" %}}
@@ -96,4 +93,7 @@ jq '.resourceMetrics[].scopeMetrics[].metrics[] | select(.name == "system.cpu.ti
9693
{{% /tab %}}
9794
{{% /tabs %}}
9895

96+
> [!IMPORTANT]
97+
> Stop the **Agent** and the **Gateway** processes by pressing `Ctrl-C` in their respective terminals.
98+
9999
{{% /notice %}}
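
Reviewer note: the `jq` check in this file can also be sketched in Python, which may help when `jq` is not installed. This is a minimal sketch, assuming the file exporter writes one OTLP JSON object per line and that `system.cpu.time` arrives as a sum metric with `cpu` and `state` string attributes. The function name `cpu_time_points` and the `SAMPLE` record are hypothetical and exist only for illustration.

```python
import json


def cpu_time_points(lines, cpu="cpu0", metric="system.cpu.time"):
    """Yield (state, value) pairs for one CPU from OTLP JSON lines.

    Assumption: each line is a complete OTLP JSON export, as written by
    the file exporter to gateway-metrics.out.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        for rm in doc.get("resourceMetrics", []):
            for sm in rm.get("scopeMetrics", []):
                for m in sm.get("metrics", []):
                    if m.get("name") != metric:
                        continue
                    for dp in m.get("sum", {}).get("dataPoints", []):
                        # Flatten the attribute list into a plain dict.
                        attrs = {
                            a["key"]: a["value"].get("stringValue")
                            for a in dp.get("attributes", [])
                        }
                        if attrs.get("cpu") == cpu:
                            yield attrs.get("state"), dp.get("asDouble")


def _point(cpu, state, value):
    """Build one hypothetical data point in OTLP JSON shape."""
    return {
        "attributes": [
            {"key": "cpu", "value": {"stringValue": cpu}},
            {"key": "state", "value": {"stringValue": state}},
        ],
        "asDouble": value,
    }


# Hypothetical sample line, shaped like the debug output shown in this file.
SAMPLE = json.dumps({
    "resourceMetrics": [{
        "scopeMetrics": [{
            "metrics": [{
                "name": "system.cpu.time",
                "sum": {"dataPoints": [
                    _point("cpu0", "system", 9637.66),
                    _point("cpu3", "wait", 77.38),
                ]},
            }]
        }]
    }]
})
```

To run it against real output, open `gateway-metrics.out` and pass the file object as `lines`, for example `list(cpu_time_points(open("gateway-metrics.out")))`.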

content/en/conf/1-advanced-collector/1-agent-gateway/_index.md

Lines changed: 9 additions & 7 deletions
@@ -32,7 +32,7 @@ We will refer to these terminals as: **Agent**, **Gateway**, **Loadgen**, and **
3232
├── agent.yaml
3333
└── gateway.yaml
3434
```
35-
35+
<!--
3636
3. Check the contents of the **agent.yaml** file. This file outlines the core structure of the OpenTelemetry Collector as deployed in **Agent** mode:
3737
3838
```bash
@@ -130,7 +130,7 @@ We will refer to these terminals as: **Agent**, **Gateway**, **Loadgen**, and **
130130
- file
131131
- otlphttp
132132
```
133-
133+
-->
134134
{{% /notice %}}
135135
136136
### Understanding the Agent configuration
@@ -179,18 +179,20 @@ The `receivers` section defines how the **Agent** ingests telemetry data. In thi
179179
180180
#### Exporters
181181
182-
* The `exporters` section controls where the collected telemetry data is sent:
182+
* **Debug Exporter**
183183
184184
```yaml
185-
exporters: # Array of Exporters
186185
debug: # Exporter Type
187186
verbosity: detailed # Enabled detailed debug output
187+
```
188+
189+
* **OTLPHTTP Exporter**
190+
191+
```yaml
188192
otlphttp: # Exporter Type
189193
endpoint: "http://localhost:5318" # Gateway OTLP endpoint
190194
```
191195
192196
The `debug` exporter sends data to the console for visibility and debugging during the workshop while the `otlphttp` exporter forwards all telemetry to the local **Gateway** instance.
193197
194-
{{% notice title="Info" style="info" %}}
195-
This dual-export strategy ensures you can see the raw data locally while also sending it downstream for further processing and export.
196-
{{% /notice %}}
198+
**This dual-export strategy ensures you can see the raw data locally while also sending it downstream for further processing and export.**

content/en/conf/1-advanced-collector/2-building-resilience/2-1-configuration.md

Lines changed: 24 additions & 15 deletions
@@ -10,7 +10,18 @@ While these components do not process telemetry data directly, they provide valu
1010

1111
{{% notice title="Exercise" style="green" icon="running" %}}
1212

13-
**Update the `agent.yaml`**: In the **Agent terminal** window, add the `file_storage` extension and name it `checkpoint`:
13+
> [!IMPORTANT]
14+
> **Change _ALL_ terminal windows to the `2-building-resilience` directory and run the `clear` command.**
15+
16+
Your directory structure will look like this:
17+
18+
```text { title="Updated Directory Structure" }
19+
.
20+
├── agent.yaml
21+
└── gateway.yaml
22+
```
23+
24+
**Update the `agent.yaml`**: In the **Agent terminal** window, add the `file_storage` extension under the existing `health_check` extension:
1425

1526
```yaml
1627
file_storage/checkpoint: # Extension Type/Name
@@ -24,11 +35,9 @@ While these components do not process telemetry data directly, they provide valu
2435
max_transaction_size: 65536 # Max. size limit before compaction occurs
2536
```
2637
27-
**Add `file_storage` to existing `otlphttp` exporter**: Modify the `otlphttp:` exporter to configure retry and queuing mechanisms, ensuring data is retained and resent if failures occur:
38+
**Add `file_storage` to the exporter**: Modify the `otlphttp` exporter to configure retry and queuing mechanisms, ensuring data is retained and resent if failures occur. Add the following under the `endpoint: "http://localhost:5318"` line and make sure the indentation matches `endpoint`:
2839

2940
```yaml
30-
otlphttp:
31-
endpoint: "http://localhost:5318"
3241
retry_on_failure:
3342
enabled: true # Enable retry on failure
3443
sending_queue: #
@@ -38,7 +47,7 @@ While these components do not process telemetry data directly, they provide valu
3847
storage: file_storage/checkpoint # File storage extension
3948
```
4049

41-
**Update the `services` section**: Add the `file_storage/checkpoint` extension to the existing `extensions:` section. This will cause the extension to be enabled:
50+
**Update the `service` section**: Add the `file_storage/checkpoint` extension to the existing `extensions:` section so that the configuration looks like this:
4251

4352
```yaml
4453
service:
@@ -47,18 +56,18 @@ service:
4756
- file_storage/checkpoint # Enabled extensions for this collector
4857
```
4958

50-
**Update the `metrics` pipeline**: For this exercise we are going to comment out the `hostmetrics` receiver from the Metric pipeline to reduce debug and log noise:
59+
**Update the `metrics` pipeline**: For this exercise we are going to comment out the `hostmetrics` receiver from the Metric pipeline to reduce debug and log noise. Again, the configuration needs to look like this:
5160

5261
```yaml
5362
metrics:
5463
receivers:
64+
# - hostmetrics # Hostmetrics receiver (cpu only)
5565
- otlp
56-
# - hostmetrics # Hostmetrics Receiver
5766
```
5867

5968
{{% /notice %}}
6069

61-
Validate the **Agent** configuration using **[otelbin.io](https://www.otelbin.io/)**. For reference, the `metrics:` section of your pipelines will look similar to this:
70+
<!-- Validate the **Agent** configuration using **[otelbin.io](https://www.otelbin.io/)**. For reference, the `metrics:` section of your pipelines will look similar to this:
6271

6372
```mermaid
6473
%%{init:{"fontFamily":"monospace"}}%%
@@ -76,16 +85,16 @@ graph LR
7685
subgraph " "
7786
subgraph subID1[**Metrics**]
7887
direction LR
79-
REC1 --> PRO1
80-
PRO1 --> PRO2
81-
PRO2 --> PRO3
82-
PRO3 --> PRO4
83-
PRO4 --> EXP1
84-
PRO4 --> EXP2
88+
REC1 -- > PRO1
89+
PRO1 -- > PRO2
90+
PRO2 -- > PRO3
91+
PRO3 -- > PRO4
92+
PRO4 -- > EXP1
93+
PRO4 -- > EXP2
8594
end
8695
end
8796
classDef receiver,exporter fill:#8b5cf6,stroke:#333,stroke-width:1px,color:#fff;
8897
classDef processor fill:#6366f1,stroke:#333,stroke-width:1px,color:#fff;
8998
classDef con-receive,con-export fill:#45c175,stroke:#333,stroke-width:1px,color:#fff;
9099
classDef sub-metrics stroke:#38bdf8,stroke-width:1px, color:#38bdf8,stroke-dasharray: 3 3;
91-
```
100+
``` -->
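
Reviewer note: the three edits in this file have to agree with each other, since the `storage:` value on the exporter must name an extension that is both defined and enabled under `service`. Below is a minimal sketch of that cross-check, assuming the YAML has already been loaded into a plain dictionary. The helper `check_storage_wiring` and the `AGENT` dictionary are hypothetical, and `AGENT` contains only the keys visible in this diff.

```python
def check_storage_wiring(config):
    """Return a list of problems with persistent-queue storage references."""
    problems = []
    defined = set(config.get("extensions") or {})
    enabled = set((config.get("service") or {}).get("extensions") or [])
    for name, exporter in (config.get("exporters") or {}).items():
        queue = (exporter or {}).get("sending_queue") or {}
        storage = queue.get("storage")
        if storage is None:
            continue  # exporter does not use a persistent queue
        if storage not in defined:
            problems.append(
                f"{name}: storage '{storage}' is not defined under extensions"
            )
        if storage not in enabled:
            problems.append(
                f"{name}: storage '{storage}' is not listed in service.extensions"
            )
    return problems


# Hypothetical, trimmed view of agent.yaml after the edits in this file.
# Settings that the diff elides are deliberately left out.
AGENT = {
    "extensions": {"health_check": None, "file_storage/checkpoint": {}},
    "exporters": {
        "debug": {"verbosity": "detailed"},
        "otlphttp": {
            "endpoint": "http://localhost:5318",
            "retry_on_failure": {"enabled": True},
            "sending_queue": {"storage": "file_storage/checkpoint"},
        },
    },
    "service": {"extensions": ["health_check", "file_storage/checkpoint"]},
}
```

An empty list from `check_storage_wiring(AGENT)` means the wiring is consistent. This check is not a substitute for starting the collector, which validates the full configuration.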

content/en/conf/1-advanced-collector/2-building-resilience/2-2-test-environment.md

Lines changed: 4 additions & 5 deletions
@@ -8,26 +8,25 @@ Next, we will configure our environment to be ready for testing the **File Stora
88

99
{{% notice title="Exercise" style="green" icon="running" %}}
1010

11-
**Start the Gateway**: In the **Gateway terminal** window navigate to the `[WORKSHOP]/4-resilience` directory and run:
11+
**Start the Gateway**: In the **Gateway terminal** window run:
1212

1313
```bash { title="Start the Gateway" }
1414
../otelcol --config=gateway.yaml
1515
```
1616

17-
**Start the Agent**: In the **Agent terminal** window navigate to the `[WORKSHOP]/4-resilience` directory and run:
17+
**Start the Agent**: In the **Agent terminal** window run:
1818

1919
```bash { title="Start the Agent" }
2020
../otelcol --config=agent.yaml
2121
```
2222

23-
**Send five test spans**: In the **Loadgen terminal** window navigate to the `[WORKSHOP]/4-resilience` directory and run:
23+
**Send five test spans**: In the **Loadgen terminal** window run:
2424

2525
```bash { title="Start Load Generator" }
2626
../loadgen -count 5
2727
```
2828

2929
Both the **Agent** and **Gateway** should display debug logs, and the **Gateway** should create a `./gateway-traces.out` file.
3030

31-
{{% /notice %}}
32-
3331
If everything functions correctly, we can proceed with testing system resilience.
32+
{{% /notice %}}

content/en/conf/1-advanced-collector/2-building-resilience/2-3-failure.md

Lines changed: 4 additions & 13 deletions
@@ -6,18 +6,12 @@ weight: 3
66

77
To assess the **Agent's** resilience, we'll simulate a temporary **Gateway** outage and observe how the **Agent** handles it:
88

9-
**Summary**:
10-
11-
1. **Send Traces to the Agent** – Generate traffic by sending traces to the **Agent**.
12-
2. **Stop the Gateway** – This will trigger the **Agent** to enter retry mode.
13-
3. **Restart the Gateway** – The **Agent** will recover traces from its persistent queue and forward them successfully. Without the persistent queue, these traces would have been lost permanently.
14-
159
{{% notice title="Exercise" style="green" icon="running" %}}
1610

17-
**Simulate a network failure**: In the **Gateway terminal** stop the **Gateway** with `Ctrl-C` and wait until the gateway console shows that it has stopped:
11+
**Simulate a network failure**: In the **Gateway terminal** stop the **Gateway** with `Ctrl-C` and wait until the gateway console shows that it has stopped. The **Agent** will continue running, but it will not be able to send data to the gateway. The output in the **Gateway terminal** should look similar to this:
1812

1913
```text
20-
2025-01-28T13:24:32.785+0100 info service@v0.120.0/service.go:309 Shutdown complete.
14+
2025-07-09T10:22:37.941Z info service@v0.126.0/service.go:345 Shutdown complete. {"resource": {}}
2115
```
2216

2317
**Send traces**: In the **Loadgen terminal** window send five more traces using the `loadgen`.
@@ -31,16 +25,13 @@ Notice that the agent’s retry mechanism is activated as it continuously attemp
3125
**Stop the Agent**: In the **Agent terminal** window, use `Ctrl-C` to stop the agent. Wait until the agent’s console confirms it has stopped:
3226

3327
```text
34-
2025-01-28T14:40:28.702+0100 info extensions/extensions.go:66 Stopping extensions...
35-
2025-01-28T14:40:28.702+0100 info service@v0.120.0/service.go:309 Shutdown complete.
28+
2025-07-09T10:25:59.344Z info service@v0.126.0/service.go:345 Shutdown complete. {"resource": {}}
3629
```
3730

3831
{{% /notice %}}
3932

40-
{{% notice title="Tip" style="primary" icon="lightbulb" %}}
41-
Stopping the agent will halt its retry attempts and prevent any future retry activity.
33+
Stopping the agent will halt its retry attempts and prevent any future retry activity.
4234

4335
If the agent runs for too long without successfully delivering data, it may begin dropping traces, depending on the retry configuration, to conserve memory. Stopping the agent before that happens prevents the queued metrics, traces, or logs from being dropped, ensuring they remain available for recovery.
4436

4537
This step is essential for clearly observing the recovery process when the agent is restarted.
46-
{{% /notice %}}
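
Reviewer note: the recovery behavior this exercise demonstrates can be illustrated with a toy file-backed queue. This is a sketch of the general idea only, not the collector's `file_storage` implementation. The class `FileBackedQueue`, the `demo` function, and the JSON file layout are all hypothetical.

```python
import json
import os
import tempfile


class FileBackedQueue:
    """Toy queue that persists every pending item to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.items = []
        if os.path.exists(path):
            # A restarted process picks up whatever was left on disk.
            with open(path) as f:
                self.items = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self.items, f)

    def put(self, item):
        self.items.append(item)
        self._save()

    def drain(self, send):
        """Send items in order; stop at the first failure and keep the rest."""
        while self.items:
            if not send(self.items[0]):
                break
            self.items.pop(0)
            self._save()
        return len(self.items)


def demo():
    """Queue five spans while the gateway is down, then restart and deliver."""
    path = os.path.join(tempfile.mkdtemp(), "queue.json")

    agent = FileBackedQueue(path)
    for i in range(5):
        agent.put({"span": i})
    remaining = agent.drain(lambda item: False)  # gateway unreachable
    del agent  # agent stopped with Ctrl-C

    delivered = []

    def gateway_up(item):
        delivered.append(item)
        return True

    restarted = FileBackedQueue(path)  # agent restarted
    restarted.drain(gateway_up)
    return remaining, delivered
```

Calling `demo()` returns the number of spans still queued while the gateway was down and the list of spans delivered after the restart.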

content/en/conf/1-advanced-collector/2-building-resilience/_index.md

Lines changed: 0 additions & 15 deletions
@@ -16,18 +16,3 @@ This solution will work for metrics as long as the connection downtime is brief
1616
For logs, there are plans to implement a more enterprise-ready solution in one of the upcoming Splunk OpenTelemetry Collector releases.
1717

1818
{{% /notice %}}
19-
20-
{{% notice title="Exercise" style="green" icon="running" %}}
21-
22-
> [!IMPORTANT]
23-
> **Change _ALL_ terminal windows to the `[WORKSHOP]/2-building-resilience` directory.**
24-
25-
Your directory structure will look like this:
26-
27-
```text { title="Updated Directory Structure" }
28-
.
29-
├── agent.yaml
30-
└── gateway.yaml
31-
```
32-
33-
{{% /notice %}}
