
Commit d747285

self-service wip
1 parent 9186957 commit d747285

File tree

20 files changed: +1898 -0 lines changed
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
---
title: Background
linkTitle: 1 Background
weight: 1
time: 3 minutes
---

## Background

Let's review a few background concepts on **OpenTelemetry** before jumping into the details.

First we have the **OpenTelemetry Collector**, which lives on hosts or Kubernetes nodes. These collectors can collect local information (like CPU, disk, and memory). They can also collect metrics from other sources such as Prometheus (push or pull), databases, and other middleware.

![OTel Diagram](../images/otel-diagram.svg?width=60vw)
Source: [OTel Documentation](https://opentelemetry.io/docs/)

The **OTel Collector** collects and sends data using **pipelines**. Pipelines are made up of:

* **Receivers**: Collect telemetry from one or more sources; they are pull- or push-based.
* **Processors**: Take data from receivers and modify or transform it. Unlike receivers and exporters, processors process data in a specific order.
* **Exporters**: Send data to one or more observability backends or other destinations.

![OTel Diagram](../images/otel-collector-details.svg?width=60vw)
Source: [OTel Documentation](https://opentelemetry.io/docs/collector/)
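
To make this concrete, here is a minimal sketch of how receivers, processors, and exporters are wired into a pipeline in a collector's YAML configuration. This is a generic illustration, not this workshop's exact config; the components shown and the endpoint values are placeholders.

``` yaml
receivers:
  otlp:                # accept OTLP data pushed by instrumented applications
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:               # group telemetry into batches before export

exporters:
  otlphttp:            # forward everything to a backend over OTLP/HTTP
    endpoint: https://backend.example.com:4318   # placeholder backend

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```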

The final piece is instrumented applications; they send traces (spans), metrics, and logs.

By default the instrumentation is designed to send data to the local collector (on the host or Kubernetes node). This is desirable because the collector can then add metadata -- like which pod or which node/host the application is running on.
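
As an illustration of that default, most OTel SDKs read standard environment variables to decide where to send data; pointing an application at the local collector looks roughly like this (the values shown are the conventional local defaults and an illustrative service name, not workshop-specific settings):

``` bash
# Point an instrumented application at the collector on the same host/node
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_RESOURCE_ATTRIBUTES="service.name=my-app,deployment.environment=prod"
```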
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
---
title: Deploy Gateway
linkTitle: 2.1 Deploy Gateway
weight: 1
time: 5 minutes
---

## Gateway

First we will deploy the **OTel Gateway**. The workshop instructor will deploy the gateway, but we will walk through the steps here in case you wish to try this yourself on a second instance.

The steps:
* Click the **Data Management** icon in the toolbar
* Click the **+ Add integration** button
* Click the **Deploy the Splunk OpenTelemetry Collector** button
* Click **Next**
* Select **Linux**
* Change the mode to **Data forwarding (gateway)**
* Set the environment to **prod**
* Choose the access token for this workshop
* Click **Next**
* Copy the installer script and run it in the provided Linux environment (sketched below).
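
The wizard generates the exact installer command for your realm and token, so copy what it gives you. For orientation only, the generated script usually looks something like the following; the realm, token, and exact flags here are placeholders and may differ from what the wizard produces:

``` bash
# Illustrative only -- use the command copied from the wizard
curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh && \
  sudo sh /tmp/splunk-otel-collector.sh --realm <REALM> --mode gateway -- <ACCESS_TOKEN>
```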

Once our gateway is started we will notice... **Nothing**. The gateway doesn't send any data by default. It can be configured to, but out of the box it won't.

We can review the config file with:

``` bash
sudo cat /etc/otel/collector/splunk-otel-collector.conf
```

And see that the config being used is `gateway_config.yaml`.
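
If you want to inspect that gateway configuration itself, it lives alongside the conf file (path assuming the default package layout):

``` bash
sudo cat /etc/otel/collector/gateway_config.yaml
```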

{{% notice title="Tip" style="primary" icon="lightbulb" %}}
Diagrams created with [OTelBin.io](https://www.otelbin.io). Click on them to see them in detail.
{{% /notice %}}

|Diagram|What it Tells Us|
|-|-|
|![metrics Config](../images/metrics.png)|**Metrics**:<br>The gateway will receive metrics over the **otlp** or **signalfx** protocols, and then send these metrics to **Splunk Observability Cloud** with the **signalfx** protocol.<br><br>There is also a pipeline for **prometheus metrics** to be sent to Splunk. That pipeline is labeled **internal** and is meant for the collector's own metrics. (In other words, if we want to receive Prometheus data directly we should add it to the main pipeline.)|
|![traces Config](../images/traces.png)|**Traces**:<br>The gateway will receive traces over **jaeger**, **otlp**, **sapm**, or **zipkin** and then send these traces to **Splunk Observability Cloud** with the **sapm** protocol.|
|![logs Config](../images/logs.png)|**Logs**:<br>The gateway will receive logs over **otlp** and then send these logs to two places: **Splunk Enterprise (Cloud)** (for logs) and **Splunk Observability Cloud** (for profiling data).<br><br>There is also a pipeline labeled **signalfx** that sends **signalfx** events to **Splunk Observability Cloud**; these can be used to add events to charts, as well as to populate the process list.|

We're not going to see any host metrics, and we aren't sending any other data through the gateway yet. But we do have the **internal** metrics coming in.

You can find them by creating a new chart and adding a metric:
* Click the **+** in the top-right
* Click **Chart**
* For the signal of Plot A, type `otelcol_process_uptime`
* Add a filter with the **+** to the right, and type: `host.id:<name of instance>`

You should get a chart like the following:
![Chart of gateway](../images/gateway_metric_chart.png)

You can look at the **Metric Finder** to find other internal metrics to explore.

## Add Metadata

Before we deploy a collector (agent), let's add some metadata onto metrics and traces with the gateway. That's how we will know data is passing through it.

The [attributes processor](https://docs.splunk.com/observability/en/gdi/opentelemetry/components/attributes-processor.html) lets us add that metadata.

``` bash
sudo vi /etc/otel/collector/agent_config.yaml
```

Here's what we want to add to the processors section:

``` yaml
processors:
  attributes/gateway_config:
    actions:
      - key: gateway
        value: oac
        action: insert
```

And then to the pipelines (adding `attributes/gateway_config` to each):
``` yaml
service:
  pipelines:
    traces:
      receivers: [jaeger, otlp, smartagent/signalfx-forwarder, zipkin]
      processors:
        - memory_limiter
        - batch
        - resourcedetection
        - attributes/gateway_config
        #- resource/add_environment
      exporters: [sapm, signalfx]
      # Use instead when sending to gateway
      #exporters: [otlp, signalfx]
    metrics:
      receivers: [hostmetrics, otlp, signalfx, smartagent/signalfx-forwarder]
      processors: [memory_limiter, batch, resourcedetection, attributes/gateway_config]
      exporters: [signalfx]
      # Use instead when sending to gateway
      #exporters: [otlp]
```

And finally we need to restart the gateway:
``` bash
sudo systemctl restart splunk-otel-collector.service
```

We can make sure it is still running fine by checking the status:
``` bash
sudo systemctl status splunk-otel-collector.service
```
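
You can also confirm the gateway is actually listening for incoming traffic. Assuming the default gateway configuration, the OTLP receiver listens on ports 4317 (gRPC) and 4318 (HTTP):

``` bash
sudo ss -tlnp | grep -E '4317|4318'
```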

## Next

Next, let's deploy a collector and then configure it to send through this gateway.
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
---
title: Deploy Collector (Agent)
linkTitle: 2.2 Deploy Collector (Agent)
weight: 2
time: 10 minutes
---

## Collector (Agent)

Now we will deploy a collector. At first it will be configured to send directly to the backend, but later we will change the configuration and restart the collector so it uses the gateway.

The steps:
* Click the **Data Management** icon in the toolbar
* Click the **+ Add integration** button
* Click the **Deploy the Splunk OpenTelemetry Collector** button
* Click **Next**
* Select **Linux**
* Leave the mode as **Host monitoring (agent)**
* Set the environment to **prod**
* Leave the rest as defaults
* Choose the access token for this workshop
* Click **Next**
* Copy the installer script and run it in the provided Linux environment.

This collector is sending host metrics, so you can find it in the common navigators:
* Click the **Infrastructure** icon in the toolbar
* Click the **EC2** panel under **Amazon Web Services**
* The `AWSUniqueId` is the easiest property to filter on; add a filter and look for it with a wildcard (e.g. `i-0ba6575181cb05226*`)

![Chart of agent](../images/collector_agent_chart.png)

We can also simply look at the `cpu.utilization` metric. Create a new chart to display it, filtered on the AWSUniqueId:

![Chart 2 of agent](../images/collector_agent_chart_2.png)

We chart this now so we can easily spot the new dimension that gets added once we send the collector's data through the gateway. You can click on the **Data table** to see the dimensions currently being sent:

![Data Table](../images/collector_agent_data_table.png)
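
You can also verify locally that the agent came up. Assuming the default agent configuration, which enables the collector's health-check extension on port 13133, a quick probe from the instance looks like this:

``` bash
curl http://localhost:13133
```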

## Next

Next we'll reconfigure the collector to send to the gateway.
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
---
title: Reconfigure Collector
linkTitle: 2.3 Reconfigure Collector
weight: 3
time: 10 minutes
---

## Reconfigure Collector

To reconfigure the collector we need to make these changes:
* In `agent_config.yaml`:
  * Adjust the **signalfx** exporter to use the gateway
  * Leave the **otlp** exporter alone; it is already there
  * Change the pipelines to use **otlp**
* In `splunk-otel-collector.conf`:
  * Set `SPLUNK_GATEWAY_URL` to the URL provided by the instructor

See this [docs page](https://docs.splunk.com/observability/en/gdi/opentelemetry/deployment-modes.html#agent-configuration) for more details.

The exporters will be the following:
``` yaml
exporters:
  # Metrics + Events
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    #api_url: "${SPLUNK_API_URL}"
    #ingest_url: "${SPLUNK_INGEST_URL}"
    # Use instead when sending to gateway
    api_url: "http://${SPLUNK_GATEWAY_URL}:6060"
    ingest_url: "http://${SPLUNK_GATEWAY_URL}:9943"
    sync_host_metadata: true
    correlation:
  # Send to gateway
  otlp:
    endpoint: "${SPLUNK_GATEWAY_URL}:4317"
    tls:
      insecure: true
```

You can leave the other exporters as they are; they won't be used, as you will see in the pipelines.

The pipeline changes (note which exporters have been commented out and which are now active):
``` yaml
service:
  pipelines:
    traces:
      receivers: [jaeger, otlp, smartagent/signalfx-forwarder, zipkin]
      processors:
        - memory_limiter
        - batch
        - resourcedetection
        #- resource/add_environment
      #exporters: [sapm, signalfx]
      # Use instead when sending to gateway
      exporters: [otlp, signalfx]
    metrics:
      receivers: [hostmetrics, otlp, signalfx, smartagent/signalfx-forwarder]
      processors: [memory_limiter, batch, resourcedetection]
      #exporters: [signalfx]
      # Use instead when sending to gateway
      exporters: [otlp]
    metrics/internal:
      receivers: [prometheus/internal]
      processors: [memory_limiter, batch, resourcedetection]
      # When sending to gateway, at least one metrics pipeline needs
      # to use signalfx exporter so host metadata gets emitted
      exporters: [signalfx]
    logs/signalfx:
      receivers: [signalfx, smartagent/processlist]
      processors: [memory_limiter, batch, resourcedetection]
      exporters: [signalfx]
    logs:
      receivers: [fluentforward, otlp]
      processors:
        - memory_limiter
        - batch
        - resourcedetection
        #- resource/add_environment
      #exporters: [splunk_hec, splunk_hec/profiling]
      # Use instead when sending to gateway
      exporters: [otlp]
```

And finally we can add the `SPLUNK_GATEWAY_URL` in `splunk-otel-collector.conf`, for example:
``` conf
SPLUNK_GATEWAY_URL=gateway.splunk011y.com
```

Then we can restart the collector:
``` bash
sudo systemctl restart splunk-otel-collector.service
```

And check the status:
``` bash
sudo systemctl status splunk-otel-collector.service
```
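
If the status looks off, or data stops appearing after the change, the collector's logs are the quickest place to look. One way to scan recent output for export errors, assuming the default systemd unit name used above:

``` bash
sudo journalctl -u splunk-otel-collector.service --since "10 minutes ago" | grep -iE "error|fail"
```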

And finally, see the new dimension on the metrics:
![New Dimension](../images/gateway_dimension.png)
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
---
title: Collect Data with Standards
linkTitle: 2 Collect Data with Standards
weight: 2
time: 10 minutes
---

## Introduction

For this workshop, we'll be doing things that only a central tools or administration team would normally do.

The workshop uses scripts for steps that aren't the focus of this workshop -- like how to change a Kubernetes app, or how to start an application on a host.

{{% notice title="Tip" style="primary" icon="lightbulb" %}}
It can be useful to review what the scripts are doing.

So along the way it is a good idea to run `cat <filename>` from time to time to see what that step just did.

The workshop won't call this out, so do it whenever you are curious.
{{% /notice %}}

We'll also be running some scripts to simulate the data that we want to deal with.

A simplified version of the architecture (leaving aside the specifics of Kubernetes) looks something like the following:

![Architecture](../images/arch.png)

* The **App** sends metrics and traces to the **OTel Collector**
* The **OTel Collector** also collects metrics of its own
* The **OTel Collector** adds metadata to its own metrics and to data that passes through it
* The **OTel Gateway** offers another opportunity to add metadata

Let's start by deploying the gateway.
