
Commit a2749ff

Merge pull request #331 from dmitchsplunk/main
Add the docker-k8s-otel and solving-problems-with-o11y-cloud workshops
2 parents a0b7ff7 + 02955fc commit a2749ff


48 files changed: +2963 -0 lines changed
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
---
title: Connect to EC2 Instance
linkTitle: 1. Connect to EC2 Instance
weight: 1
time: 5 minutes
---

## Connect to your EC2 Instance

We’ve prepared an Ubuntu Linux instance in AWS/EC2 for each attendee.

Using the IP address and password provided by your instructor, connect to your EC2 instance
using one of the methods below:

* Mac OS / Linux
  * `ssh splunk@<IP address>`
* Windows 10+
  * Use the OpenSSH client
* Earlier versions of Windows
  * Use PuTTY
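
For example, from a terminal on your laptop the connection might look like this (a sketch; substitute the IP address provided by your instructor):

``` bash
# Connect to the EC2 instance as the "splunk" user
ssh splunk@<IP address>
```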
Lines changed: 287 additions & 0 deletions
@@ -0,0 +1,287 @@
---
title: Troubleshoot OpenTelemetry Collector Issues
linkTitle: 10. Troubleshoot OpenTelemetry Collector Issues
weight: 10
time: 20 minutes
---

In the previous section, we added the debug exporter to the collector configuration
and made it part of the pipeline for traces and logs. We saw the debug output
written to the agent collector logs, as expected.

However, traces are no longer sent to o11y cloud. Let's figure out why and fix it.

## Review the Collector Config

Whenever a change to the collector config is made via a `values.yaml` file, it's helpful
to review the actual configuration applied to the collector by looking at the config map:

``` bash
kubectl describe cm splunk-otel-collector-otel-agent
```

Let's review the traces pipeline in the agent collector config. It should look
like this:

``` yaml
pipelines:
  ...
  traces:
    exporters:
      - debug
    processors:
      - memory_limiter
      - k8sattributes
      - batch
      - resourcedetection
      - resource
      - resource/add_environment
    receivers:
      - otlp
      - jaeger
      - smartagent/signalfx-forwarder
      - zipkin
```

Do you see the problem? Only the debug exporter is included in the traces pipeline.
The `otlphttp` and `signalfx` exporters that were present in the configuration previously are gone.
This is why we no longer see traces in o11y cloud.

> How did we know which exporters were included before? To find out,
> we could have reverted our earlier customizations and then checked the config
> map to see what was in the traces pipeline originally. Alternatively, we can refer
> to the examples in the [GitHub repo for splunk-otel-collector-chart](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/examples/default/rendered_manifests/configmap-agent.yaml),
> which show the default agent config used by the Helm chart.
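
Since Helm also stores the rendered manifests for each release revision, another way to see the earlier agent config is to pull it from a previous revision. A sketch, assuming the first revision still contained the default pipelines (adjust the revision number to match your release history):

``` bash
# Print the manifests (including the agent config map) rendered for an earlier Helm revision
helm get manifest splunk-otel-collector --revision 1
```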

## How did the otlphttp and signalfx exporters get removed?

Let's review the customizations we added to the `values.yaml` file:

``` yaml
...
agent:
  config:
    exporters:
      debug:
        verbosity: detailed
    service:
      pipelines:
        traces:
          exporters:
            - debug
        logs:
          exporters:
            - debug
          processors:
            - memory_limiter
            - batch
            - resourcedetection
            - resource
          receivers:
            - otlp
```

When we applied the `values.yaml` file to the collector using `helm upgrade`, the
custom configuration was merged with the existing collector configuration.
When this happens, sections of the YAML configuration that contain lists,
such as the list of exporters in the pipeline section, are replaced entirely by what we
included in the `values.yaml` file (which was only the debug exporter).
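
As a simplified illustration of this merge behavior (the default exporter list shown here reflects the `otlphttp` and `signalfx` exporters mentioned above, not the complete chart default):

``` yaml
# Chart default for the traces pipeline:
traces:
  exporters:
    - otlphttp
    - signalfx

# Our override in values.yaml:
traces:
  exporters:
    - debug

# Merged result -- the override replaces the default list entirely:
traces:
  exporters:
    - debug
```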

## Let's Fix the Issue

When customizing an existing pipeline, we need to fully redefine that part of the configuration.
Our `values.yaml` file should thus be updated as follows:

``` yaml
splunkObservability:
  realm: us1
  accessToken: ***
  infrastructureMonitoringEventsEnabled: true
clusterName: $INSTANCE-cluster
environment: otel-$INSTANCE
agent:
  config:
    exporters:
      debug:
        verbosity: detailed
    service:
      pipelines:
        traces:
          exporters:
            - otlphttp
            - signalfx
            - debug
        logs:
          exporters:
            - debug
          processors:
            - memory_limiter
            - batch
            - resourcedetection
            - resource
          receivers:
            - otlp
```

Let's apply the changes:

``` bash
helm upgrade splunk-otel-collector -f values.yaml \
  splunk-otel-collector-chart/splunk-otel-collector
```
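
Optionally, we can wait for the collector agent pods to roll out with the updated configuration before checking the result. A sketch, assuming the DaemonSet follows the chart's default naming for our release:

``` bash
# Wait for the agent DaemonSet to finish rolling out the new configuration
kubectl rollout status daemonset/splunk-otel-collector-agent
```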

And then check the agent config map:

``` bash
kubectl describe cm splunk-otel-collector-otel-agent
```

This time, we should see a fully defined list of exporters for the traces pipeline:

``` yaml
pipelines:
  ...
  traces:
    exporters:
      - otlphttp
      - signalfx
      - debug
    processors:
      ...
```

## Reviewing the Log Output

The **Splunk Distribution of OpenTelemetry .NET** automatically exports logs enriched with tracing context
from applications that use `Microsoft.Extensions.Logging` for logging (which our sample app does).

Application logs are enriched with tracing metadata and then exported to a local instance of
the OpenTelemetry Collector in OTLP format.
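
For context, the log record we'll look at below comes from an `ILogger` call in the sample app's controller. A hypothetical sketch of what that code might look like (the actual `HelloWorldController` and its route may differ):

``` cs
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Logging;

[ApiController]
public class HelloWorldController : ControllerBase
{
    private readonly ILogger<HelloWorldController> _logger;

    public HelloWorldController(ILogger<HelloWorldController> logger) => _logger = logger;

    [HttpGet("hello/{name}")]
    public string Hello(string name)
    {
        // This record is exported to the collector via OTLP with the active Trace ID and Span ID attached
        _logger.LogInformation("/hello endpoint invoked by {name}", name);
        return $"Hello, {name}!";
    }
}
```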

Let's take a closer look at the logs that were captured by the debug exporter to see if that's happening.
To tail the collector logs, we can use the following command:

``` bash
kubectl logs -l component=otel-collector-agent -f
```
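
Once we're tailing the logs, we can generate some more traffic by sending requests to the application with curl from a second terminal. The URL below is a placeholder; use whatever address and port the helloworld app was exposed on earlier in the workshop:

``` bash
# Invoke the sample app's /hello endpoint a few times (adjust the address and path as needed)
curl "http://<application address>/hello/Kubernetes"
```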

After sending a few requests, we should see something like the following in the debug exporter output:

````
2024-12-20T21:56:30.858Z    info    Logs    {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-12-20T21:56:30.858Z    info    ResourceLog #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Resource attributes:
     -> splunk.distro.version: Str(1.8.0)
     -> telemetry.distro.name: Str(splunk-otel-dotnet)
     -> telemetry.distro.version: Str(1.8.0)
     -> os.type: Str(linux)
     -> os.description: Str(Debian GNU/Linux 12 (bookworm))
     -> os.build_id: Str(6.8.0-1021-aws)
     -> os.name: Str(Debian GNU/Linux)
     -> os.version: Str(12)
     -> host.name: Str(derek-1)
     -> process.owner: Str(app)
     -> process.pid: Int(1)
     -> process.runtime.description: Str(.NET 8.0.11)
     -> process.runtime.name: Str(.NET)
     -> process.runtime.version: Str(8.0.11)
     -> container.id: Str(5bee5b8f56f4b29f230ffdd183d0367c050872fefd9049822c1ab2aa662ba242)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.language: Str(dotnet)
     -> telemetry.sdk.version: Str(1.9.0)
     -> service.name: Str(helloworld)
     -> deployment.environment: Str(otel-derek-1)
     -> k8s.node.name: Str(derek-1)
     -> k8s.cluster.name: Str(derek-1-cluster)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope HelloWorldController
LogRecord #0
ObservedTimestamp: 2024-12-20 21:56:28.486804 +0000 UTC
Timestamp: 2024-12-20 21:56:28.486804 +0000 UTC
SeverityText: Information
SeverityNumber: Info(9)
Body: Str(/hello endpoint invoked by {name})
Attributes:
     -> name: Str(Kubernetes)
Trace ID: 78db97a12b942c0252d7438d6b045447
Span ID: 5e9158aa42f96db3
Flags: 1
    {"kind": "exporter", "data_type": "logs", "name": "debug"}
````

In this example, we can see that the Trace ID and Span ID were automatically written to the log output
by the OpenTelemetry .NET instrumentation. This allows us to correlate logs with traces in
Splunk Observability Cloud.

You might remember, though, that if we deploy the OpenTelemetry Collector in a K8s cluster using Helm
and include the log collection option, the collector will use the Filelog receiver
to automatically capture any container logs.

This would result in duplicate logs being captured for our application. How do we avoid this?

## Avoiding Duplicate Logs in K8s

To avoid capturing duplicate logs, we have two options:

1. We can set the `OTEL_LOGS_EXPORTER` environment variable to `none`, which tells the Splunk Distribution of OpenTelemetry .NET not to export logs to the collector using OTLP.
2. We can manage log ingestion using annotations.

### Option 1

Setting the `OTEL_LOGS_EXPORTER` environment variable to `none` is straightforward; a sketch of how it might be set in the application's deployment manifest is shown below.
However, the Trace ID and Span ID are then not written to the stdout logs generated by the application,
which would prevent us from correlating logs with traces.
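
A minimal sketch of Option 1, assuming the environment variable is added to the helloworld container in the application's `deployment.yaml` (the container name and surrounding structure are assumptions):

``` yaml
spec:
  template:
    spec:
      containers:
        - name: helloworld
          env:
            # Tell the Splunk Distribution of OpenTelemetry .NET not to export logs via OTLP
            - name: OTEL_LOGS_EXPORTER
              value: "none"
```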
To address this limitation, we could define a custom logger, such as the example defined in
`/home/splunk/workshop/docker-k8s-otel/helloworld/SplunkTelemetryConfigurator.cs`.

We could include this in our application by updating the `Program.cs` file as follows:

``` cs
using SplunkTelemetry;
using Microsoft.Extensions.Logging.Console;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

// Apply the custom logger configuration defined in SplunkTelemetryConfigurator.cs
SplunkTelemetryConfigurator.ConfigureLogger(builder.Logging);

var app = builder.Build();

app.MapControllers();

app.Run();
```

### Option 2

Option 2 requires updating the deployment manifest for the application
to include an annotation. In our case, we would edit the `deployment.yaml` file to add the
`splunk.com/exclude` annotation as follows:

``` yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  selector:
    matchLabels:
      app: helloworld
  replicas: 1
  template:
    metadata:
      labels:
        app: helloworld
      annotations:
        splunk.com/exclude: "true"
    spec:
      containers:
      ...
```

Please refer to [Managing Log Ingestion by Using Annotations](https://docs.splunk.com/observability/en/gdi/opentelemetry/collector-kubernetes/kubernetes-config-logs.html#manage-log-ingestion-using-annotations)
for further details on this option.
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
---
title: Summary
linkTitle: 11. Summary
weight: 11
time: 2 minutes
---

This workshop provided hands-on experience with the following concepts:

* How to deploy the **Splunk Distribution of the OpenTelemetry Collector** on a Linux host.
* How to instrument a .NET application with the **Splunk Distribution of OpenTelemetry .NET**.
* How to "dockerize" a .NET application and instrument it with the **Splunk Distribution of OpenTelemetry .NET**.
* How to deploy the **Splunk Distribution of the OpenTelemetry Collector** in a Kubernetes cluster using Helm.
* How to customize the collector configuration and troubleshoot an issue.

To see how other languages and environments are instrumented with OpenTelemetry,
explore the [Splunk OpenTelemetry Examples GitHub repository](https://github.com/signalfx/splunk-opentelemetry-examples).