
Commit af6c4c9

updated tagging workshop
1 parent bb932c9 commit af6c4c9

14 files changed: +136 -66 lines changed

content/en/scenarios/5-understand-impact/1-build-application.md

Lines changed: 12 additions & 8 deletions
@@ -4,6 +4,10 @@ linkTitle: 5.1 Build the Sample Application
weight: 1
---

+{{% badge icon="clock" color="#ed0090" %}}10 minutes{{% /badge %}}
+
+## Introduction
+
For this workshop, we'll be using a microservices-based application. This application is for an online retailer and normally includes more than a dozen services. However, to keep the workshop simple, we'll be focusing on two services used by the retailer as part of their payment processing workflow: the credit check service and the credit processor service.

## Pre-requisites
@@ -14,9 +18,9 @@ You will start with an EC2 environment that already has some useful components,
* Deploy a load generator to send traffic to the services

## Initial Steps
-To begin the exercise you will need a Splunk Observablity Cloud environment that you can send data to. For this environment you'll need:
+To begin the exercise, you will need a **Splunk Observability Cloud** environment that you can send data to. For this environment you'll need:

-* The realm (i.e. us1)
+* The realm (e.g. `us1`)
* An access token

The initial setup can be completed by executing the following steps on the command line of your EC2 instance, which runs Ubuntu 22.04:
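Editor's note: the concrete setup commands fall outside the hunks shown in this diff. As a rough sketch only (assuming the workshop repository is `splunk/observability-workshop` on GitHub, and that the deployment scripts take the realm and access token; the variable names below are illustrative, not the workshop's actual ones), the initial steps look roughly like this:

````
# Clone the workshop repository and change into the tagging workshop directory
git clone https://github.com/splunk/observability-workshop.git
cd observability-workshop/workshop/tagging

# Make the Splunk Observability Cloud realm and access token available to the
# deployment scripts (variable names are illustrative)
export REALM=us1
export ACCESS_TOKEN=<your-access-token>

# List the numbered workshop scripts to see the deployment steps for this scenario
ls *.sh
````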
@@ -37,15 +41,15 @@ cd observability-workshop/workshop/tagging

## View your application in Splunk Observability Cloud

-Now that the setup is complete, let's confirm that it's sending data to Splunk Observability Cloud.
+Now that the setup is complete, let's confirm that it's sending data to **Splunk Observability Cloud**.

-Navigate to APM, then use the Environment dropdown to select your environment (i.e. tagging-workshop-name).
+Navigate to **APM**, then use the **Environment** dropdown to select your environment (e.g. `tagging-workshop-name`).

-If everything was deployed correctly, you should see creditprocessorservice and creditcheckservice displayed in the list of services:
+If everything was deployed correctly, you should see `creditprocessorservice` and `creditcheckservice` displayed in the list of services:

![APM Overview](../images/apm_overview.png)

-Click on Explore on the right-hand side to view the service map. We can see that the creditcheckservice makes calls to the creditprocessorservice, with an average response time of around 3.5 seconds:
+Click **Explore** on the right-hand side to view the service map. We can see that the `creditcheckservice` makes calls to the `creditprocessorservice`, with an average response time of around 3.5 seconds:

![Service Map](../images/service_map.png)

@@ -57,13 +61,13 @@ You'll also notice that some traces have errors:

![Traces](../images/traces_with_errors.png)

-Sort the traces by duration then click on one of the longer running traces. In this example, the trace took five seconds, and we can see that most of the time was spent calling the /runCreditCheck operation, which is part of the creditprocessorservice.
+Sort the traces by duration, then click on one of the longer-running traces. In this example, the trace took five seconds, and we can see that most of the time was spent calling the `/runCreditCheck` operation, which is part of the `creditprocessorservice`.

![Long Running Trace](../images/long_running_trace.png)

Currently, we don't have enough details in our traces to understand why some requests finish in a few milliseconds, and others take several seconds. To provide the best possible customer experience, this will be critical for us to understand.

-We also don't have enough information to understand why some requests result in errors, and others don't. For example, if we look at one of the error traces, we can see that the error occurs when the creditprocessorservice attempts to call another service named "otherservice". But why do some requests results in a call to otherservice, and others don't?
+We also don't have enough information to understand why some requests result in errors, and others don't. For example, if we look at one of the error traces, we can see that the error occurs when the `creditprocessorservice` attempts to call another service named `otherservice`. But why do some requests result in a call to `otherservice`, and others don't?

![Error Trace](../images/error_trace.png)

content/en/scenarios/5-understand-impact/2-what-are-tags.md

Lines changed: 3 additions & 1 deletion
@@ -4,6 +4,8 @@ linkTitle: 5.2 What are Tags?
weight: 2
---

+{{% badge icon="clock" style="primary" %}}3 minutes{{% /badge %}}
+
To understand why some requests have errors or slow performance, we'll need to add context to our traces. We'll do this by adding tags.

## What are tags?
@@ -25,7 +27,7 @@ A note about terminology before we proceed. While this workshop is about **tags*

## Why are tags so important?

-Tags are essential for an application to be truly observable. As we saw with our credit score application, some users are having a great experience: fast with no errors. But other users get a slow experience or encounter errors.
+Tags are essential for an application to be truly observable. As we saw with our credit check service, some users are having a great experience: fast with no errors. But other users get a slow experience or encounter errors.

Tags add the context to the traces to help us understand why some users get a great experience and others don't. And powerful features in **Splunk Observability Cloud** utilize tags to help you jump quickly to root cause.

content/en/scenarios/5-understand-impact/3-capture-tags.md

Lines changed: 5 additions & 4 deletions
@@ -3,12 +3,13 @@ title: Capture Tags with OpenTelemetry
linkTitle: 5.3 Capture Tags with OpenTelemetry
weight: 3
---
+{{% badge icon="clock" style="primary" %}}15 minutes{{% /badge %}}

Let's add some tags to our traces, so we can find out why some customers receive a poor experience from our application.

## Identify Useful Tags

-We'll start by reviewing the code for the **credit_check** function of **creditcheckservice** (which can be found in the **main.py** file):
+We'll start by reviewing the code for the `credit_check` function of `creditcheckservice` (which can be found in the `main.py` file):

````
def credit_check():
@@ -28,13 +29,13 @@ def credit_check():

We can see that this function accepts a **customer number** as an input. This would be helpful to capture as part of a trace. What else would be helpful?

-Well, the **credit score** returned for this customer by the **creditprocessorservice** may be interesting (we want to ensure we don't capture any PII data though). It would also be helpful to capture the **credit score category**, and the **credit check result**.
+Well, the **credit score** returned for this customer by the `creditprocessorservice` may be interesting (we want to ensure we don't capture any PII data though). It would also be helpful to capture the **credit score category** and the **credit check result**.

Great, we've identified four tags to capture from this service that could help with our investigation. But how do we capture these?

## Capture Tags

-We start by adding importing the trace module by adding an import statement to the top of the creditcheckservice/main.py file:
+We start by importing the trace module, adding an import statement to the top of the `creditcheckservice/main.py` file:

````
import requests
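# --- Editor's note (illustrative sketch, not part of this commit's diff) ---
# The remaining changes to creditcheckservice/main.py fall outside this hunk.
# With the OpenTelemetry Python API, span attributes (tags) of the kind
# described above are typically captured on the current span like this; the
# attribute keys other than credit.score.category are illustrative guesses:
from opentelemetry import trace

def add_credit_check_tags(customer_num, credit_score, category, result):
    # Get the span created for this request by the auto-instrumentation
    current_span = trace.get_current_span()
    current_span.set_attribute("customer.num", customer_num)
    current_span.set_attribute("credit.score", credit_score)
    current_span.set_attribute("credit.score.category", category)
    current_span.set_attribute("credit.check.result", result)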
@@ -82,7 +83,7 @@ def credit_check():

## Redeploy Service

-Once these changes are made, let's run the following script to rebuild the Docker image used for creditcheckservice and redeploy it to our Kubernetes cluster:
+Once these changes are made, let's run the following script to rebuild the Docker image used for `creditcheckservice` and redeploy it to our Kubernetes cluster:

````
./5-redeploy-creditcheckservice.sh
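# --- Editor's note (illustrative sketch, not the actual workshop script) ---
# The body of 5-redeploy-creditcheckservice.sh falls outside this hunk. A
# rebuild-and-redeploy script of this kind typically does something along
# these lines (image name, tag, and deployment name are assumptions):
docker build -t creditcheckservice:latest ./creditcheckservice
kubectl rollout restart deployment/creditcheckservice
kubectl rollout status deployment/creditcheckservice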

content/en/scenarios/5-understand-impact/4-explore-trace-data.md

Lines changed: 5 additions & 3 deletions
@@ -4,11 +4,13 @@ linkTitle: 5.4 Explore Trace Data
weight: 4
---

+{{% badge icon="clock" style="primary" %}}5 minutes{{% /badge %}}
+
Now that we've captured several tags from our application, let's explore some of the trace data we've captured that includes this additional context, and see if we can identify what's causing poor user experience in some cases.

## Use Trace Analyzer

-Navigate to **APM**, then select **Traces**. This takes us to the **Trace Analyzer**, where we can add filters to search for traces of interest. For example, we can filter on traces where the credit score starts with "7":
+Navigate to **APM**, then select **Traces**. This takes us to the **Trace Analyzer**, where we can add filters to search for traces of interest. For example, we can filter on traces where the credit score starts with `7`:

![Credit Check Starts with Seven](../images/credit_score_starts_with_seven.png)

@@ -18,12 +20,12 @@ We can apply similar filters for the customer number, credit score category, and

## Explore Traces With Errors

-Let's remove the credit score filter and toggle "Errors only" to on, which results in a list of only those traces where an error occurred:
+Let's remove the credit score filter and toggle **Errors only** to on, which results in a list of only those traces where an error occurred:

![Traces with Errors Only](../images/traces_errors_only.png)

Click on a few of these traces, and look at the tags we captured. Do you notice any patterns?

If you found a pattern - great job! But keep in mind that this is a difficult way to troubleshoot, as it requires you to look through many traces and remember what you saw in each one to see if you can identify a pattern.

-Thankfully, Splunk Observability cloud provides a more efficient way to do this, which we'll explore next.
+Thankfully, **Splunk Observability Cloud** provides a more efficient way to do this, which we'll explore next.

content/en/scenarios/5-understand-impact/5-index-tags.md

Lines changed: 25 additions & 13 deletions
@@ -4,12 +4,14 @@ linkTitle: 5.5 Index Tags
weight: 5
---

+{{% badge icon="clock" style="primary" %}}5 minutes{{% /badge %}}
+
## Index Tags
To use advanced features in **Splunk Observability Cloud** such as **Tag Spotlight**, we'll need to first index one or more tags.

To do this, navigate to **Settings** -> **APM MetricSets**. Then click the **+ New MetricSet** button.

-Let's index the **credit.score.category** tag to start with, by filling in the following details:
+Let's index the `credit.score.category` tag to start with, by providing the following details:

![Create Troubleshooting MetricSet](../images/create_troubleshooting_metric_set.png)

@@ -19,35 +21,45 @@ The tag will appear in the list of **Pending MetricSets** while analysis is perf

![Pending MetricSets](../images/pending_metric_set.png)

-Once analysis is complete, click on the checkbox in the **Actions** column.
+Once analysis is complete, click on the checkmark in the **Actions** column.

![MetricSet Configuration Applied](../images/metricset_config_applied.png)

## How to choose tags for indexing

-Why did we choose to index the **credit.score.category** tag and not the others?
+Why did we choose to index the `credit.score.category` tag and not the others?

-To understand this, it’s helpful to understand the primary use cases for attributes:
+To understand this, let's review the primary use cases for attributes:

* Filtering
* Grouping

-With the filtering use case, we can use the Trace Analyzer capability of Splunk Observability Cloud to filter on traces that match a particular attribute value. We saw an example of this earlier, when we filtered on traces where the credit score started with "7". Or if a customer called in to complain about slow service, we could use Trace Analyzer to locate all traces with their particular cusotmer number.
+### Filtering
+
+With the filtering use case, we can use the **Trace Analyzer** capability of **Splunk Observability Cloud** to filter on traces that match a particular attribute value.
+
+We saw an example of this earlier, when we filtered on traces where the credit score started with "7".
+
+Or if a customer called in to complain about slow service, we could use **Trace Analyzer** to locate all traces with that particular customer number.
+
+Attributes used for filtering use cases are generally high-cardinality, meaning that there could be thousands or even hundreds of thousands of unique values. In fact, **Splunk Observability Cloud** can handle an effectively infinite number of unique attribute values! Filtering using these attributes allows us to rapidly locate the traces of interest.
+
+Note that we aren't required to index tags to use them for filtering with **Trace Analyzer**.

-To use Trace Analyzer, it's not necessary to index tags.
+### Grouping

-On the other hand, with the grouping use case, we can surface trends for attributes that we collect using the powerful Tag Spotlight feature in Splunk Observability Cloud, which we'll see in action shortly.
+With the grouping use case, we can surface trends for attributes that we collect using the powerful **Tag Spotlight** feature in **Splunk Observability Cloud**, which we'll see in action shortly.

-Applying grouping to our trace data allows us to rapidly surface trends and identify patterns.
+Attributes used for grouping use cases should be low to medium-cardinality, with hundreds of unique values.

-Attributes used for grouping should be low to medium-cardinality, with hundreds of unique values. For custom attributes to be used with Tag Spotlight, they first need to be indexed.
+For custom attributes to be used with **Tag Spotlight**, they first need to be indexed.

-So we decided to index the **credit.score.category** tag because it has a few distinct values that would be useful for grouping. In contract, the customer number and credit score tags can have hundreds or thousands of values, and are more valuable for filtering rather than grouping.
+We decided to index the `credit.score.category` tag because it has a few distinct values that would be useful for grouping. In contrast, the customer number and credit score tags have hundreds or thousands of unique values, and are more valuable for filtering use cases rather than grouping.

-## Troubleshooting vs. Monitoring Metric Sets
+## Troubleshooting vs. Monitoring MetricSets

-You may have noticed that, to index this tag, we created something called a **Troubleshooting Metric Set**. It's named this was because a Troubleshooting Metric Set, or TMS, allows us to troubleshoot features with this tag using features such as Tag Spotlight, which we'll explore next.
+You may have noticed that, to index this tag, we created something called a **Troubleshooting MetricSet**. It's named this way because a Troubleshooting MetricSet, or TMS, allows us to troubleshoot issues with this tag using features such as **Tag Spotlight**.

-You may have also noticed that there's another option, which we didn't choose, which is called a **Monitoring Metric Set**. Monitoring Metric Sets go beyond troubleshooting and allow us to use tags for alerts, dashboards, and more. We'll explore this later in the workshop.
+You may have also noticed that there's another option, which we didn't choose, called a **Monitoring MetricSet** (or MMS). Monitoring MetricSets go beyond troubleshooting and allow us to use tags for alerting and dashboards. We'll explore this later in the workshop.
