Commit a0e44d3

add find latency and failure correlations
1 parent 6e6eb3c commit a0e44d3

2 files changed

+71
-9
lines changed

raw-migrated-files/observability-docs/observability/apm-lambda.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ Cold start is also displayed in the trace waterfall, where you can drill-down in

 ### Latency distribution correlation [apm-lambda-cold-start-latency]

-The [latency correlations](../../../solutions/observability/apps/find-transaction-latency-failure-correlations.md#correlations-latency) feature can be used to visualize the impact of Lambda cold starts on latency—just select the `faas.coldstart` field.
+The [latency correlations](../../../solutions/observability/apps/find-transaction-latency-failure-correlations.md#observability-apm-find-transaction-latency-and-failure-correlations) feature can be used to visualize the impact of Lambda cold starts on latency—just select the `faas.coldstart` field.

 :::{image} ../../../images/observability-lambda-correlations.png
 :alt: lambda correlations example

solutions/observability/apps/find-transaction-latency-failure-correlations.md

Lines changed: 70 additions & 8 deletions
@@ -4,17 +4,79 @@ mapped_urls:
 - https://www.elastic.co/guide/en/serverless/current/observability-apm-find-transaction-latency-and-failure-correlations.html
 ---

-# Find transaction latency and failure correlations

-% What needs to be done: Align serverless/stateful
+# Find transaction latency and failure correlations [observability-apm-find-transaction-latency-and-failure-correlations]

-% Use migrated content from existing pages that map to this page:
+Correlations surface attributes of your data that are potentially correlated with high-latency or erroneous transactions. For example, if you are a site reliability engineer who is responsible for keeping production systems up and running, you want to understand what is causing slow transactions. Identifying attributes that are responsible for higher latency transactions can potentially point you toward the root cause. You may find a correlation with a particular piece of hardware, like a host or pod. Or, perhaps a set of users, based on IP address or region, is facing increased latency due to local data center issues.

-% - [ ] ./raw-migrated-files/observability-docs/observability/apm-correlations.md
-% - [ ] ./raw-migrated-files/docs-content/serverless/observability-apm-find-transaction-latency-and-failure-correlations.md
+To find correlations:

-% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc):
+::::{tab-set}
+:group: stack-serverless

-$$$correlations-latency$$$
+:::{tab-item} Elastic Stack v9
+:sync: stack

-$$$observability-apm-find-transaction-latency-and-failure-correlations-find-high-transaction-latency-correlations$$$
+Select a service on the **Services** page in the Applications UI then select a transaction group from the **Transactions** tab.
+
+::::{note}
+Queries within the Applications UI are also applied to the correlations.
+::::
+
+:::
+
+:::{tab-item} Serverless
+:sync: serverless
+
+1. In your {{obs-serverless}} project, go to **Applications** → **Service Inventory**.
+2. Select a service.
+3. Select the **Transactions** tab.
+4. Select a transaction group in the **Transactions** table.
+
+::::{note}
+Active queries *are* applied to correlations.
+
+::::
+
+:::
+
+::::
+
+
+## Find high transaction latency correlations [observability-apm-find-transaction-latency-and-failure-correlations-find-high-transaction-latency-correlations]
+
+The correlations on the **Latency correlations** tab help you discover which attributes are contributing to increased transaction latency.
+
+:::{image} ../../../images/observability-correlations-hover.png
+:alt: APM latency correlations
+:class: screenshot
+:::
+
+The progress bar indicates the status of the asynchronous analysis, which performs statistical searches across a large number of attributes. For large time ranges and services with high transaction throughput, this might take some time. To improve performance, reduce the time range.
+
+The latency distribution chart visualizes the overall latency of the transactions in the transaction group. If there are attributes that have a statistically significant correlation with slow response times, they are listed in a table below the chart. The table is sorted by correlation coefficients that range from 0 to 1. Attributes with higher correlation values are more likely to contribute to high latency transactions. By default, the attribute with the highest correlation value is added to the chart. To see the latency distribution for other attributes, select their row in the table.
+
+If a correlated attribute seems noteworthy, use the **Filter** quick links:
+
+* `+` creates a new query in the Applications UI for filtering transactions containing the selected value.
+* `-` creates a new query in the Applications UI to filter out transactions containing the selected value.
+
+You can also click the icon beside the field name to view and filter its most popular values.
+
+In this example screenshot, there are transactions that are skewed to the right with slower response times than the overall latency distribution. If you select the `+` filter in the appropriate row of the table, it creates a new query in the Applications UI for transactions with this attribute. With the "noise" now filtered out, you can begin viewing sample traces to continue your investigation.
+
+
+## Find failed transaction correlations [correlations-error-rate]
+
+The correlations on the **Failed transaction correlations** tab help you discover which attributes are most influential in distinguishing between transaction failures and successes. In this context, the success or failure of a transaction is determined by its [event.outcome](https://www.elastic.co/guide/en/ecs/current/ecs-event.html#field-event-outcome) value. For example, APM agents set the `event.outcome` to `failure` when an HTTP transaction returns a `5xx` status code.
+
+The chart highlights the failed transactions in the overall latency distribution for the transaction group. If there are attributes that have a statistically significant correlation with failed transactions, they are listed in a table. The table is sorted by scores, which are mapped to high, medium, or low impact levels. Attributes with high impact levels are more likely to contribute to failed transactions. By default, the attribute with the highest score is added to the chart. To see a different attribute in the chart, select its row in the table.
+
+For example, in the screenshot below, there are attributes such as a specific node and pod name that have medium impact on the failed transactions.
+
+:::{image} ../../../images/observability-correlations-failed-transactions.png
+:alt: Failed transaction correlations
+:class: screenshot
+:::
+
+Select the `+` filter to create a new query in the Applications UI for transactions with one or more of these attributes. If you are unfamiliar with a field, click the icon beside its name to view its most popular values and optionally filter on those values too. Each time that you add another attribute, it is filtering out more and more noise and bringing you closer to a diagnosis.
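
For readers of the added text on failed transaction correlations, the idea that `event.outcome` separates failures from successes, and that some attribute values line up with failures more than others, can be illustrated with a minimal Python sketch. This is not the scoring the Applications UI performs: the helpers `outcome_for_status` and `failure_rate_by_value`, the `status_code` key, and the sample pod names are all hypothetical, and the only rule taken from the text is that a `5xx` HTTP status maps to `failure`.

```python
from collections import defaultdict


def outcome_for_status(status_code: int) -> str:
    """Illustrative rule from the text: a 5xx HTTP status counts as a failure."""
    return "failure" if 500 <= status_code <= 599 else "success"


def failure_rate_by_value(transactions, field):
    """Return {attribute value: failed / total} for one field (hypothetical helper)."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for tx in transactions:
        value = tx.get(field)
        if value is None:
            continue  # skip transactions that do not carry this attribute
        totals[value] += 1
        if outcome_for_status(tx["status_code"]) == "failure":
            failures[value] += 1
    return {value: failures[value] / totals[value] for value in totals}


# Made-up sample data, for illustration only.
transactions = [
    {"status_code": 200, "kubernetes.pod.name": "pod-a"},
    {"status_code": 503, "kubernetes.pod.name": "pod-b"},
    {"status_code": 500, "kubernetes.pod.name": "pod-b"},
    {"status_code": 404, "kubernetes.pod.name": "pod-a"},
]
rates = failure_rate_by_value(transactions, "kubernetes.pod.name")
```

In this sample every failure comes from `pod-b`, which is the kind of attribute value one would then narrow down on with the `+` filter described above.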
