Skip to content

Commit 6eacf4f

Browse files
committed
(Doc+) Slow Logs
1 parent 2c5efd2 commit 6eacf4f

File tree

2 files changed

+167
-77
lines changed

2 files changed

+167
-77
lines changed

docs/reference/how-to/search-speed.asciidoc

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -567,3 +567,9 @@ also possible to update the client-side logic in order to route queries to the
567567
relevant indices based on filters. However `constant_keyword` makes it
568568
transparently and allows to decouple search requests from the index topology in
569569
exchange of very little overhead.
570+
571+
[discrete]
572+
=== Default search timeout
573+
574+
By default, search requests do not time out. To set a best-effort duration
575+
limit, see setting <<search-timeout,`search.default_search_timeout`>>.
Lines changed: 161 additions & 77 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,68 @@
11
[[index-modules-slowlog]]
22
== Slow Log
33

4+
Slow logs are an industry practice to record database searching and indexing
5+
events with execution durations above specified thresholds. These logs are
6+
helpful for look-back analyses such as investigating
7+
<<task-queue-backlog,historical long-running tasks>>.
8+
9+
Slow logs report task duration on the shard-level for searches and index-level
10+
for indexing, but may not encompass the full task execution time noticed
11+
client-side. For example, slow logs would not surface HTTP network delays or
12+
<<task-queue-backlog,task queues>>.
13+
14+
Slow log events are emitted into <<logging,{es} logging>> under the
15+
`fileset.name` of `slowlog` for threshold qualifying searching and indexing
16+
events. These logs can be viewed respectively for searches and indexing either
17+
18+
* if <<monitoring-overview,{es} monitoring>> is enabled, from
19+
{kibana-ref}/xpack-monitoring.html[Stack Monitoring] under `logger` set to
20+
`index.search.slowlog` and `index.indexing.slowlog`.
21+
22+
* from local {es} service logs directory, under dedicated files suffixing
23+
`_index_search_slowlog.json` and `_index_indexing_slowlog.json`.
24+
25+
[discrete]
26+
[[enable-slow-log]]
27+
=== Enable
28+
29+
Slow logs are <<index-modules,dynamic index settings>> meaning they can be
30+
updated either for all indices under the <<settings,{es} `log4j2.properties`
31+
configuration file>> upon node restart or without restart per index per
32+
index via <<indices-update-settings,update indices settings>>. By default
33+
thresholds are disabled (set to `-1`).
34+
35+
Slow log thresholds can be enabled for the four logging levels: `trace`, `debug`,
36+
`info`, and `warn`. You can mimic setting log level thresholds by disabling
37+
more verbose levels.
38+
39+
To see all live, overrode slow log settings, run
40+
<<indices-get-settings,get index settings API>>:
41+
42+
[source,console]
43+
--------------------------------------------------
44+
GET _all/_settings?expand_wildcards=all&filter_path=*.settings.index.*.slowlog
45+
--------------------------------------------------
46+
47+
If a call was initiated with an `X-Opaque-ID` header, then the ID is included
48+
in Search Slow logs in the **elasticsearch.slowlog.id** field. See
49+
<<x-opaque-id,X-Opaque-Id HTTP header>> for details and best practices.
50+
51+
Enable the respective `index.search.slowlog.include.user` or
52+
`index.indexing.slowlog.include.user` settings to append `user.*` and
53+
`auth.type` fields into slow log for information about the user who triggered
54+
the request.
55+
456
[discrete]
557
[[search-slow-log]]
6-
=== Search Slow Log
58+
==== Searching
759

8-
Shard level slow search log allows to log slow search (query and fetch
9-
phases) into a dedicated log file.
60+
Search slow logs emit per shard. They can be enabled for both the shard's
61+
link:https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch[query
62+
or fetch search phases].
1063

11-
Thresholds can be set for both the query phase of the execution, and
12-
fetch phase, here is a sample:
64+
Here is a sample to set all available options across all indices via the
65+
<<settings,`log4j2.properties` configuration file>>:
1366

1467
[source,yaml]
1568
--------------------------------------------------
@@ -22,10 +75,12 @@ index.search.slowlog.threshold.fetch.warn: 1s
2275
index.search.slowlog.threshold.fetch.info: 800ms
2376
index.search.slowlog.threshold.fetch.debug: 500ms
2477
index.search.slowlog.threshold.fetch.trace: 200ms
78+
79+
index.search.slowlog.include.user: true
2580
--------------------------------------------------
2681

27-
All of the above settings are _dynamic_ and can be set for each index using the
28-
<<indices-update-settings, update indices settings>> API. For example:
82+
Listing the same available options as dynamically modified via
83+
<<indices-update-settings,update indices settings>>:
2984

3085
[source,console]
3186
--------------------------------------------------
@@ -38,47 +93,24 @@ PUT /my-index-000001/_settings
3893
"index.search.slowlog.threshold.fetch.warn": "1s",
3994
"index.search.slowlog.threshold.fetch.info": "800ms",
4095
"index.search.slowlog.threshold.fetch.debug": "500ms",
41-
"index.search.slowlog.threshold.fetch.trace": "200ms"
42-
}
43-
--------------------------------------------------
44-
// TEST[setup:my_index]
45-
46-
By default thresholds are disabled (set to `-1`).
47-
48-
The logging is done on the shard level scope, meaning the execution of a
49-
search request within a specific shard. It does not encompass the whole
50-
search request, which can be broadcast to several shards in order to
51-
execute. Some of the benefits of shard level logging is the association
52-
of the actual execution on the specific machine, compared with request
53-
level.
54-
55-
56-
The search slow log file is configured in the `log4j2.properties` file.
57-
58-
[discrete]
59-
==== Identifying search slow log origin
60-
61-
It is often useful to identify what triggered a slow running query.
62-
To include information about the user that triggered a slow search,
63-
use the `index.search.slowlog.include.user` setting.
64-
65-
[source,console]
66-
--------------------------------------------------
67-
PUT /my-index-000001/_settings
68-
{
96+
"index.search.slowlog.threshold.fetch.trace": "200ms",
6997
"index.search.slowlog.include.user": true
7098
}
7199
--------------------------------------------------
72100
// TEST[setup:my_index]
73101

74-
This will result in user information being included in the slow log.
102+
Here is an example emitted search slow log:
75103

76104
[source,js]
77105
---------------------------
78106
{
79-
"@timestamp": "2024-02-21T12:42:37.255Z",
80-
"log.level": "WARN",
107+
"@timestamp": "2024-12-21T12:42:37.255Z",
81108
"auth.type": "REALM",
109+
"ecs.version": "1.2.0",
110+
"elasticsearch.cluster.name": "distribution_run",
111+
"elasticsearch.cluster.uuid": "Ui23kfF1SHKJwu_hI1iPPQ",
112+
"elasticsearch.node.id": "JK-jn-XpQ3OsDUsq5ZtfGg",
113+
"elasticsearch.node.name": "node-0",
82114
"elasticsearch.slowlog.id": "tomcat-123",
83115
"elasticsearch.slowlog.message": "[index6][0]",
84116
"elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
@@ -88,46 +120,43 @@ This will result in user information being included in the slow log.
88120
"elasticsearch.slowlog.took_millis": 0,
89121
"elasticsearch.slowlog.total_hits": "1 hits",
90122
"elasticsearch.slowlog.total_shards": 1,
91-
"user.name": "elastic",
92-
"user.realm": "reserved",
93-
"ecs.version": "1.2.0",
94-
"service.name": "ES_ECS",
95123
"event.dataset": "elasticsearch.index_search_slowlog",
96-
"process.thread.name": "elasticsearch[runTask-0][search][T#5]",
124+
"fileset.name" : "slowlog",
125+
"log.level": "WARN",
97126
"log.logger": "index.search.slowlog.query",
98-
"elasticsearch.cluster.uuid": "Ui23kfF1SHKJwu_hI1iPPQ",
99-
"elasticsearch.node.id": "JK-jn-XpQ3OsDUsq5ZtfGg",
100-
"elasticsearch.node.name": "node-0",
101-
"elasticsearch.cluster.name": "distribution_run"
127+
"process.thread.name": "elasticsearch[runTask-0][search][T#5]",
128+
"service.name": "ES_ECS",
129+
"user.name": "elastic",
130+
"user.realm": "reserved"
102131
}
103132
104133
---------------------------
105134
// NOTCONSOLE
106135

107-
If a call was initiated with an `X-Opaque-ID` header, then the ID is included
108-
in Search Slow logs in the **elasticsearch.slowlog.id** field. See
109-
<<x-opaque-id, X-Opaque-Id HTTP header>> for details and best practices.
110136

111137
[discrete]
112138
[[index-slow-log]]
113-
=== Index Slow log
139+
==== Indexing
114140

115-
The indexing slow log, similar in functionality to the search slow
116-
log. The log file name ends with `_index_indexing_slowlog.json`. Log and
117-
the thresholds are configured in the same way as the search slowlog.
118-
Index slowlog sample:
141+
Indexing slow logs emit per index document. Here is a sample to set all
142+
available options across all indices via the <<settings,`log4j2.properties`
143+
configuration file>>:
119144

120145
[source,yaml]
121146
--------------------------------------------------
122147
index.indexing.slowlog.threshold.index.warn: 10s
123148
index.indexing.slowlog.threshold.index.info: 5s
124149
index.indexing.slowlog.threshold.index.debug: 2s
125150
index.indexing.slowlog.threshold.index.trace: 500ms
151+
126152
index.indexing.slowlog.source: 1000
153+
index.indexing.slowlog.reformat: true
154+
155+
index.indexing.slowlog.include.user: true
127156
--------------------------------------------------
128157

129-
All of the above settings are _dynamic_ and can be set for each index using the
130-
<<indices-update-settings, update indices settings>> API. For example:
158+
Listing the same available options as dynamically modified via
159+
<<indices-update-settings,update indices settings>>:
131160

132161
[source,console]
133162
--------------------------------------------------
@@ -137,24 +166,14 @@ PUT /my-index-000001/_settings
137166
"index.indexing.slowlog.threshold.index.info": "5s",
138167
"index.indexing.slowlog.threshold.index.debug": "2s",
139168
"index.indexing.slowlog.threshold.index.trace": "500ms",
140-
"index.indexing.slowlog.source": "1000"
141-
}
142-
--------------------------------------------------
143-
// TEST[setup:my_index]
144-
145-
To include information about the user that triggered a slow indexing event,
146-
use the `index.indexing.slowlog.include.user` setting.
147-
148-
[source,console]
149-
--------------------------------------------------
150-
PUT /my-index-000001/_settings
151-
{
169+
"index.indexing.slowlog.source": "1000",
170+
"index.indexing.slowlog.reformat": true,
152171
"index.indexing.slowlog.include.user": true
153172
}
154173
--------------------------------------------------
155174
// TEST[setup:my_index]
156175

157-
By default Elasticsearch will log the first 1000 characters of the _source in
176+
By default {es} will log the first 1000 characters of the `_source` in
158177
the slowlog. You can change that with `index.indexing.slowlog.source`. Setting
159178
it to `false` or `0` will skip logging the source entirely, while setting it to
160179
`true` will log the entire source regardless of size. The original `_source` is
@@ -163,13 +182,78 @@ the original document format is important, you can turn off reformatting by sett
163182
`index.indexing.slowlog.reformat` to `false`, which will cause the source to be
164183
logged "as is" and can potentially span multiple log lines.
165184

166-
The index slow log file is configured in the `log4j2.properties` file.
185+
Here is an example emitted indexing slow log:
186+
187+
[source,js]
188+
---------------------------
189+
{
190+
"@timestamp" : "2024-12-11T22:34:22.613Z",
191+
"auth.type": "REALM",
192+
"ecs.version": "1.2.0",
193+
"elasticsearch.cluster.name" : "41bd111609d849fc9bf9d25b5df9ce96",
194+
"elasticsearch.cluster.uuid" : "BZTn4I9URXSK26imlia0QA",
195+
"elasticsearch.index.id" : "3VfGR7wRRRKmMCEn7Ii58g",
196+
"elasticsearch.index.name": "my-index-000001",
197+
"elasticsearch.node.id" : "GGiBgg21S3eqPDHzQiCMvQ",
198+
"elasticsearch.node.name" : "instance-0000000001",
199+
"elasticsearch.slowlog.id" : "RCHbt5MBT0oSsCOu54AJ",
200+
"elasticsearch.slowlog.source": "{\"key\":\"value\"}"
201+
"elasticsearch.slowlog.took" : "0.01ms",
202+
"event.dataset": "elasticsearch.index_indexing_slowlog",
203+
"fileset.name" : "slowlog",
204+
"log.level" : "TRACE",
205+
"log.logger" : "index.indexing.slowlog.index",
206+
"service.name" : "ES_ECS",
207+
"user.name": "elastic",
208+
"user.realm": "reserved"
209+
}
210+
211+
---------------------------
212+
// NOTCONSOLE
167213

168214
[discrete]
169-
=== Slow log levels
215+
[[troubleshoot-slow-log]]
216+
=== Troubleshooting
217+
218+
Recording slow logs may potentially be resource intensive to your {es} cluster
219+
depending on the qualifying traffic's volume. For example, emitted logs may
220+
increase the index disk usage of your <<monitoring-overview,{es} monitoring>>
221+
cluster. We generally recommend enabling them against specific indices rather
222+
than across all indices, with high thresholds, and usually only during
223+
troubleshooting.
224+
225+
Slow log thresholds being met do not guarantee cluster performance issues may
226+
be experienced. In the event that symptoms are noticed, slow logs can provide
227+
helpful data to diagnosis upstream traffic patterns or source to curb
228+
problematic traffic client-side. For example, from data included in `X-Opaque-ID`,
229+
the `_source` request body, or `user.*` fields. This is similar to
230+
troubleshooting <<task-queue-backlog,live expensive tasks>>. For related feature
231+
optimization information, see <<tune-for-search-speed,tune for search speed>>
232+
and <<tune-for-indexing-speed,tune for indexing speed>.
233+
234+
If you are uncertain where to begin investigating problematic traffic, we
235+
recommend dynamically enabling via <<indices-update-settings,update indices
236+
settings>> with high thresholds against both slow logs:
237+
238+
[source,console]
239+
--------------------------------------------------
240+
PUT _all/_settings
241+
{
242+
"index.indexing.slowlog.include.user": true
243+
"index.indexing.slowlog.threshold.index.warn": "30s",
244+
"index.search.slowlog.include.user": true
245+
"index.search.slowlog.threshold.fetch.warn": "30s",
246+
"index.search.slowlog.threshold.query.warn": "30s",
247+
}
248+
--------------------------------------------------
249+
250+
You may also consider searches flagged for their query durations, preferably
251+
in a non-production environment, using the <<search-profile,profile api>> in
252+
order to investigate optimization options via the
253+
link:{kibana-ref}/xpack-profiler.html[query profiler].
170254

171-
You can mimic the search or indexing slow log level by setting appropriate
172-
threshold making "more verbose" loggers to be switched off.
173-
If for instance we want to simulate `index.indexing.slowlog.level: INFO`
174-
then all we need to do is to set
175-
`index.indexing.slowlog.threshold.index.debug` and `index.indexing.slowlog.threshold.index.trace` to `-1`.
255+
Slow logging checks the event against reporting threshold upon its completion.
256+
Therefore it cannot report if events trigger <<circuit-breaker-errors,circuit
257+
breaker errors>>. If you are troubleshooting this ballpark, you may want to also
258+
consider enabling <<enable-audit-logging,audit logging>> which logs events
259+
before they are executed.

0 commit comments

Comments
 (0)