Commit 3e7f7f4

[ML] Adding pytorch oom to known issues (#110668)
* Adding pytorch oom to known issues
* Fixing section
* Updating text to exclude the pytorch version
1 parent: 816cedc

File tree: 9 files changed (+56, −14 lines)


docs/reference/release-notes/8.13.0.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -28,6 +28,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[breaking-8.13.0]]
 [float]
 === Breaking changes
@@ -464,5 +470,3 @@ Search::
 * Upgrade to Lucene 9.9.0 {es-pull}102782[#102782]
 * Upgrade to Lucene 9.9.1 {es-pull}103387[#103387]
 * Upgrade to Lucene 9.9.2 {es-pull}104753[#104753]
-
-
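As an illustrative aside (not part of this commit), the known-issue text refers to the Linux OOM Killer, which records each kill in the kernel log. A sketch of how an operator might confirm that `pytorch_inference` was terminated for memory pressure, assuming a Linux host with readable kernel logs:

```shell
# Hypothetical check, not from the docs: the OOM Killer writes an
# "Out of memory: Killed process ..." line to the kernel ring buffer.
# Filter those entries for the pytorch_inference process.
dmesg --ctime 2>/dev/null | grep -iE 'out of memory|oom-kill' | grep pytorch_inference
```

On systemd hosts, `journalctl -k` exposes the same kernel messages and can be substituted for `dmesg`.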

docs/reference/release-notes/8.13.1.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -13,6 +13,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.13.1]]
 [float]
 === Bug fixes
@@ -45,5 +51,3 @@ Transform::
 
 Transform::
 * Raise loglevel of events related to transform lifecycle from DEBUG to INFO {es-pull}106602[#106602]
-
-

docs/reference/release-notes/8.13.2.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -13,6 +13,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.13.2]]
 [float]
 === Bug fixes
@@ -46,5 +52,3 @@ Packaging::
 Security::
 * Query API Key Information API support for the `typed_keys` request parameter {es-pull}106873[#106873] (issue: {es-issue}106817[#106817])
 * Query API Keys support for both `aggs` and `aggregations` keywords {es-pull}107054[#107054] (issue: {es-issue}106839[#106839])
-
-

docs/reference/release-notes/8.13.3.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -20,6 +20,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.13.3]]
 [float]
 === Bug fixes
@@ -52,5 +58,3 @@ Search::
 
 ES|QL::
 * ESQL: Introduce language versioning to REST API {es-pull}106824[#106824]
-
-

docs/reference/release-notes/8.13.4.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -13,6 +13,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.13.4]]
 [float]
 === Bug fixes
@@ -28,5 +34,3 @@ Snapshot/Restore::
 
 TSDB::
 * Fix tsdb codec when doc-values spread in two blocks {es-pull}108276[#108276]
-
-

docs/reference/release-notes/8.14.0.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -22,6 +22,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.14.0]]
 [float]
 === Bug fixes
@@ -356,5 +362,3 @@ Network::
 
 Packaging::
 * Update bundled JDK to Java 22 (again) {es-pull}108654[#108654]
-
-

docs/reference/release-notes/8.14.1.asciidoc

Lines changed: 6 additions & 2 deletions
@@ -14,6 +14,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.14.1]]
 [float]
 === Bug fixes
@@ -42,5 +48,3 @@ Vector Search::
 
 Infra/Settings::
 * Add remove index setting command {es-pull}109276[#109276]
-
-

docs/reference/release-notes/8.14.2.asciidoc

Lines changed: 6 additions & 0 deletions
@@ -13,6 +13,12 @@ If your cluster is running on ECK 2.12.1 and above, this may cause problems with
 To resolve this issue, perform a rolling restart on the non-master-eligible nodes once all Elasticsearch nodes
 are upgraded.
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
+
 [[bug-8.14.2]]
 [float]
 === Bug fixes

docs/reference/release-notes/8.15.0.asciidoc

Lines changed: 8 additions & 0 deletions
@@ -5,4 +5,12 @@ coming[8.15.0]
 
 Also see <<breaking-changes-8.15,Breaking changes in 8.15>>.
 
+[[known-issues-8.15.0]]
+[float]
+=== Known issues
 
+* The `pytorch_inference` process used to run Machine Learning models can consume large amounts of memory.
+In environments where the available memory is limited, the OS Out of Memory Killer will kill the `pytorch_inference`
+process to reclaim memory. This can cause inference requests to fail.
+Elasticsearch will automatically restart the `pytorch_inference` process
+after it is killed up to four times in 24 hours. (issue: {es-issue}110530[#110530])
