Commit 25d8845
Avoid counting snapshot failures twice in SLM (elastic#136759) (elastic#136849)
We came across a scenario where 3 snapshot failures were counted as 5
"invocations since last success", resulting in a premature yellow SLM
health indicator. The three snapshot failures completed at virtually the
same time. Our theory is that the listener of the first snapshot failure
already processed the other two snapshot failures (incrementing the
`invocationsSinceLastSuccess`), but the listeners of those other two
snapshots then incremented that field too. There were two warning logs
indicating that the snapshots weren't found in the registered set,
confirming our hypothesis.
We simply avoid incrementing `invocationsSinceLastSuccess` if the
listener failed with an exception and the snapshot isn't registered
anymore, assuming that another listener has already incremented the
field.
1 parent 822f5dc commit 25d8845
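The arithmetic behind the double count (3 failures reported as 5) and the effect of the guard can be illustrated with a minimal, self-contained sketch. This is not the code from the commit: the class `SlmFailureCountSketch`, the method `countFailures`, and the snapshot names are hypothetical, and the model assumes, as the commit message theorizes, that the first listener to run processes every registered failure before the remaining listeners fire.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the suspected race; names are illustrative only.
public class SlmFailureCountSketch {

    // Returns the resulting "invocations since last success" count after all
    // failure listeners have run, with or without the guard from this commit.
    public static int countFailures(List<String> failedSnapshots, boolean skipWhenUnregistered) {
        Set<String> registered = new HashSet<>(failedSnapshots);
        int invocationsSinceLastSuccess = 0;
        boolean firstListener = true;

        for (String snapshot : failedSnapshots) {
            if (firstListener) {
                // Assumption: the first listener already accounts for every
                // registered failure and deregisters all of them.
                invocationsSinceLastSuccess += registered.size();
                registered.clear();
                firstListener = false;
                continue;
            }
            boolean stillRegistered = registered.contains(snapshot);
            if (skipWhenUnregistered && stillRegistered == false) {
                // Guard: assume another listener has already counted this failure.
                continue;
            }
            // Without the guard, the late listener counts the failure again.
            invocationsSinceLastSuccess++;
        }
        return invocationsSinceLastSuccess;
    }

    public static void main(String[] args) {
        List<String> failed = List.of("snap-a", "snap-b", "snap-c");
        System.out.println(countFailures(failed, false)); // 5: 3 from the first listener, plus 2
        System.out.println(countFailures(failed, true));  // 3: late listeners skip the increment
    }
}
```

Under these assumptions, the unguarded path yields 3 + 2 = 5, matching the observed count, while the guarded path yields 3.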
File tree
3 files changed, +14 -3 lines changed
- docs/changelog
- x-pack/plugin/slm/src
- main/java/org/elasticsearch/xpack/slm
- test/java/org/elasticsearch/xpack/slm
Lines changed: 5 additions & 0 deletions
New file: lines 1-5 added.
Lines changed: 7 additions & 2 deletions
Hunk at original lines 497-503: line 500 replaced by new lines 500-501.
Hunk at original lines 564-570: line 567 replaced by new lines 568-572.
Lines changed: 2 additions & 1 deletion
Hunk at original lines 528-534: line 531 replaced by new lines 531-532.