Avoid counting snapshot failures twice in SLM (#136759) (#136849)
We came across a scenario where 3 snapshot failures were counted as 5
"invocations since last success", resulting in a premature yellow SLM
health indicator. The three snapshot failures completed at virtually the
same time. Our theory is that the listener of the first snapshot failure
already processed the other two snapshot failures (incrementing the
`invocationsSinceLastSuccess`), but the listeners of those other two
snapshots then incremented that field too. There were two warning logs
indicating that the snapshots weren't found in the registered set,
confirming our hypothesis.
We simply avoid incrementing `invocationsSinceLastSuccess` if the
listener failed with an exception and the snapshot isn't registered
anymore, on the assumption that another listener has already incremented the
field.
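
The guard described above can be illustrated with a minimal, single-threaded sketch. This is not the actual SLM code: the class `SlmFailureCounter`, its methods, and the `alsoFailed` parameter are hypothetical names made up for illustration, and only `invocationsSinceLastSuccess` comes from the change itself. It assumes the first listener also processes the other failed snapshots and removes them from the registered set.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the failure bookkeeping; not the real Elasticsearch classes.
final class SlmFailureCounter {
    private final Set<String> registered = new HashSet<>();
    private int invocationsSinceLastSuccess = 0;

    void register(String snapshot) {
        registered.add(snapshot);
    }

    // Invoked by the failure listener of `snapshot`. `alsoFailed` models the other
    // snapshots that this listener processes in the same pass (an assumption).
    void onFailure(String snapshot, List<String> alsoFailed) {
        if (!registered.remove(snapshot)) {
            // The snapshot is no longer registered: assume another listener
            // has already counted this failure, so do not count it again.
            return;
        }
        invocationsSinceLastSuccess++;
        for (String other : alsoFailed) {
            if (registered.remove(other)) {
                invocationsSinceLastSuccess++;
            }
        }
    }

    int invocationsSinceLastSuccess() {
        return invocationsSinceLastSuccess;
    }

    public static void main(String[] args) {
        SlmFailureCounter counter = new SlmFailureCounter();
        counter.register("snap-1");
        counter.register("snap-2");
        counter.register("snap-3");

        // The first listener handles all three failures at once.
        counter.onFailure("snap-1", List.of("snap-2", "snap-3"));
        // The remaining listeners fire afterwards and find nothing registered.
        counter.onFailure("snap-2", List.of());
        counter.onFailure("snap-3", List.of());

        System.out.println(counter.invocationsSinceLastSuccess()); // prints 3
    }
}
```

Without the early return, the last two calls would each increment the counter again, giving 5 instead of 3, which matches the scenario we observed.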