Commit f3f5615

docs: add notifications-improvements report for v3.1.0 (#994)
1 parent 7199e9f commit f3f5615

3 files changed: +100 -0 lines changed

docs/features/index-management/index-management.md

Lines changed: 4 additions & 0 deletions
@@ -180,6 +180,7 @@ PUT _plugins/_rollup/jobs/sample_rollup

 | Version | PR | Description |
 |---------|-----|-------------|
+| v3.1.0 | [#1413](https://github.com/opensearch-project/index-management/pull/1413) | Removed unnecessary user notifications for version conflict exception |
 | v3.0.0 | [#1198](https://github.com/opensearch-project/index-management/pull/1198) | Adding unfollow action in ISM for CCR |
 | v3.0.0 | [#1377](https://github.com/opensearch-project/index-management/pull/1377) | Target Index Settings for rollup |
 | v3.0.0 | [#1388](https://github.com/opensearch-project/index-management/pull/1388) | CVE fix: logback-core upgrade |
@@ -204,8 +205,11 @@ PUT _plugins/_rollup/jobs/sample_rollup
 - [Issue #1075](https://github.com/opensearch-project/index-management/issues/1075): ISM listener blocking Cluster Applier thread
 - [Issue #1213](https://github.com/opensearch-project/index-management/issues/1213): Feature request for mixed rollup/non-rollup search

+- [Issue #1371](https://github.com/opensearch-project/index-management/issues/1371): False positive notifications in Snapshot Management
+
 ## Change History

+- **v3.1.0** (2026-01-10): Fixed false positive notifications in Snapshot Management by suppressing user notifications for internal VersionConflictEngineException errors
 - **v3.0.0** (2025-05-06): Added ISM unfollow action for CCR, rollup target index settings, CVE fixes, Java Agent migration
 - **v2.18.0** (2024-11-05): Added `plugins.rollup.search.search_source_indices` setting to allow searching non-rollup and rollup indices together, UX improvements (refresh buttons, section header styling), transform API input validation, fixed snapshot status detection, fixed snapshot policy button reload, fixed data source initialization
 - **v2.17.0** (2024-09-17): Performance optimization for skip execution check using cluster service instead of NodesInfoRequest, security integration test fixes
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
# Notifications Improvements

## Summary

This bugfix eliminates false positive notifications in Snapshot Management by suppressing user notifications for internal `VersionConflictEngineException` errors that occur during concurrent snapshot operations. Users no longer receive misleading error notifications for race conditions that don't affect actual snapshot functionality.

## Details

### What's New in v3.1.0

The Snapshot Management state machine now handles `VersionConflictEngineException` gracefully by logging the error instead of throwing an exception that triggers user notifications.

### Technical Changes

#### Problem Background

When a manual snapshot policy runs, it creates and deletes snapshots based on configured cron jobs. These operations update state in the `.ism-config` system index. Concurrent operations can then race as follows:

1. A snapshot deletion starts and holds a lock on the system index
2. Another snapshot creation begins while the deletion is in progress
3. When the deletion completes, it fails to update metadata because of a version conflict
4. Previously, this triggered a notification to users despite the snapshot operation succeeding

```mermaid
sequenceDiagram
    participant User
    participant SM as Snapshot Management
    participant ISM as .ism-config Index
    participant Notif as Notifications

    SM->>ISM: Start snapshot deletion (seqNo=100)
    SM->>ISM: Start snapshot creation (seqNo=100)
    ISM-->>SM: Creation succeeds (seqNo=101)
    ISM-->>SM: Deletion metadata update fails (VersionConflict)
    Note over SM: Before: Exception thrown
    SM->>Notif: Send error notification
    Notif->>User: False alarm notification

    Note over SM: After v3.1.0: Exception handled
    SM->>SM: Log error, continue
    Note over User: No false alarm
```
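
The interleaving in the diagram can be replayed with a small in-memory model of sequence-number checks. This is a minimal sketch, not plugin code: `SeqNoConflictException`, `MetadataStore`, and `replayRace` are hypothetical names invented for illustration, and the store only imitates the "write succeeds if the expected seqNo still matches" rule.

```kotlin
// Hypothetical stand-in for VersionConflictEngineException.
class SeqNoConflictException(message: String) : RuntimeException(message)

// Hypothetical in-memory document: a write is accepted only when the
// caller's expected seqNo equals the stored one, then seqNo is bumped.
class MetadataStore(initialSeqNo: Long) {
    var seqNo: Long = initialSeqNo
        private set
    var value: String = "initial"
        private set

    @Synchronized
    fun update(expectedSeqNo: Long, newValue: String): Long {
        if (expectedSeqNo != seqNo) {
            throw SeqNoConflictException(
                "expected seqNo=$expectedSeqNo but current seqNo=$seqNo"
            )
        }
        value = newValue
        seqNo += 1
        return seqNo
    }
}

// Replays the diagram: both operations read seqNo=100, the creation
// writes first, and the deletion's later write is rejected.
fun replayRace(): List<String> {
    val events = mutableListOf<String>()
    val store = MetadataStore(initialSeqNo = 100)

    val seenByDeletion = store.seqNo // deletion reads the metadata
    val seenByCreation = store.seqNo // creation reads the same version

    val afterCreation = store.update(seenByCreation, "creation finished")
    events += "creation ok, seqNo=$afterCreation"

    try {
        store.update(seenByDeletion, "deletion finished")
        events += "deletion ok"
    } catch (ex: SeqNoConflictException) {
        events += "deletion conflict: ${ex.message}"
    }
    return events
}
```

Calling `replayRace()` records a successful creation followed by a rejected deletion update, which is the situation that used to surface as a user notification even though the snapshot work itself had completed.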

#### Code Changes

The fix modifies `SMStateMachine.kt` to catch and handle `VersionConflictEngineException` separately:

```kotlin
} catch (ex: Exception) {
    val unwrappedException = ExceptionsHelper.unwrapCause(ex) as Exception
    if (unwrappedException is VersionConflictEngineException) {
        // Don't throw the exception
        log.error("Version conflict exception while updating metadata.", ex)
        return
    }
    // Other exceptions still trigger notifications
    val smEx = SnapshotManagementException(ExceptionKey.METADATA_INDEXING_FAILURE, ex)
    log.error(smEx.message, ex)
    throw smEx
}
```
```
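
The excerpt above depends on plugin classes, so the same control flow is sketched here in a form that runs on its own. Everything named below is an assumption made for illustration: `StubVersionConflictException` and `StubMetadataIndexingException` stand in for the real exception types, `rootCause` is a simplified analogue of `ExceptionsHelper.unwrapCause` (it follows every cause link, which the real helper may not do), and `updateMetadataQuietly` is a hypothetical wrapper rather than a function in `SMStateMachine.kt`.

```kotlin
// Hypothetical stand-in for VersionConflictEngineException.
class StubVersionConflictException(message: String) : RuntimeException(message)

// Hypothetical stand-in for SnapshotManagementException.
class StubMetadataIndexingException(cause: Throwable) :
    RuntimeException("Failed to update metadata", cause)

// Simplified analogue of unwrapping: follow cause links to the innermost one.
fun rootCause(ex: Throwable): Throwable {
    var current = ex
    while (true) {
        val cause = current.cause
        if (cause == null || cause === current) return current
        current = cause
    }
}

// Runs the update. Returns true if it was applied and false if a version
// conflict was logged and suppressed. Any other failure is rethrown
// wrapped, which is the path that would still lead to a notification.
fun updateMetadataQuietly(log: MutableList<String>, update: () -> Unit): Boolean {
    try {
        update()
        return true
    } catch (ex: Exception) {
        if (rootCause(ex) is StubVersionConflictException) {
            log += "Version conflict exception while updating metadata."
            return false
        }
        throw StubMetadataIndexingException(ex)
    }
}
```

Passing a lambda that throws a wrapped `StubVersionConflictException` yields `false` and one log entry, while a lambda that throws any other exception surfaces as `StubMetadataIndexingException`. Those are the two behaviors the table of changed files describes as "graceful handling" and "exception propagation".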

#### Changed Files

| File | Changes |
|------|---------|
| `SMStateMachine.kt` | Added exception handling for `VersionConflictEngineException` |
| `SMStateMachineTests.kt` | Added unit tests for graceful handling and exception propagation |

### Migration Notes

No migration required. This is a transparent bugfix that improves notification accuracy.

## Limitations

- The underlying race condition still exists; only the notification behavior is changed
- A future enhancement (noted in TODO) could extract seqNo from the exception and retry the metadata update
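
One possible shape for the retry mentioned in the second limitation is sketched below. It is speculative and not taken from the plugin: `RetryableConflict` is a hypothetical exception that already carries the current seqNo (the real exception would need that value extracted from it), and `writeWithRetry` with its `maxAttempts` parameter is an invented helper.

```kotlin
// Hypothetical conflict type that exposes the seqNo currently stored.
class RetryableConflict(val currentSeqNo: Long) : RuntimeException("version conflict")

// Retries a seqNo-guarded write, taking the next expected seqNo from each
// conflict. Returns the new seqNo, or null after maxAttempts conflicts.
fun writeWithRetry(
    initialSeqNo: Long,
    maxAttempts: Int = 3,
    write: (expectedSeqNo: Long) -> Long
): Long? {
    var expected = initialSeqNo
    repeat(maxAttempts) {
        try {
            return write(expected)
        } catch (ex: RetryableConflict) {
            expected = ex.currentSeqNo // refresh and try again
        }
    }
    return null
}
```

A retry like this only refreshes the version check. A real implementation would presumably also need to re-read the metadata and decide how to merge it, since blindly reapplying a stale update could overwrite state written by the competing operation.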

## Related PRs

| PR | Description |
|----|-------------|
| [#1413](https://github.com/opensearch-project/index-management/pull/1413) | Removed unnecessary user notifications for version conflict exception |

## References

- [Issue #1371](https://github.com/opensearch-project/index-management/issues/1371): Eliminate False Positive Notifications in Manual Snapshot Policy
- [Snapshot Management Documentation](https://docs.opensearch.org/3.0/tuning-your-cluster/availability-and-recovery/snapshots/snapshot-management/)

## Related Feature Report

- [Index Management](../../../features/index-management/index-management.md)

docs/releases/v3.1.0/index.md

Lines changed: 4 additions & 0 deletions
@@ -97,6 +97,10 @@

 - [Flow Framework Dependencies](features/flow-framework/flow-framework-dependencies.md) - Conditional DynamoDB client dependency and data summary with log pattern agent template

+### Index Management
+
+- [Notifications Improvements](features/index-management/notifications-improvements.md) - Fix false positive notifications in Snapshot Management for version conflict exceptions
+
 ### Dashboards Assistant

 - [Dashboard Assistant CI Fixes](features/dashboards-assistant/dashboard-assistant-ci-fixes.md) - Fix CI failures due to path alias babel configuration changes
