4 changes: 4 additions & 0 deletions docs/features/index-management/index-management.md
@@ -180,6 +180,7 @@ PUT _plugins/_rollup/jobs/sample_rollup

| Version | PR | Description |
|---------|-----|-------------|
| v3.1.0 | [#1413](https://github.com/opensearch-project/index-management/pull/1413) | Removed unnecessary user notifications for version conflict exception |
| v3.0.0 | [#1198](https://github.com/opensearch-project/index-management/pull/1198) | Adding unfollow action in ISM for CCR |
| v3.0.0 | [#1377](https://github.com/opensearch-project/index-management/pull/1377) | Target Index Settings for rollup |
| v3.0.0 | [#1388](https://github.com/opensearch-project/index-management/pull/1388) | CVE fix: logback-core upgrade |
@@ -204,8 +205,11 @@ PUT _plugins/_rollup/jobs/sample_rollup
- [Issue #1075](https://github.com/opensearch-project/index-management/issues/1075): ISM listener blocking Cluster Applier thread
- [Issue #1213](https://github.com/opensearch-project/index-management/issues/1213): Feature request for mixed rollup/non-rollup search

- [Issue #1371](https://github.com/opensearch-project/index-management/issues/1371): False positive notifications in Snapshot Management

## Change History

- **v3.1.0** (2026-01-10): Fixed false positive notifications in Snapshot Management by suppressing user notifications for internal VersionConflictEngineException errors
- **v3.0.0** (2025-05-06): Added ISM unfollow action for CCR, rollup target index settings, CVE fixes, Java Agent migration
- **v2.18.0** (2024-11-05): Added `plugins.rollup.search.search_source_indices` setting to allow searching non-rollup and rollup indices together, UX improvements (refresh buttons, section header styling), transform API input validation, fixed snapshot status detection, fixed snapshot policy button reload, fixed data source initialization
- **v2.17.0** (2024-09-17): Performance optimization for skip execution check using cluster service instead of NodesInfoRequest, security integration test fixes
@@ -0,0 +1,92 @@
# Notifications Improvements

## Summary

This bugfix eliminates false positive notifications in Snapshot Management by suppressing user notifications for internal `VersionConflictEngineException` errors that occur during concurrent snapshot operations. Users no longer receive misleading error notifications for race conditions that don't affect actual snapshot functionality.

## Details

### What's New in v3.1.0

The Snapshot Management state machine now handles `VersionConflictEngineException` gracefully by logging the error instead of throwing an exception that triggers user notifications.

### Technical Changes

#### Problem Background

When a manual snapshot policy runs, it creates and deletes snapshots based on configured cron jobs. These operations update state in the `.ism-config` system index. Due to race conditions during concurrent operations:

1. A snapshot deletion starts and holds a lock on the system index
2. Another snapshot creation begins while the deletion is in progress
3. When the deletion completes, its metadata update fails because of a version conflict
4. Previously, this triggered a notification to users despite the snapshot operation succeeding

```mermaid
sequenceDiagram
participant User
participant SM as Snapshot Management
participant ISM as .ism-config Index
participant Notif as Notifications

SM->>ISM: Start snapshot deletion (seqNo=100)
SM->>ISM: Start snapshot creation (seqNo=100)
ISM-->>SM: Creation succeeds (seqNo=101)
ISM-->>SM: Deletion metadata update fails (VersionConflict)
Note over SM: Before: Exception thrown
SM->>Notif: Send error notification
Notif->>User: False alarm notification

Note over SM: After v3.1.0: Exception handled
SM->>SM: Log error, continue
Note over User: No false alarm
```
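
The interleaving in the diagram can be reproduced with a small in-memory model of optimistic concurrency control. This is a minimal sketch, not plugin code: `MetadataStore`, `VersionConflictException`, and `replayRace` are hypothetical names, and the store only stands in for a document in `.ism-config` whose writes are accepted when the caller's sequence number matches the current one.

```kotlin
// Hypothetical in-memory stand-in for a metadata document in .ism-config.
// None of these names come from the index-management plugin.
class VersionConflictException(message: String) : Exception(message)

class MetadataStore(initialSeqNo: Long = 100) {
    var seqNo: Long = initialSeqNo
        private set
    var metadata: String = "initial"
        private set

    // Apply the write only if the caller saw the latest sequence number.
    fun update(expectedSeqNo: Long, newMetadata: String): Long {
        if (expectedSeqNo != seqNo) {
            throw VersionConflictException(
                "expected seqNo=$expectedSeqNo but current seqNo=$seqNo"
            )
        }
        metadata = newMetadata
        seqNo += 1
        return seqNo
    }
}

// Replays the diagram: both operations read seqNo=100, the creation writes
// first, and the deletion's later write is rejected as a version conflict.
fun replayRace(store: MetadataStore): Boolean {
    val seenByDeletion = store.seqNo
    val seenByCreation = store.seqNo
    store.update(seenByCreation, "creation finished")
    return try {
        store.update(seenByDeletion, "deletion finished")
        false
    } catch (ex: VersionConflictException) {
        true // conflict detected; the stored metadata is left untouched
    }
}
```

Calling `replayRace(MetadataStore())` reports a conflict even though neither operation failed at its actual task, which is the situation that used to produce the false alarm.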

#### Code Changes

The fix modifies `SMStateMachine.kt` to catch and handle `VersionConflictEngineException` separately:

```kotlin
} catch (ex: Exception) {
    val unwrappedException = ExceptionsHelper.unwrapCause(ex) as Exception
    if (unwrappedException is VersionConflictEngineException) {
        // Don't throw the exception
        log.error("Version conflict exception while updating metadata.", ex)
        return
    }
    // Other exceptions still trigger notifications
    val smEx = SnapshotManagementException(ExceptionKey.METADATA_INDEXING_FAILURE, ex)
    log.error(smEx.message, ex)
    throw smEx
}
```
```
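
The same control flow can be exercised without OpenSearch on the classpath. The sketch below is an assumption-laden illustration: `VersionConflict`, `MetadataIndexingFailure`, `rootCause`, and `updateMetadataQuietly` are hypothetical stand-ins, the unwrapping is a simplified walk of the cause chain rather than the rules `ExceptionsHelper.unwrapCause` applies, and the function returns a `Boolean` only to make the outcome easy to observe.

```kotlin
// Hypothetical stand-ins; the real classes live in OpenSearch and the plugin.
class VersionConflict(message: String) : RuntimeException(message)
class MetadataIndexingFailure(cause: Throwable) :
    Exception("Failed to update metadata.", cause)

// Simplified unwrapping: follow the cause chain to the innermost exception.
fun rootCause(ex: Throwable): Throwable {
    var current = ex
    var next = current.cause
    while (next != null && next !== current) {
        current = next
        next = current.cause
    }
    return current
}

// Returns true if the update succeeded and false if a version conflict was
// logged and swallowed. Any other failure is wrapped and rethrown, which is
// the path that would still lead to a user notification.
fun updateMetadataQuietly(
    update: () -> Unit,
    log: (String) -> Unit = { message -> println(message) },
): Boolean {
    try {
        update()
        return true
    } catch (ex: Exception) {
        if (rootCause(ex) is VersionConflict) {
            log("Version conflict exception while updating metadata.")
            return false
        }
        throw MetadataIndexingFailure(ex)
    }
}
```

Passing an `update` lambda that throws a wrapped `VersionConflict` yields `false` and one log line, while a lambda that throws any other exception surfaces as `MetadataIndexingFailure`.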

#### Changed Files

| File | Changes |
|------|---------|
| `SMStateMachine.kt` | Added exception handling for `VersionConflictEngineException` |
| `SMStateMachineTests.kt` | Added unit tests for graceful handling and exception propagation |

### Migration Notes

No migration required. This is a transparent bugfix that improves notification accuracy.

## Limitations

- The underlying race condition still exists; only the notification behavior is changed
- A future enhancement (noted in TODO) could extract seqNo from the exception and retry the metadata update

## Related PRs

| PR | Description |
|----|-------------|
| [#1413](https://github.com/opensearch-project/index-management/pull/1413) | Removed unnecessary user notifications for version conflict exception |

## References

- [Issue #1371](https://github.com/opensearch-project/index-management/issues/1371): Eliminate False Positive Notifications in Manual Snapshot Policy
- [Snapshot Management Documentation](https://docs.opensearch.org/3.0/tuning-your-cluster/availability-and-recovery/snapshots/snapshot-management/)

## Related Feature Report

- [Index Management](../../../features/index-management/index-management.md)
4 changes: 4 additions & 0 deletions docs/releases/v3.1.0/index.md
@@ -97,6 +97,10 @@

- [Flow Framework Dependencies](features/flow-framework/flow-framework-dependencies.md) - Conditional DynamoDB client dependency and data summary with log pattern agent template

### Index Management

- [Notifications Improvements](features/index-management/notifications-improvements.md) - Fix false positive notifications in Snapshot Management for version conflict exceptions

### Dashboards Assistant

- [Dashboard Assistant CI Fixes](features/dashboards-assistant/dashboard-assistant-ci-fixes.md) - Fix CI failures due to path alias babel configuration changes