Optimized Indices Recovery Integration Tests#20516

Open
ask-kamal-nayan wants to merge 4 commits into opensearch-project:feature/datafusion from ask-kamal-nayan:newRecoveryITs

@ask-kamal-nayan ask-kamal-nayan commented Feb 1, 2026

Description

This PR adds integration tests for Optimized Indices recovery scenarios. The tests validate format-aware metadata preservation, CatalogSnapshot recovery, and remote store recovery with Parquet format files.

New Test Files (5 files, 17 tests, 3 of them currently skipped)

1. DataFusionClusterRecoveryTests.java (2 tests)

Tests cluster-level recovery scenarios:

  • testDataFusionGatewayRecovery - Full cluster restart (gateway) recovery
  • testDataFusionClusterManagerFailover - Cluster manager failover during operations

2. DataFusionSnapshotRestoreRecoveryTests.java (3 tests - currently skipped)

Tests snapshot/restore operations with Parquet format:

  • testDataFusionSnapshotRestore - Basic snapshot and restore
  • testDataFusionRestoreWithForceMerge - Restore after force merge
  • testDataFusionShallowCopySnapshotRestore - Shallow copy snapshot restore

Note: These tests are marked with @AwaitsFix pending implementation completion.

3. DataFusionRecoveryErrorHandlingTests.java (4 tests)

Tests error handling during recovery:

  • testDataFusionRecoveryWithPrimaryRestart - Recovery during primary restart
  • testDataFusionRecoveryWithMultipleReplicaRestarts - Multiple replica restart cycles
  • testDataFusionRecoveryWithAbruptNodeStop - Abrupt node stop during indexing
  • testDataFusionRecoveryStateTracking - Recovery state progression tracking
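
As a rough illustration of what "recovery state progression tracking" could check, the sketch below validates that a list of observed stages never moves backwards and ends in a terminal stage. The `Stage` enum here is a local stand-in, with names chosen to resemble the recovery stages OpenSearch reports; it is an assumption for illustration and is not the class the PR's tests use. `RecoveryStages` and `isValidProgression` are hypothetical names.

```java
import java.util.List;

// Hypothetical helper, not taken from the PR.
final class RecoveryStages {
    private RecoveryStages() {}

    // Local stand-in enum (assumption); declaration order defines the expected progression.
    enum Stage { INIT, INDEX, VERIFY_INDEX, TRANSLOG, FINALIZE, DONE }

    // Returns true when the observed stages never go backwards and the last one is DONE.
    static boolean isValidProgression(List<Stage> observed) {
        if (observed.isEmpty()) {
            return false;
        }
        for (int i = 1; i < observed.size(); i++) {
            if (observed.get(i).ordinal() < observed.get(i - 1).ordinal()) {
                return false; // a later observation reported an earlier stage
            }
        }
        return observed.get(observed.size() - 1) == Stage.DONE;
    }

    public static void main(String[] args) {
        List<Stage> polled = List.of(Stage.INIT, Stage.INDEX, Stage.INDEX, Stage.TRANSLOG, Stage.DONE);
        System.out.println(isValidProgression(polled)); // prints true
    }
}
```

Repeated stages are allowed on purpose, since a test that polls recovery state may see the same stage more than once.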

4. DataFusionRecoveryDataIntegrityTests.java (4 tests)

Tests data integrity during recovery:

  • testDataFusionNoDuplicateSeqNo - Sequence number integrity after replication
  • testDataFusionReplicaCommitsInfosOnRecovery - Replica commits SegmentInfos with CatalogSnapshot
  • testDataFusionReplicaCleansUpOldCommits - Old Parquet generation cleanup
  • testDataFusionSegmentFileConsistency - FileMetadata format consistency
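
The sequence number integrity check can be pictured as a duplicate scan over the sequence numbers collected from a shard copy. The following is a minimal sketch in plain Java, assuming the sequence numbers have already been read into a list; `SeqNoChecks` and `findDuplicates` are hypothetical names, and how the PR's test actually gathers the values is not shown in the description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not taken from the PR.
final class SeqNoChecks {
    private SeqNoChecks() {}

    // Returns each sequence number that occurs more than once, in order of first repeat.
    static List<Long> findDuplicates(List<Long> seqNos) {
        Set<Long> seen = new HashSet<>();
        Set<Long> reported = new HashSet<>();
        List<Long> duplicates = new ArrayList<>();
        for (Long seqNo : seqNos) {
            // Set.add returns false when the value is already present.
            if (!seen.add(seqNo) && reported.add(seqNo)) {
                duplicates.add(seqNo);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(List.of(0L, 1L, 2L, 2L, 3L))); // prints [2]
    }
}
```

A test in this style would assert that the returned list is empty after replication completes.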

5. DataFusionRecoveryComplexScenariosTests.java (4 tests)

Tests complex/edge case scenarios:

  • testDataFusionRecoveryMultipleIndices - Concurrent recovery of multiple indices
  • testDataFusionRecoveryAllShardsNoRedIndex - Recovery ensuring no red index state
  • testDataFusionRecoveryEmptyIndex - Empty index recovery
  • testDataFusionRecoveryAfterIndexClose - Recovery from remote store after node failure

Key Validations

  • Format-aware metadata preservation (FileMetadata.dataFormat())
  • CatalogSnapshot bytes in RemoteSegmentMetadata
  • Parquet file counts before/after recovery
  • Document count consistency
  • Cluster UUID preservation
  • Remote store segment validation
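
One way to picture the file-count validation is to count files per data format before and after recovery and compare the two tallies. The sketch below assumes the listings are available as a map from file name to a format label; the labels `"parquet"` and `"lucene"`, the file names, and the `FormatCounts` helper are all hypothetical, and the actual return type of `FileMetadata.dataFormat()` is not stated in this description.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not taken from the PR.
final class FormatCounts {
    private FormatCounts() {}

    // Counts how many files carry each format label.
    static Map<String, Integer> countByFormat(Map<String, String> fileToFormat) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String format : fileToFormat.values()) {
            counts.merge(format, 1, Integer::sum);
        }
        return counts;
    }

    // True when both listings contain the same number of files for every format.
    static boolean sameFormatCounts(Map<String, String> before, Map<String, String> after) {
        return countByFormat(before).equals(countByFormat(after));
    }

    public static void main(String[] args) {
        Map<String, String> before = Map.of("a.parquet", "parquet", "b.parquet", "parquet", "_0.si", "lucene");
        Map<String, String> after = Map.of("c.parquet", "parquet", "d.parquet", "parquet", "_1.si", "lucene");
        System.out.println(sameFormatCounts(before, after)); // prints true
    }
}
```

Comparing counts rather than file names tolerates files being renamed across generations, which may or may not match what the PR's tests require.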

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


coderabbitai bot commented Feb 1, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.


github-actions bot commented Feb 1, 2026

❌ Gradle check result for 35c90e3: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ask-kamal-nayan ask-kamal-nayan marked this pull request as ready for review February 2, 2026 05:18
@ask-kamal-nayan ask-kamal-nayan requested a review from a team as a code owner February 2, 2026 05:18
@ask-kamal-nayan ask-kamal-nayan changed the title from "DataFusion Engine Recovery Integration Tests" to "DataFusion Recovery Integration Tests" Feb 2, 2026
@ask-kamal-nayan ask-kamal-nayan changed the title from "DataFusion Recovery Integration Tests" to "Optimized Indices Recovery Integration Tests" Feb 2, 2026
opensearch-trigger-bot bot commented

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added the stalled label (Issues that have stalled) Mar 4, 2026