Allow merges for single dataformat #20894

Merged
mgodwan merged 1 commit into opensearch-project:feature/datafusion from alchemist51:feature/datafusion
Mar 17, 2026

Conversation

@alchemist51
Contributor

Description

[Describe what this change achieves]

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
📝 TODO sections

🔀 No multiple PR themes
⚡ Recommended focus areas for review

TODO Incomplete

The multi-format merge path uses NoMergePolicy.INSTANCE as a temporary workaround with a TODO comment indicating it needs to be replaced. This means merges are completely disabled for multi-format setups, which could lead to unbounded segment growth and performance degradation in production if this code path is reached.

} else {
    // TODO:: Remove this once the Merge is working for multi format setup
    mergePolicy = new CompositeMergePolicy(NoMergePolicy.INSTANCE, shardId);
}

Missing Case Handling

The inner switch statement for nodeMergePolicy in the DEFAULT_POLICY branch does not have a default case. If a new IndexMergePolicy enum value is added in the future, mergePolicyProvider could remain null, causing the assertion to fire at runtime. Consider adding a default case with an explicit error or fallback.

switch (nodeMergePolicy) {
    case TIERED:
    case DEFAULT_POLICY:
        mergePolicyProvider = tieredMergePolicyProvider;
        break;
    case LOG_BYTE_SIZE:
        mergePolicyProvider = logByteSizeMergePolicyProvider;
        break;
}
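
The missing-default concern above can be reproduced in isolation. The following is a minimal sketch, not the code from IndexSettings.java: the enum is a hypothetical stand-in for IndexMergePolicy, FUTURE_POLICY is an invented value simulating a later addition, and the providers are reduced to plain strings.

```java
public class NodeMergePolicyDemo {
    // Hypothetical stand-in for IndexMergePolicy; FUTURE_POLICY is invented
    // here to simulate an enum value added after the switch was written.
    enum IndexMergePolicy { TIERED, LOG_BYTE_SIZE, DEFAULT_POLICY, FUTURE_POLICY }

    // Mirrors the shape of the reviewed switch: no default case,
    // so an unhandled value leaves the provider null.
    static String resolveWithoutDefault(IndexMergePolicy nodeMergePolicy) {
        String mergePolicyProvider = null;
        switch (nodeMergePolicy) {
            case TIERED:
            case DEFAULT_POLICY:
                mergePolicyProvider = "tiered";
                break;
            case LOG_BYTE_SIZE:
                mergePolicyProvider = "log_byte_size";
                break;
        }
        return mergePolicyProvider;
    }

    // Same switch with a default case that fails fast and names the value.
    static String resolveWithDefault(IndexMergePolicy nodeMergePolicy) {
        switch (nodeMergePolicy) {
            case TIERED:
            case DEFAULT_POLICY:
                return "tiered";
            case LOG_BYTE_SIZE:
                return "log_byte_size";
            default:
                throw new IllegalStateException("Unsupported node-scoped merge policy: " + nodeMergePolicy);
        }
    }

    public static void main(String[] args) {
        // The unhandled value silently yields null.
        System.out.println(resolveWithoutDefault(IndexMergePolicy.FUTURE_POLICY));
        try {
            resolveWithDefault(IndexMergePolicy.FUTURE_POLICY);
        } catch (IllegalStateException e) {
            // The failure now identifies the offending value.
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the first variant returns null for the unhandled value, which is the state the assertion in the reviewed code would trip on, while the second variant reports the unsupported value directly.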

@github-actions
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Add default case to prevent null provider

The inner switch on nodeMergePolicy has no default case, so if a new
IndexMergePolicy enum value is added in the future, mergePolicyProvider will remain
null and the subsequent assert will fail at runtime. Adding a default case that
throws an IllegalStateException makes the failure explicit and easier to diagnose.

server/src/main/java/org/opensearch/index/IndexSettings.java [1945-1953]

 switch (nodeMergePolicy) {
     case TIERED:
     case DEFAULT_POLICY:
         mergePolicyProvider = tieredMergePolicyProvider;
         break;
     case LOG_BYTE_SIZE:
         mergePolicyProvider = logByteSizeMergePolicyProvider;
         break;
+    default:
+        throw new IllegalStateException("Unsupported node-scoped merge policy: " + nodeMergePolicy);
 }

Suggestion importance [1-10]: 6

Why: Adding a default case with an explicit exception is a valid defensive programming improvement. Without it, adding a new IndexMergePolicy enum value could leave mergePolicyProvider null, causing the assert to fail with a less informative message. However, the outer switch has the same issue and the assert already provides some safety net.

Low
General
Warn when merging is silently disabled

The NoMergePolicy.INSTANCE path is a temporary workaround that silently disables
merging for multi-format setups. This could lead to unbounded segment growth and
degraded query performance in production if the TODO is not addressed. Consider
throwing an explicit exception or logging a prominent warning to prevent accidental
use in production environments.

server/src/main/java/org/opensearch/index/engine/exec/merge/CompositeMergeHandler.java [46-52]

 List<String> secondaryDataFormatNames = indexSettings.getValue(IndexSettings.INDEX_COMPOSITE_SECONDARY_DATA_FORMATS_SETTING);
 if(secondaryDataFormatNames.isEmpty()) {
     mergePolicy = new CompositeMergePolicy(indexSettings.getMergePolicy(true), shardId);
 } else {
     // TODO:: Remove this once the Merge is working for multi format setup
+    logger.warn("Merging is disabled for multi-format composite index setup on shard [{}]. Segments will not be merged.", shardId);
     mergePolicy = new CompositeMergePolicy(NoMergePolicy.INSTANCE, shardId);
 }

Suggestion importance [1-10]: 5

Why: Adding a warning log when NoMergePolicy.INSTANCE is used is a reasonable improvement to make the temporary workaround more visible, but it's a minor enhancement since the TODO comment already documents the intent and the code path is intentional.

Low
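
The selection logic discussed in this suggestion can be sketched without OpenSearch or Lucene on the classpath. This is an illustration under stated assumptions, not the code in CompositeMergeHandler.java: PolicyKind is a hypothetical stand-in for the two merge policies, java.util.logging replaces the project's logger, and the format name "parquet" is an invented example value.

```java
import java.util.List;
import java.util.logging.Logger;

public class CompositeMergeSelectionDemo {
    private static final Logger LOGGER =
        Logger.getLogger(CompositeMergeSelectionDemo.class.getName());

    // Hypothetical stand-in: CONFIGURED represents the index's configured
    // merge policy, NO_MERGE represents NoMergePolicy.INSTANCE.
    enum PolicyKind { CONFIGURED, NO_MERGE }

    // Chooses the policy the same way the reviewed branch does: merges are
    // allowed only when no secondary data formats are configured.
    static PolicyKind selectPolicy(List<String> secondaryDataFormatNames, String shardId) {
        if (secondaryDataFormatNames.isEmpty()) {
            return PolicyKind.CONFIGURED;
        }
        // Surface the workaround instead of disabling merges silently.
        LOGGER.warning("Merging is disabled for multi-format composite index setup on shard ["
            + shardId + "]. Segments will not be merged.");
        return PolicyKind.NO_MERGE;
    }

    public static void main(String[] args) {
        // Single-format index: merges use the configured policy.
        System.out.println(selectPolicy(List.of(), "[index][0]"));
        // Multi-format index ("parquet" is an invented example name): merges are off.
        System.out.println(selectPolicy(List.of("parquet"), "[index][0]"));
    }
}
```

Logging at selection time makes the disabled-merge state visible in node logs once per handler construction, which is cheaper than failing the shard but still easy to overlook; throwing instead would be the stricter option the suggestion mentions.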

@github-actions
Contributor

❌ Gradle check result for c0d0236: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@mgodwan mgodwan merged commit 2367728 into opensearch-project:feature/datafusion Mar 17, 2026
28 of 52 checks passed
nishchay21 pushed a commit to nishchay21/OpenSearch that referenced this pull request Mar 25, 2026
