Add sandbox plugin for composite indexing execution engine #20909
alchemist51 wants to merge 1 commit into opensearch-project:main
Signed-off-by: Arpit Bandejiya <abandeji@amazon.com> Co-authored-by: Bukhtawar Khan <bukhtawa@amazon.com>
Description
This PR introduces the `composite-engine` sandbox plugin that implements the `CompositeIndexingExecutionEngine`, the orchestration layer for multi-format indexing as described in RFC #20644.

The composite engine enables an index to write documents to multiple storage formats (e.g., Lucene + Parquet) simultaneously through a single `IndexingExecutionEngine` interface. Format plugins register via the `ExtensiblePlugin` SPI, and the composite engine delegates writes, refresh, and file management to each per-format engine.

Note: we use the `ExtensiblePlugin` SPI model temporarily. Once the DataFormat registry is introduced, we should be able to get rid of this model.
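The delegation described above can be illustrated with a minimal sketch. It is not the PR's code: `FormatWriter`, `RecordingWriter`, `FanOutWriter`, and `CompositeWriteSketch` are hypothetical names, and the per-writer row ID counter is an assumed simplification. It only shows the general shape of fanning one document out to a primary writer and then to each secondary writer under a shared row ID.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-format writer; not the PR's actual interface.
interface FormatWriter {
    void addDoc(long rowId, String doc);
}

// Records every call so the fan-out order can be inspected.
final class RecordingWriter implements FormatWriter {
    private final String format;
    private final List<String> log;

    RecordingWriter(String format, List<String> log) {
        this.format = format;
        this.log = log;
    }

    @Override
    public void addDoc(long rowId, String doc) {
        log.add(format + ":" + rowId + ":" + doc);
    }
}

// Sends each document to the primary writer first, then to every secondary.
final class FanOutWriter {
    private final FormatWriter primary;
    private final List<FormatWriter> secondaries;
    private long nextRowId = 0; // assumed scheme: one increasing counter per writer

    FanOutWriter(FormatWriter primary, List<FormatWriter> secondaries) {
        this.primary = primary;
        this.secondaries = List.copyOf(secondaries);
    }

    long addDoc(String doc) {
        long rowId = nextRowId++;
        primary.addDoc(rowId, doc);
        for (FormatWriter secondary : secondaries) {
            secondary.addDoc(rowId, doc);
        }
        return rowId;
    }
}

final class CompositeWriteSketch {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        FanOutWriter writer = new FanOutWriter(
            new RecordingWriter("lucene", log),
            List.of(new RecordingWriter("parquet", log)));
        writer.addDoc("doc-a");
        writer.addDoc("doc-b");
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Because both formats receive the same row ID for a given document, a reader could in principle join the Lucene and Parquet representations of that document within one writer's segment scope.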
What's included
New sandbox plugin: `sandbox/plugins/composite-engine`
- `CompositeEnginePlugin`: `ExtensiblePlugin` entry point that discovers `DataFormatPlugin` implementations at node bootstrap, validates index settings, and creates the composite engine. Registers three index settings:
  - `index.composite.enabled` (default `false`)
  - `index.composite.primary_data_format` (default `"lucene"`)
  - `index.composite.secondary_data_formats` (default `[]`)
- `CompositeIndexingExecutionEngine`: Orchestrates indexing across a primary and zero or more secondary per-format engines. Handles writer creation, refresh (flush all writers → build segments → delegate per-format refresh), file deletion, and document input creation.
- `CompositeDataFormat`: A `DataFormat` wrapper over the constituent formats. Uses `Long.MIN_VALUE` priority so concrete formats take precedence.
- `CompositeDocumentInput`: Broadcasts `addField`, `setRowId`, and other metadata operations to all per-format `DocumentInput` instances. Releases the writer back to the pool on `close()`.
- `CompositeWriter`: Delegates `addDoc`, `flush`, `sync`, and `close` to each per-format writer (primary first, then secondaries). Implements `Lock` for pool checkout semantics.
- `CompositeDataFormatWriterPool`: Thread-safe pool of `CompositeWriter` instances with lock-based checkout/release and a `checkoutAll` for flush.
- `RowIdGenerator`: Generates monotonically increasing row IDs for cross-format document synchronization within a writer's segment scope.

New sandbox lib: `sandbox/libs/composite-engine-lib`
- `ConcurrentQueue`: Striped concurrent queue using thread-affinity hashing to reduce contention across concurrent indexing threads.
- `LockableConcurrentQueue`: Extends `ConcurrentQueue` with `tryLock`-based polling so writers can be checked out without blocking.

How format plugins integrate
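As orientation for the integration steps in this section, here is a rough sketch of how discovered formats could be checked against the three index settings. It is an assumption-laden illustration rather than the plugin's code: `FormatProvider`, `FormatRegistry`, `validate`, and `FormatRegistrySketch` are hypothetical names, the validation rules are guesses, and the real discovery goes through the `ExtensiblePlugin` SPI rather than a hand-built list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for what a format plugin contributes; not the PR's DataFormatPlugin.
interface FormatProvider {
    String formatName();
}

// Holds the formats found at bootstrap and checks index settings against them.
final class FormatRegistry {
    private final Map<String, FormatProvider> byName = new LinkedHashMap<>();

    FormatRegistry(List<FormatProvider> discovered) {
        for (FormatProvider provider : discovered) {
            if (byName.putIfAbsent(provider.formatName(), provider) != null) {
                throw new IllegalArgumentException("duplicate data format: " + provider.formatName());
            }
        }
    }

    // Parameters mirror index.composite.enabled, primary_data_format, and
    // secondary_data_formats; the specific checks are assumptions.
    void validate(boolean enabled, String primary, List<String> secondaries) {
        if (!enabled) {
            return; // composite indexing is off, nothing to check
        }
        if (!byName.containsKey(primary)) {
            throw new IllegalArgumentException("unknown primary data format: " + primary);
        }
        for (String secondary : secondaries) {
            if (secondary.equals(primary)) {
                throw new IllegalArgumentException("secondary format repeats the primary: " + secondary);
            }
            if (!byName.containsKey(secondary)) {
                throw new IllegalArgumentException("unknown secondary data format: " + secondary);
            }
        }
    }
}

final class FormatRegistrySketch {
    static FormatRegistry demoRegistry() {
        // Two illustrative formats, as if contributed by two extending plugins.
        List<FormatProvider> discovered = List.of(() -> "lucene", () -> "parquet");
        return new FormatRegistry(discovered);
    }

    public static void main(String[] args) {
        FormatRegistry registry = demoRegistry();
        registry.validate(true, "lucene", List.of("parquet")); // passes
        try {
            registry.validate(true, "lucene", List.of("orc"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast at settings validation, rather than at first write, would keep an index from being created with a format that no installed plugin provides.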
Format plugins (e.g., Parquet) extend this plugin by:
- Declaring `extendedPlugins = ['composite-engine']` in their `build.gradle`
- Implementing `DataFormatPlugin`

The `ExtensiblePlugin` SPI then discovers them automatically during node bootstrap.

Related issues
Resolves part of #20876
Check List
@ExperimentalApi