Parallelize backups and restore file operations#4023
elangelo wants to merge 9 commits into apache:main
Conversation
…to 1 thread, allow overriding with system properties
epugh
left a comment
I don't have the multi-thread chops to approve this, but reading through it, it looks good. I wanted a change to the variable name. Do we need any new tests for this capability, or do the existing ones cover it well enough?
```java
 * SOLR_BACKUP_MAX_PARALLEL_UPLOADS}.
 */
private static final int DEFAULT_MAX_PARALLEL_UPLOADS =
    EnvUtils.getPropertyAsInteger("solr.backup.maxParallelUploads", 1);
```
The pattern we are using now is dot cased, so solr.backup.maxparalleluploads, or maybe if we had multiple properties solr.backup.paralleluploads.max....
Good use of EnvUtils, we need them everywhere.
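For readers unfamiliar with the pattern being praised here, a minimal sketch of what a sysprop-with-env-fallback lookup like `EnvUtils.getPropertyAsInteger` does. The helper name `resolveIntProperty` and the exact env-name mapping are assumptions for illustration; Solr's real EnvUtils also splits camelCase segments (so `solr.backup.maxParallelUploads` maps to `SOLR_BACKUP_MAX_PARALLEL_UPLOADS`).

```java
import java.util.Locale;

public class PropertySketch {
  // Illustrative stand-in for EnvUtils.getPropertyAsInteger: prefer the
  // Java system property, fall back to a derived environment variable,
  // then the default. The dots->underscores, upper-cased env mapping is
  // a simplification of Solr's actual rule (which also splits camelCase).
  static int resolveIntProperty(String prop, int defaultValue) {
    String v = System.getProperty(prop);
    if (v == null) {
      v = System.getenv(prop.replace('.', '_').toUpperCase(Locale.ROOT));
    }
    return v == null ? defaultValue : Integer.parseInt(v);
  }

  public static void main(String[] args) {
    // With neither the sysprop nor the env var set, the default wins.
    System.out.println(resolveIntProperty("demo.backup.maxParallelUploads", 1));
  }
}
```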
```adoc
Backup and restore operations can transfer multiple index files in parallel to improve throughput, especially when using cloud storage repositories like S3 or GCS where latency is higher.
The parallelism is controlled via system properties or environment variables:

`solr.backup.maxParallelUploads`::
```
solr.backup.maxparalleluploads ?
I think the current tests actually cover everything already. Mind that I did change the gcsrepository and s3repository tests to have some parallelism. Unfortunately I was limited to only 2 threads as with more I got an OutOfMemoryException. But I think it still covers what needs covering.
epugh
left a comment
LGTM. I'd love another committer who is more comfortable with this code base and especially the multithreaded nature of it to review as well.
… would be such a bottleneck
…was referred to by the non-canonical name `ExecutorUtil.MDCAwareThreadPoolExecutor.CallerRunsPolicy`
This PR has had no activity for 60 days and is now labeled as stale. Any new activity will remove the stale label. To attract more reviewers, please tag people who might be familiar with the code area and/or notify the dev@solr.apache.org mailing list. To exempt this PR from being marked as stale, make it a draft PR or add the label "exempt-stale". If left unattended, this PR will be closed after another 60 days of inactivity. Thank you for your contribution!
Pull request overview
This PR adds configurable parallelism to Solr’s backup (incremental shard backup) and restore (core restore) file-transfer loops to improve throughput, especially for higher-latency cloud repositories (e.g., S3/GCS).
Changes:
- Add parallel upload/download execution for index file transfers during backup and restore, gated by new sysprop/env settings.
- Document the new parallel transfer settings in the ref guide.
- Update S3/GCS incremental backup tests to enable parallelism and add an unreleased changelog entry.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| solr/solr-ref-guide/modules/deployment-guide/pages/backup-restore.adoc | Documents new parallel upload/download properties and tuning guidance. |
| solr/modules/s3-repository/src/test/org/apache/solr/s3/S3IncrementalBackupTest.java | Enables parallel backup/restore via sysprops for S3 incremental backup tests. |
| solr/modules/gcs-repository/src/test/org/apache/solr/gcs/GCSIncrementalBackupTest.java | Enables parallel backup/restore via sysprops for GCS incremental backup tests. |
| solr/core/src/java/org/apache/solr/handler/RestoreCore.java | Parallelizes restore file copy/download work via an executor and aggregates errors. |
| solr/core/src/java/org/apache/solr/handler/IncrementalShardBackup.java | Parallelizes incremental backup upload work, makes stats thread-safe, aggregates errors. |
| changelog/unreleased/parallelizebackups.yml | Adds changelog entry for the feature. |
```yaml
authors:
  - name: Samuel Verstraete
    github: elangelo
```
The changelog author metadata uses a github field, but this repository’s changelog format documentation uses nick (optionally with url) under authors. Using an unexpected key may fail changelog validation or omit author info; please switch github: elangelo to nick: elangelo (and add url if desired).
```java
60L,
TimeUnit.SECONDS,
new SynchronousQueue<>(),
new SolrNamedThreadFactory("RestoreCore"),
new ThreadPoolExecutor.CallerRunsPolicy())
```
Using SynchronousQueue with CallerRunsPolicy means once all maxParallelDownloads threads are busy, additional downloads will execute on the calling thread. That can exceed the configured cap (up to maxParallelDownloads + 1 concurrent transfers) and also bypass the MDCAwareThreadPoolExecutor wrapping for those caller-run tasks. Consider a bounded queue/fixed pool or explicitly limiting in-flight submissions to enforce the configured parallelism.
The CallerRunsPolicy fallback does mean the submitting thread can run a task when the pool is saturated, but the submitting thread is the Solr request thread — it already carries full MDC context, so there's no MDC loss here. MDCAwareThreadPoolExecutor exists to propagate MDC to new pool threads; the caller-runs case doesn't need that propagation. On the cap concern: the maxParallel* setting is a throughput knob, not a hard safety limit. An occasional N+1 concurrent transfer when the pool is fully busy is negligible for a backup/restore workload.
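For comparison, a minimal sketch of the reviewer's alternative: instead of letting CallerRunsPolicy run overflow tasks on the submitting thread, a semaphore permit is taken before each submit and released when the task finishes, so the submitter blocks and the configured cap is a hard limit. This is an illustration, not the PR's actual code; the helper `runBounded` and the sleep-based "transfer" are made up for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSubmitSketch {

  // Runs taskCount dummy "transfers" with at most maxInFlight in flight,
  // returning the peak observed concurrency so the cap can be verified.
  static int runBounded(int taskCount, int maxInFlight) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
    Semaphore permits = new Semaphore(maxInFlight);
    AtomicInteger active = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (int i = 0; i < taskCount; i++) {
        permits.acquire(); // blocks the submitter once maxInFlight tasks are in flight
        futures.add(pool.submit(() -> {
          try {
            int now = active.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.sleep(20); // stand-in for one file transfer
            active.decrementAndGet();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            permits.release();
          }
        }));
      }
      for (Future<?> f : futures) {
        try {
          f.get();
        } catch (ExecutionException e) {
          throw new RuntimeException(e.getCause());
        }
      }
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("peak concurrency: " + runBounded(8, 2));
  }
}
```

The trade-off discussed above still applies: blocking the submitter enforces the cap exactly, while CallerRunsPolicy trades an occasional N+1 transfer for never parking the request thread on a semaphore.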
```java
Throwable cause = e.getCause();
// Unwrap RuntimeExceptions that wrap the original IOException
if (cause instanceof RuntimeException && cause.getCause() != null) {
  firstError = cause.getCause();
} else {
```
In the parallel path, the RuntimeException unwrapping (firstError = cause.getCause()) can discard the wrapper message that contains the filename. Prefer preserving that message (or re-wrapping the cause with filename context) when propagating the first failure.
```java
60L,
TimeUnit.SECONDS,
new SynchronousQueue<>(),
new SolrNamedThreadFactory("IncrementalBackup"),
new ThreadPoolExecutor.CallerRunsPolicy())
```
With SynchronousQueue + CallerRunsPolicy, once maxParallelUploads threads are busy, additional uploads will run on the submitting thread. That can exceed the configured cap (up to maxParallelUploads + 1 concurrent uploads) and bypass MDCAwareThreadPoolExecutor wrapping for caller-run tasks. Consider a bounded queue/fixed pool or explicitly bounding in-flight submissions to enforce the configured limit.
```java
List<Future<?>> uploadFutures = new ArrayList<>();

try {
```
This implementation queues a Future for every index file and holds them in uploadFutures until the end. For large indexes this can create significant memory overhead and delays error reporting. Consider processing completed tasks as they finish (e.g., ExecutorCompletionService) and/or limiting in-flight submissions to maxParallelUploads.
this really doesn't hold. we need to wait for all futures anyway, so storing them in a list is what we need to do here
```java
} catch (RuntimeException e) {
  if (e.getCause() instanceof IOException) {
    throw (IOException) e.getCause();
  }
  throw e;
```
In the synchronous path, rethrowing only e.getCause() (when it’s an IOException) loses the wrapper message that includes the filename ("Failed to process file: ..."). Preserve that per-file context when propagating errors so backup failures are diagnosable.
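A later fixup commit in this PR addresses exactly this by replacing the cast with `new IOException(msg, cause)`. A minimal sketch of that fix; the helper name `preserveContext` and the sample filename `_0.cfs` are made up for illustration.

```java
import java.io.IOException;

public class UnwrapSketch {
  // Instead of casting the cause back to IOException (which drops the
  // wrapper's "Failed to process file: ..." message), re-wrap the cause
  // so both the per-file message and the original cause chain survive.
  static IOException preserveContext(RuntimeException wrapper) {
    if (wrapper.getCause() instanceof IOException) {
      return new IOException(wrapper.getMessage(), wrapper.getCause());
    }
    throw wrapper;
  }

  public static void main(String[] args) {
    RuntimeException e =
        new RuntimeException("Failed to process file: _0.cfs",
            new IOException("connection reset"));
    IOException io = preserveContext(e);
    // Both the filename context and the underlying cause are retained.
    System.out.println(io.getMessage() + " <- " + io.getCause().getMessage());
  }
}
```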
```java
} catch (RuntimeException e) {
  if (e.getCause() instanceof IOException) {
    throw (IOException) e.getCause();
  }
  throw e;
```
In the synchronous path, unwrapping and rethrowing only e.getCause() drops the wrapper message that includes per-file context (filename). Preserve the wrapper message (or re-wrap the cause including the filename) so restore failures remain actionable.
```java
List<Future<?>> downloadFutures = new ArrayList<>();
```
This submits one task per index file and retains every Future in downloadFutures until the end. For large collections with many segment files, that can add substantial memory/GC overhead and delays surfacing failures until all tasks are submitted. Consider processing completions incrementally (e.g., ExecutorCompletionService) and/or bounding the number of in-flight tasks to maxParallelDownloads.
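A minimal sketch of the `ExecutorCompletionService` alternative suggested here: results are consumed in completion order, so the first failure surfaces as soon as any transfer fails rather than only when its Future is reached in submission order. This is an illustration under the reviewer's suggestion, not the PR's code; `copyAll` and the `Callable<Void>` transfer shape are assumptions.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionSketch {

  // Runs every transfer, surfacing the first failure in completion order.
  static void copyAll(List<Callable<Void>> transfers, int parallelism)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    CompletionService<Void> done = new ExecutorCompletionService<>(pool);
    try {
      for (Callable<Void> t : transfers) {
        done.submit(t);
      }
      for (int i = 0; i < transfers.size(); i++) {
        try {
          done.take().get(); // yields tasks as they finish, not as submitted
        } catch (ExecutionException e) {
          if (e.getCause() instanceof IOException) {
            throw (IOException) e.getCause();
          }
          throw new RuntimeException(e.getCause());
        }
      }
    } finally {
      pool.shutdownNow(); // abandon remaining transfers once a failure propagates
    }
  }

  public static void main(String[] args) throws Exception {
    List<Callable<Void>> ok = List.of(() -> null, () -> null);
    copyAll(ok, 2);
    System.out.println("all transfers completed");
  }
}
```

Note that this changes failure timing, not peak memory by much: as the author points out elsewhere in the thread, every task must still be joined before the operation can report success.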
```java
Throwable cause = e.getCause();
// Unwrap RuntimeExceptions that wrap the original IOException
if (cause instanceof RuntimeException && cause.getCause() != null) {
  firstError = cause.getCause();
} else {
```
In the parallel join logic, unwrapping RuntimeException to cause.getCause() can discard the wrapper message that includes the filename. Preserve the wrapper message (or re-wrap the underlying IOException with file context) when surfacing the first failure from future.get().
- Replace unsafe IOException cast with `new IOException(msg, cause)` to preserve the original cause chain in IncrementalShardBackup and RestoreCore
- Simplify ExecutionException handling by removing unnecessary RuntimeException unwrapping; directly assign `e.getCause()` as the first error
- Fix changelog entry: rename `github` field to `nick` for author metadata
Description
This PR ensures multiple threads are used to create backups and to restore backups. This gives a considerable speedup when using cloud storage such as S3.
For comparison: a backup to S3 of 1.8 TiB takes roughly 16 minutes with this code, while a 340 GiB collection on the old code takes roughly 50 minutes.
Restoring the same collection took 7 minutes instead of 1 hour and 20 minutes (on a 6-node cluster).
Solution
As the previous implementation already had a loop over all files that needed to be backed up to the backup repository, I simply wrapped that loop in a ThreadPoolExecutor.
Tests
I have run this code locally on a SolrCloud cluster.
Checklist
Please review the following and check all that apply:
- I have created this PR against the main branch.
- I have run ./gradlew check.