Conversation

@BryceKan3
Contributor

Description

Currently, as part of DirectoryReader.open(), Lucene sequentially creates a SegmentReader for each segment.

This can be very slow because SegmentReader creation is I/O heavy. By adding support for an ExecutorService to be passed into DirectoryReader.open(), we can submit the segment reader creations to the thread pool and significantly reduce DirectoryReader.open() times. The implementation is fully backwards compatible and lets users pass in their own executor service.

I have tested the changes and validated the performance improvement that parallelism makes possible when opening directory readers.

Optimization                              P50 (ms)   P90 (ms)   P99 (ms)   P50 Reduction %
Baseline                                  995        1020       1041       N/A
Concurrent SegmentReader Initialization   171        178        188        82.81%

Fixes #15387
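
For context, here is a minimal usage sketch (hypothetical example: it assumes the new overload simply takes the ExecutorService alongside the existing open() arguments, and the pool size, thread name, and index path are placeholders):

import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.NamedThreadFactory;

public class ConcurrentOpenExample {
  public static void main(String[] args) throws Exception {
    // caller-owned pool; Lucene only borrows it to open segment readers in parallel
    ExecutorService executor =
        Executors.newFixedThreadPool(4, new NamedThreadFactory("open-segments"));
    try (Directory dir = FSDirectory.open(Path.of("/path/to/index"));
        // hypothetical overload added by this PR
        DirectoryReader reader = DirectoryReader.open(dir, executor)) {
      System.out.println("maxDoc=" + reader.maxDoc());
    } finally {
      executor.shutdown();
    }
  }
}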

@github-actions github-actions bot added this to the 11.0.0 milestone Nov 14, 2025
Contributor

@vigyasharma vigyasharma left a comment

Thanks for these changes, Bryce. I'm curious about the use cases where opening segments becomes a bottleneck worth parallelizing. Have you seen some in production?

createMultiSegmentIndex(dir, 5);

ExecutorService executor =
    Executors.newFixedThreadPool(1, new NamedThreadFactory("TestDirectoryReader"));
Contributor

Any reason to only test with 1 thread in these tests? Can we add more threads to test for edge cases and races, such as when only one thread sees an exception while opening a segment reader?

Contributor Author

Hey, thanks for the review! Yes, we have seen cases where this is a bottleneck. I have added a test case that covers the scenario where a single thread observes the failure.

@vigyasharma
Contributor

I was wondering if this concurrency would also benefit the openIfChanged calls? But looks like the benefit comes from doing segment reading work in parallel, and for openIfChanged calls, we try to use old readers as much as possible?

Also curious if you tried using virtual threads for the executor in this use-case, and if they helped.

// parallelize segment reader initialization
futures.add(
(executor)
.submit(

Have you considered using the existing TaskExecutor pattern and implementing this as follows:

TaskExecutor taskExecutor = new TaskExecutor(executorService);
List<Callable<SegmentReader>> tasks = new ArrayList<>(sis.size());

for (int i = 0; i < sis.size(); i++) {
  final int index = i;
  tasks.add(
      () ->
          new SegmentReader(
              sis.info(index), sis.getIndexCreatedVersionMajor(), IOContext.DEFAULT));
}
List<SegmentReader> readers = taskExecutor.invokeAll(tasks);

Based on the comments in that class, there are some optimizations we could inherit here:

// try to execute as many tasks as possible on the current thread to minimize context
// switching in case of long running concurrent
// tasks as well as dead-locking if the current thread is part of #executor for executors that
// have limited or no parallelism

Contributor Author

Hey, yeah, I took a look at this. I went with an ExecutorService here directly to ensure we can gracefully close out the readers in the event of an exception. With TaskExecutor, the exception would bubble up and we would not be able to close the already-created SegmentReaders.

@mikemccand
Member

I was wondering if this concurrency would also benefit the openIfChanged calls? But looks like the benefit comes from doing segment reading work in parallel, and for openIfChanged calls, we try to use old readers as much as possible?

+1

But, if there are enough new segments (e.g. it's been a long time since the last openIfChanged), then concurrency might still be helpful.

@BryceKan3
Contributor Author

I was wondering if this concurrency would also benefit the openIfChanged calls? But looks like the benefit comes from doing segment reading work in parallel, and for openIfChanged calls, we try to use old readers as much as possible?

+1

But, if there are enough new segments (e.g. it's been a long time since the last openIfChanged), then concurrency might still be helpful.

Yeah, that makes sense. I will add this support to the corresponding openIfChanged APIs so we can get the benefits there as well.

@vigyasharma
Contributor

Would it make sense to use virtual threads and create a local executor internally, instead of making the callers pass one? Maybe create a helper function that opens and returns a SegmentReader[] array from a provided SegmentInfos...

how about something like –

// pass a null array when there are no old readers?
private static SegmentReader[] createSegmentReaders(SegmentInfos sis, SegmentReader[] oldReaders) {
  final SegmentReader[] readers = new SegmentReader[sis.size()];
  try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<SegmentReader>> futures = new ArrayList<>();
    for (int i = sis.size() - 1; i >= 0; i--) {
      // TODO: add check for cases where old reader should be reused?
      final int index = i;
      futures.add(
          executor.submit(
              () ->
                  new SegmentReader(
                      sis.info(index),
                      sis.getIndexCreatedVersionMajor(),
                      IOContext.DEFAULT)));
    }
    RuntimeException firstException = null;
    for (int i = 0; i < futures.size(); i++) {
      try {
        readers[sis.size() - 1 - i] = futures.get(i).get();
      } catch (ExecutionException | InterruptedException e) {
        // If a reader fails to open we still drain the remaining futures
        // so that the readers that were created can be closed before rethrowing
        if (firstException == null) firstException = new RuntimeException(e);
      }
    }
    if (firstException != null) {
      // close whatever did open, then propagate the first failure
      IOUtils.closeWhileHandlingException(readers);
      throw firstException;
    }
    return readers;
  }
}

@BryceKan3
Contributor Author

BryceKan3 commented Nov 18, 2025

Would it make sense to use virtual threads and create a local executor internally, instead of making the callers pass one? Maybe create a helper function that opens and returns a SegmentReader[] array from a provided SegmentInfos...


I just did some testing with passing in Executors.newVirtualThreadPerTaskExecutor() as the executor service and was still able to see the benefits. After a deeper dive into virtual threads, I have some concerns about making them the default for everything, since there are issues around pinning and mmap that could affect Lucene users. With the ability to pass in an executor, users who want virtual threads can still pass one in; otherwise their behavior remains unchanged. This keeps full backwards compatibility while adding a new API that can be used with virtual threads. Let me know your thoughts @vigyasharma
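
For example, a caller who does want virtual threads can opt in explicitly (fragment; assumes the same hypothetical open(Directory, ExecutorService) overload as in the description):

// the caller chooses virtual threads; Lucene itself never creates them
try (ExecutorService vthreads = Executors.newVirtualThreadPerTaskExecutor();
    Directory dir = FSDirectory.open(Path.of("/path/to/index"));
    DirectoryReader reader = DirectoryReader.open(dir, vthreads)) {
  // search as usual; resources are closed in reverse order
}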

@jainankitk
Contributor

After a deeper dive into virtual threads, I have some concerns about making them the default for everything, since there are issues around pinning and mmap that could affect Lucene users.

Yeah, there are scenarios with virtual threads around mmap/pinning that make them less effective for file I/O use cases in Lucene. That's the same reason we are looking at building better I/O concurrency in OpenSearch (opensearch-project/OpenSearch#18841); otherwise it could have been delegated to virtual threads in Java!

readers[i] =
    new SegmentReader(
        sis.info(i), sis.getIndexCreatedVersionMajor(), IOContext.DEFAULT);
}
Contributor

Right now, we are not using the caller thread to create segment readers. I am wondering if we can/should do that, similar to this PR: #13472. Although that makes the code change slightly more complex, it might add slightly better concurrency.
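
For illustration, a rough, hypothetical sketch of that idea (not the #13472 code): all but one segment go to the executor and the remaining one is opened on the calling thread. Error handling is omitted for brevity, and the method is assumed to sit next to the PR's code in org.apache.lucene.index.

static SegmentReader[] openWithCallerThread(SegmentInfos sis, ExecutorService executor)
    throws Exception {
  SegmentReader[] readers = new SegmentReader[sis.size()];
  List<Future<SegmentReader>> futures = new ArrayList<>();
  for (int i = 1; i < sis.size(); i++) {
    final int index = i;
    futures.add(
        executor.submit(
            () ->
                new SegmentReader(
                    sis.info(index), sis.getIndexCreatedVersionMajor(), IOContext.DEFAULT)));
  }
  if (sis.size() > 0) {
    // the calling thread opens a segment itself instead of just waiting on the futures
    readers[0] =
        new SegmentReader(sis.info(0), sis.getIndexCreatedVersionMajor(), IOContext.DEFAULT);
  }
  for (int i = 1; i < sis.size(); i++) {
    readers[i] = futures.get(i - 1).get();
  }
  return readers;
}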

Contributor Author

That's an interesting optimization! I would like to keep the scope of this PR to implementing the base concurrency with the ExecutorService; improvements like this one can be added as follow-ups. Let me know your thoughts!


* GITHUB#15187: Restrict visibility of PerFieldKnnVectorsFormat.FieldsReader (Simon Cooper)

* GITHUB#15428: Add support for ExecutorServices to be passed into DirectoryReader.open() API (Bryce Kane)
Contributor

Aren't we targeting this change for the Lucene 10.4 release? If so, we should move the entry to the 10.4 section.

Contributor Author

Will update!

@vigyasharma
Contributor

After a deeper dive into virtual threads, I have some concerns about making them the default for everything, since there are issues around pinning and mmap that could affect Lucene users.

Yes, I was thinking about that too. It's a frustrating limitation with virtual threads; we can't always know whether the underlying I/O in Lucene will use FFM, so we're unable to use them.

Thinking out loud here: a thread getting pinned means it doesn't unmount, which is the same as using a platform thread for the entire task? The only difference is that there is no "pooling" for virtual threads. You create a lot of them (one per task). This is a problem if the executor lives for long and can schedule many virtual threads that get pinned - eating up all the available platform threads. I would assume that in this case, the thread pins only until the reader is opened (which is not long)? And doesn't stay pinned until the reader is eventually closed?!

Anyway, I agree that letting users pass an executor is the safer option here. This is a frequently invoked public API and we don't want to change default behavior in ways that could break things for some users.

Maybe we can write it such that it accepts an executor and passes it to the helper function, so that we can use it for both open and openIfChanged?

@uschindler
Contributor

Hi,

Would it make sense to use virtual threads and create a local executor internally, instead of making the callers pass one? Maybe create a helper function that opens and returns a SegmentReader[] array from a provided SegmentInfos...

We don't want to create threads in Lucene "automatically". Of course a virtual thread would not consume resources, but it is not helpful here. Lucene does not hand threads back to the VM because it does not do "native" I/O. It primarily uses MMap, and when there are no syscalls, the virtual threads won't run in parallel. Please remember: Lucene's work here is CPU-bound and there is no waiting time (e.g., for resources or similar), so virtual threads don't help. You need real threads here, so an Executor is the only way to go.

Uwe

@uschindler
Contributor

What is the problem with allowing the user to pass an Executor? Lucene should not do anything "automagically". If somebody uses NIOFSDir, they can pass virtual threads. Anybody calling Lucene should have some thread pools lying around anyway, so only use an externally provided thread pool.

@vigyasharma
Contributor

What is the problem with allowing the user to pass an Executor?

Yes, we aligned above on having users pass an Executor with an externally provided thread pool. It is the right, safe way to make this change.

@uschindler
Contributor

P.S.: You may still see a small improvement when using virtual threads, but this is because the virtual threads can use the "waiting" time when opening files or listing directory contents. At those places the JVM invokes the callback into the virtual thread management.

@github-actions github-actions bot modified the milestones: 11.0.0, 10.4.0 Nov 19, 2025
@BryceKan3
Contributor Author

@vigyasharma I have pulled the SegmentReader creation logic into a helper function that is used now by both the open and openIfChanged APIs. Additionally I have added support for an ExecutorService to be passed into openIfChanged as well. Let me know your thoughts!
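
A sketch of how a refresh loop might use this (hypothetical fragment; it assumes the ExecutorService is passed as an extra argument to openIfChanged, mirroring open):

// reopen with the caller-provided executor; openIfChanged returns null when nothing changed
DirectoryReader newReader = DirectoryReader.openIfChanged(currentReader, executor);
if (newReader != null) {
  currentReader.close();
  currentReader = newReader;
}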

@dianjifzm

Hello everyone,

I've also been looking forward to the parallel initialization capability for SegmentReader. In fact, I've been anticipating this for several years. It's great to see the discussion sparked by the issue submitted by BryceKan3 today.

Let me explain the application scenario: normally, SegmentReader initialization is expected to be fast. However, since Lucene supports StoredFieldsFormat extensions, we utilized StoredFieldsFormat to compress and store all forward column data in memory. This significantly improved I/O performance (about a 30% boost), but it made SegmentReader initialization extremely slow, taking over 10 minutes. This only affects startup time and does not impact subsequent segment update mechanisms or other operations, which is why we've been using it for years.

Another suggestion: using Executors.newFixedThreadPool is not ideal. It would be better to use ForkJoinPool.commonPool(), as the number of threads matches the number of CPUs, fully utilizing the CPU and avoiding excessive context switching.

From my understanding, there's no need to consider openIfChanged or virtual threads. SegmentReader only needs to be initialized in parallel during the startup phase; once initialized, it no longer needs other asynchronous behavior, which minimizes risk. The simplest way is to use Arrays.parallelSetAll to initialize the SegmentReader[] readers, as sketched below.
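
For reference, a minimal sketch of that Arrays.parallelSetAll idea (hypothetical; it assumes the code sits alongside the PR's changes, e.g. in StandardDirectoryReader, and note that parallelSetAll runs on the common ForkJoinPool by default):

SegmentReader[] readers = new SegmentReader[sis.size()];
Arrays.parallelSetAll(
    readers,
    i -> {
      try {
        return new SegmentReader(
            sis.info(i), sis.getIndexCreatedVersionMajor(), IOContext.DEFAULT);
      } catch (IOException e) {
        // the SegmentReader constructor throws IOException, which the IntFunction
        // passed to parallelSetAll cannot declare, so it has to be wrapped
        throw new UncheckedIOException(e);
      }
    });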

@BryceKan3
Contributor Author

Hey dianjifzm,

Thanks so much for the feedback! Very happy to hear that this feature will be something that you will benefit from as well.

ForkJoinPool is an ExecutorService, so with these changes you will be able to pass commonPool() in directly as your ExecutorService if you want! Some clients need fine-grained threading control and usually manage thread pools themselves; by allowing an ExecutorService to be passed in, those users can benefit from this change too. As for Arrays.parallelSetAll, it uses the common ForkJoinPool, and we don't want to do that by default because it can cause unexpected resource contention when the client is managing their own threads.

As for openIfChanged(): this method does create SegmentReaders, and as Mike and Vigya mentioned above, there can be scenarios with enough new segments where that API benefits from this change as well.

SegmentReader initialization taking 10 minutes seems extremely slow! In my profiling I found most of the overhead is in checksumming files; curious if that is the case for you as well... I wonder if there is room for additional optimization in those code paths.

Contributor

@vigyasharma vigyasharma left a comment

Looks good @BryceKan3, thanks for addressing all the suggestions and generalizing it to openIfChanged. I have some minor comments; looks good to ship otherwise.

* @return DirectoryReader that covers entire index plus all changes made so far by this
* IndexWriter instance, or null if there are no new changes
* @param writer The IndexWriter to open from
* @param executorService Provides intra-open concurrency
Contributor

Minor: "intra-open concurrency" feels a little hard to understand. I have context on this change so I think I know what this means, but let's simplify it for the javadoc? How about something like:

/**
* ...
* @param executorService used to open segment readers in parallel 
*/

SegmentInfos sis =
SegmentInfos.readCommit(directory, segmentFileName, minSupportedMajorVersion);
final SegmentReader[] readers = new SegmentReader[sis.size()];
SegmentReader[] readers = new SegmentReader[sis.size()];
Contributor

Do we need to allocate this array here anymore? Looks like createSegmentReader allocates its own array?

@BryceKan3
Contributor Author

Thanks @vigyasharma for the review, I have addressed your comments.

@vigyasharma vigyasharma merged commit 7d4e224 into apache:main Nov 21, 2025
12 checks passed
@vigyasharma
Contributor

Thanks @BryceKan3! I tried backporting it for 10.4 but ran into some conflicts. Would you be able to take a look at these conflicts and create another backport PR against branch_10x?

@BryceKan3
Contributor Author

Thanks @vigyasharma - so to confirm, it is currently in 11.x but needs to be backported into branch_10x to go out with 10.4?

@vigyasharma
Contributor

It is merged into the main branch. Needs to additionally be backported to branch_10x to be released with the next 10.x minor version.

BryceKan3 added a commit to BryceKan3/lucene that referenced this pull request Nov 22, 2025
…en() to enable concurrent segment reader initialization (apache#15428)
@BryceKan3
Contributor Author

Created #15445 for the backport

Successfully merging this pull request may close these issues.

[RFC] Add support for Concurrent SegmentReader Initialization in DirectoryReader.open()