Elasticsearch: Async handling of indexing/deletion requests by tobias-hotz · Pull Request #8465 · geonetwork/core-geonetwork

tobias-hotz · 2024-10-24T13:38:40Z

Currently, indexing is batched through a global queue of 200 elements (except when forceRefreshReaders is true). When the threshold is reached, the entries are submitted, and the thread submitting the 200th element waits until Elasticsearch returns the result of the request.
For deletion, we currently send one request per deletion and always use the deleteByQuery method.

The current design has a number of flaws:

When multiple threads index metadata and none of them uses forceRefreshReaders, the queue can grow past 200 (e.g., two threads add at about the same time, so that by the time the queue size is checked, the queue already has 201 elements). The queue then grows indefinitely, because no further entries are submitted before an explicit call to sendDocumentsToIndex. This can cause out-of-memory errors (we observed this when using multiple reindexing threads).
An indexing thread spends a significant amount of time waiting for Elasticsearch, while it could already prepare the next batch for ES to consume.
As we wait after every delete call, deleting many entries takes much more time than needed.
Because deletion always uses deleteByQuery (even when we could delete by UUID), no batching is possible.
When multiple threads index metadata and none of them uses forceRefreshReaders, the queue contains mixed entries of Thread 1 and Thread 2. At most call sites, sendDocumentsToIndex is called after all entries have been indexed. It is possible that Thread 1 is still sending a batch that contains entries from Thread 2 while Thread 2 has already finished submitting the rest of its entries. Thread 2 then assumes that all its entries have been sent to Elasticsearch, although some are still being submitted by Thread 1. This is a very rare problem, though.
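
The first flaw can be replayed deterministically. The following is a minimal sketch, not the GeoNetwork code: `GlobalQueueSketch` and all of its members are hypothetical names, and it assumes the old size check is an exact equality, which the description above implies but does not show. Two simulated threads each enqueue a document before either of them runs its size check.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the old global queue; names are illustrative only. */
class GlobalQueueSketch {
    static final int THRESHOLD = 200;
    final List<String> pending = new ArrayList<>();

    /** Step 1 of indexing a document: enqueue it. */
    void add(String doc) {
        pending.add(doc);
    }

    /** Step 2 (assumed old behaviour): flush only when the size hits the threshold exactly. */
    void flushIfExactlyAtThreshold() {
        if (pending.size() == THRESHOLD) {
            pending.clear();
        }
    }

    /** Defensive variant: flush whenever the size is at or above the threshold. */
    void flushIfAtOrAboveThreshold() {
        if (pending.size() >= THRESHOLD) {
            pending.clear();
        }
    }

    /** Replays the interleaving in which both threads add before either one checks. */
    static int pendingAfterInterleaving(boolean exactCheck) {
        GlobalQueueSketch queue = new GlobalQueueSketch();
        for (int i = 0; i < THRESHOLD - 1; i++) {
            queue.add("doc-" + i);          // 199 documents already queued
        }
        queue.add("from-thread-1");         // size is now 200
        queue.add("from-thread-2");         // size is 201 before any check has run
        for (int thread = 0; thread < 2; thread++) {
            if (exactCheck) {
                queue.flushIfExactlyAtThreshold();
            } else {
                queue.flushIfAtOrAboveThreshold();
            }
        }
        return queue.pending.size();
    }
}
```

With the exact check, neither thread ever sees a size of 200, so nothing is flushed and the queue keeps growing. A `>=` check would avoid the unbounded growth, but it would not address the blocking wait or the mixed-ownership problem, which is why this PR moves to per-caller queues instead.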

This PR solves all of these problems. The main takeaway is that it significantly improves the performance of deleting and indexing many entries.

This is accomplished by introducing IIndexSubmitter and IDeletionSubmitter. These new classes handle how new entries are sent to the index. The direct implementations (DirectIndexSubmitter and DirectDeletionSubmitter) are similar to how the old forceIndexChanges parameter worked, in that they send the data to the index directly.
With the BatchingIndexSubmitter and BatchingDeletionSubmitter, chunks are sent to Elasticsearch periodically (just as before), but a local queue is used, and we do not wait for Elasticsearch. The index responses are instead handled asynchronously on a different thread. We still guarantee that indexing is complete once the whole block is done, as the close method sends the rest of the local queue and waits for all async responses to complete.
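
The batching behaviour described here can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this PR: `BatchingSubmitterSketch`, its `submit` method and the `sendBulk` function are hypothetical, and the asynchronous bulk request is simulated with `CompletableFuture.supplyAsync` rather than a real Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical batching submitter; class name, methods and types are assumptions. */
class BatchingSubmitterSketch implements AutoCloseable {
    private final int batchSize;
    // Stand-in for an async bulk request; completes with the number of accepted documents.
    private final Function<List<String>, CompletableFuture<Integer>> sendBulk;
    // Owned by a single caller, so no other thread can mix its entries into this queue.
    private final List<String> localQueue = new ArrayList<>();
    private final List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
    private int acknowledged = 0;

    BatchingSubmitterSketch(int batchSize,
                            Function<List<String>, CompletableFuture<Integer>> sendBulk) {
        this.batchSize = batchSize;
        this.sendBulk = sendBulk;
    }

    /** Queues a document and sends a batch once the local queue is full. */
    void submit(String document) {
        localQueue.add(document);
        if (localQueue.size() >= batchSize) {
            sendBatch();
        }
    }

    /** Hands the current batch to the index without waiting for the response. */
    private void sendBatch() {
        if (localQueue.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(localQueue);
        localQueue.clear();
        inFlight.add(sendBulk.apply(batch));
    }

    /** Sends the remainder and blocks until every outstanding response has arrived. */
    @Override
    public void close() {
        sendBatch();
        for (CompletableFuture<Integer> response : inFlight) {
            acknowledged += response.join();
        }
        inFlight.clear();
    }

    int acknowledged() {
        return acknowledged;
    }

    /** Usage example: index some documents against a simulated async index. */
    static int demo(int documents, int batchSize) {
        BatchingSubmitterSketch created = new BatchingSubmitterSketch(
            batchSize, batch -> CompletableFuture.supplyAsync(batch::size));
        try (BatchingSubmitterSketch submitter = created) {
            for (int i = 0; i < documents; i++) {
                submitter.submit("doc-" + i);
            }
        }
        return created.acknowledged();
    }
}
```

Using try-with-resources ties the completion guarantee to the end of the block: the caller only blocks once, in `close`, instead of after every full batch.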

We made some performance measurements on a smaller scale. Here is the average result of a bunch of runs with different CSW harvesters:

Operation	Average time before change	Average time after change	Improvement in %
Harvesting ~3600 entries	11 minutes and 17 seconds	06 minutes and 30 seconds	~74%
Harvesting ~1250 entries	06 minutes and 33 seconds	04 minutes and 28 seconds	~47%
Harvesting ~700 entries	03 minutes and 14 seconds	01 minutes and 30 seconds	~131%
Harvesting ~100 entries	00 minutes and 29 seconds	00 minutes and 29 seconds	~35%
Reindexing ~5700 entries	04 minutes and 39 seconds	02 minutes and 40 seconds	~74%
Deleting ~3600 entries	01 minutes and 35 seconds	00 minutes and 20 seconds	~475%
Deleting ~1250 entries	00 minutes and 38 seconds	00 minutes and 09 seconds	~433%
Deleting ~700 entries	00 minutes and 20 seconds	00 minutes and 04 seconds	~500%
Deleting ~100 entries	00 minutes and 05 seconds	00 minutes and 01 seconds	~500%

As you can see, there are very significant performance gains. These numbers were recorded on a local machine; if you use a remote index on a different machine, the effect may be even greater due to latency/throughput limitations.

Checklist

Funded by LGL BW

fxprunayre · 2024-11-14T12:11:09Z

Interesting work @tobias-hotz. We are also investigating how to improve indexing performance for GeoNetwork 5. See draft work geonetwork/geonetwork#19 and
https://github.com/geonetwork/geonetwork/blob/main/src/modules/indexing/src/main/java/org/geonetwork/indexing/IndexingService.java#L108
Maybe some ideas can be shared; tracking GN4 hot spots to avoid making similar mistakes in GN5 would also be nice.

tobias-hotz · 2024-11-14T12:49:24Z

Hi @fxprunayre
thanks for taking a look.
The idea for this is to improve performance of indexing while breaking none of the existing behaviour. Some code paths assume that an element is available in the index right after the call to index, so this has to be taken into account.
Also, the index preparation is still done in the main thread, as it is not unlikely that some code relies on this, and given how big the GN4 Codebase is, I'd rather go with this.

My first approach was to just return the Future of the index response to the caller, but that was getting pretty messy and it was easy to miss a call site. That's why I chose this approach.
For GN5, it could also be beneficial to move the index preparation to another thread.

This change allows multithreaded reindexing to work again (it is somewhat broken at the moment, mainly because of concurrency issues with the single document buffer for the bulk requests). This reduces the time spent on reindexing by a lot. So support for multithreaded indexing is something GN5 should also provide out of the box.

CLAassistant · 2024-12-08T03:42:21Z

All committers have signed the CLA.

fxprunayre · 2025-06-18T06:26:22Z

FYI, deployed on test env on some projects here. So far, all is working fine.

# Conflicts: # core/src/main/java/org/fao/geonet/kernel/datamanager/IMetadataIndexer.java # core/src/main/java/org/fao/geonet/kernel/datamanager/base/BaseMetadataManager.java # core/src/main/java/org/fao/geonet/kernel/search/index/BatchOpsMetadataReindexer.java # core/src/test/java/org/fao/geonet/AbstractCoreIntegrationTest.java

tobias-hotz · 2025-09-03T09:45:23Z

I've rebased this PR onto the latest main branch once more.
Can someone please do a full review so this has a real chance to get into the next release? We've already got positive test results from our company and from @fxprunayre, as well as promising performance numbers.

fxprunayre · 2025-09-10T06:35:46Z

Thanks for the additional rebase @tobias-hotz. Indeed, the changes seem to work well; no issues reported on indexing on my side. Discussing with @josegar74 yesterday, we propose to release 4.4.9 this month (or early October) and merge this PR just after, so that everyone can test it for the 4.4.10 release.

tobias-hotz · 2025-11-14T13:01:41Z

Any news now that 4.4.10 is out? I've rebased this twice now since the release of that version

tobias-hotz · 2026-01-05T11:15:32Z

Hi @josegar74 and @fxprunayre,
can you give me a status update on this PR? I really don't want to be pushy, but it would really suck if we miss the next release cycle for this PR again.
If there is anything we can do to help get this merged, feel free to ping me.

fxprunayre · 2026-01-15T16:13:19Z

Hi @tobias-hotz, as reported before, no issues have been reported on this so far on my side, so it sounds good to go, but it would be good to have others' opinions ...

tobias-hotz · 2026-03-26T12:17:37Z

@josegar74 Do you (or another reviewer) have any plans to tackle this? If yes, then I will continue rebasing this, but if the consensus is that this change is too risky or big to review or something else, then I don't need to keep updating the patch

fxprunayre · 2026-03-27T10:29:46Z

Sorry @tobias-hotz for being so slow reviewing PRs (definitely an area to improve).

Discussing with @josegar74 this morning about this work, we propose to release 4.4.10 first (the date is still uncertain, but it should happen in April). It will contain an Elasticsearch version update (cf. #9176). Please wait for the release before fixing conflicts one more time, so that we can merge it early for 4.4.11.

Is that fine for everyone?

Handle async responses asynchronously.

d3818ea

This improves indexing time, especially when the ES connection has a high latency or the ES load is high.

tobias-hotz marked this pull request as draft October 24, 2024 13:51

tobias-hotz added 8 commits October 30, 2024 14:37

Delegate the responsibility for batching to the caller

2c08974

Dynamically compute the batch size based on the number of expected in…

418de0a

…dex entries

Use a cleaner to make sure all documents get send to the index, even …

9519a92

…in case on of not properly closed submittors

Don't use list version of indexMetadata for single entries

a9a919e

Remove no longer required method forceIndexChanges

0d49d86

Remove commit index changes from frontend

ff5c5dc

Change how a running index job is determined

34e1607

Fixed using the wrong map in BatchingIndexSubmittor

83ea448

tobias-hotz force-pushed the async_indexing branch from 702dc20 to 83ea448 Compare November 7, 2024 14:47

tobias-hotz added 2 commits November 7, 2024 16:28

Fix field updating not refreshing

66e9c38

This causes issues with some tests. This issue was already present before the changes

Fix UserSelectionsApiTest being too strict about the submittor

948575b

tobias-hotz force-pushed the async_indexing branch from ddb7766 to 948575b Compare November 7, 2024 16:07

tobias-hotz added 3 commits November 12, 2024 14:12

Add support for batch deletion as well

95e4a57

Fix batch deletion

827539e

Allow delete by query to be "batched" by running them async

61fb7ad

tobias-hotz force-pushed the async_indexing branch from 767194b to 61fb7ad Compare November 12, 2024 13:13

fxprunayre mentioned this pull request Dec 5, 2024

Performance hot spots geonetwork/geonetwork#76

Open

tobias-hotz added 4 commits December 6, 2024 16:29

submittor -> submitter

9cd5242

Remove debug sleep

2b6476c

Remove unused never implemented that still references forceRefreshRea…

b8952a0

…ders

Improve log message when deleting

d6b9655

tobias-hotz changed the title ~~Elasticsearch Indexing: Handle index responses asynchronously~~ Elasticsearch: Async handling of indexing/deletion requests Dec 10, 2024

tobias-hotz marked this pull request as ready for review December 10, 2024 14:36

Merge remote-tracking branch 'refs/remotes/upstream/main' into async_…

c65cc71

…indexing # Conflicts: # services/src/main/java/org/fao/geonet/api/processing/DatabaseProcessUtils.java

tobias-hotz mentioned this pull request Dec 10, 2024

Error when trying to view/sign CLA via CLAassistant #8550

Closed

tobias-hotz and others added 8 commits June 3, 2025 11:17

Update core/src/main/java/org/fao/geonet/kernel/search/submission/Dir…

90fe4c6

…ectDeletionSubmitter.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/Dir…

5666f00

…ectIndexSubmitter.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/IDe…

dee4142

…letionSubmitter.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/IIn…

466376a

…dexSubmitter.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/bat…

81a0353

…ch/BatchingDeletionSubmitter.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/bat…

0bd839e

…ch/BatchingIndexSubmitter.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/bat…

1ff7298

…ch/BatchingSubmitterBase.java Co-authored-by: Jose García <josegar74@gmail.com>

Update core/src/main/java/org/fao/geonet/kernel/search/submission/bat…

f0d7469

…ch/StateBase.java Co-authored-by: Jose García <josegar74@gmail.com>

tobias-hotz force-pushed the async_indexing branch from ab3f611 to e7e7466 Compare September 3, 2025 09:13

Fixes after merge

861a285

tobias-hotz force-pushed the async_indexing branch from e7e7466 to 861a285 Compare September 3, 2025 09:14

jahow modified the milestones: 4.4.9, 4.4.10 Oct 7, 2025

tobias-hotz added 4 commits October 7, 2025 11:06

Merge remote-tracking branch 'upstream/main' into async_indexing

305fa2a

# Conflicts: # core/src/test/java/org/fao/geonet/AbstractCoreIntegrationTest.java # harvesters/src/main/java/org/fao/geonet/kernel/harvest/harvester/geonet/BaseGeoNetworkAligner.java

Fixes after merge

7c0b26b

Merge remote-tracking branch 'upstream/main' into async_indexing

b5713cd

# Conflicts: # harvesters/src/main/java/org/fao/geonet/kernel/harvest/harvester/AbstractHarvester.java # plugins/datahub-integration/geonetwork-ui

Fixes after merge

24c7f17

Merge remote-tracking branch 'upstream/main' into async_indexing

140c443

rime1014 mentioned this pull request Jan 16, 2026

OGC CSW 2.0.2 Harvesting / Indexing Performance / Bulk-Request #7981

Open

fxprunayre modified the milestones: 4.4.10, 4.4.11 Mar 27, 2026

Elasticsearch: Async handling of indexing/deletion requests #8465
tobias-hotz wants to merge 39 commits into geonetwork:main from tobias-hotz:async_indexing
