Refactor IndexRouting.ExtractFromSource to be an abstract class #135206

felixbarny · 2025-09-22T16:10:25Z

With implementations IndexRouting.ExtractFromSource.ForRoutingPath and IndexRouting.ExtractFromSource.ForIndexDimensions.

This addresses review comments from #132566 (comment).
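For readers unfamiliar with the shape of this refactoring, here is a minimal sketch of an abstract base class with two strategy subclasses. It is not the PR's code: `ExtractFromSourceSketch`, the flat `Map` document model, and the `encode` helper are hypothetical simplifications, and the real classes operate on `IndexRequest` and XContent.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for IndexRouting.ExtractFromSource.
// A document is modelled as a flat Map instead of an IndexRequest.
abstract class ExtractFromSourceSketch {
    private final int numShards;

    protected ExtractFromSourceSketch(int numShards) {
        this.numShards = numShards;
    }

    // Strategy-specific: how the source is turned into a routing hash.
    protected abstract int hashSource(Map<String, String> source);

    // Shared by both strategies: map the hash onto a shard number.
    final int shardId(Map<String, String> source) {
        return Math.floorMod(hashSource(source), numShards);
    }

    // Illustrative helper: concatenates "field=value;" for the given fields, in list order.
    static byte[] encode(List<String> fields, Map<String, String> source) {
        StringBuilder sb = new StringBuilder();
        for (String field : fields) {
            String value = source.get(field);
            if (value != null) {
                sb.append(field).append('=').append(value).append(';');
            }
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    static final class ForRoutingPath extends ExtractFromSourceSketch {
        private final List<String> routingPaths;

        ForRoutingPath(int numShards, List<String> routingPaths) {
            super(numShards);
            this.routingPaths = routingPaths;
        }

        @Override
        protected int hashSource(Map<String, String> source) {
            return Arrays.hashCode(encode(routingPaths, source));
        }
    }

    static final class ForIndexDimensions extends ExtractFromSourceSketch {
        private final List<String> dimensions;

        ForIndexDimensions(int numShards, List<String> dimensions) {
            super(numShards);
            this.dimensions = dimensions;
        }

        // Only this strategy exposes a tsid built directly from the source.
        byte[] buildTsid(Map<String, String> source) {
            return encode(dimensions, source);
        }

        @Override
        protected int hashSource(Map<String, String> source) {
            return Arrays.hashCode(buildTsid(source));
        }
    }
}
```

With the strategies in separate subclasses, code that needs `buildTsid` has to ask for a `ForIndexDimensions` explicitly, which is what surfaces mismatches such as the one discussed later in this thread.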

elasticsearchmachine · 2025-09-22T16:10:50Z

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

henningandersen

LGTM

server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java

henningandersen · 2025-09-22T19:22:03Z

+            protected int hashSource(IndexRequest indexRequest) {
+                BytesRef tsid = indexRequest.tsid();
+                if (tsid == null) {
+                    tsid = buildTsid(indexRequest.getContentType(), indexRequest.indexSource().bytes());


Can we do this (and the following line) unconditionally in preProcess instead?

The tsid == null condition is needed as the tsid may already be provided. See also #134982. Also, I think it makes sense to execute this in the context of hashSource, as that's what buildTsid is doing. There are several places where we assume that source parsing happens within indexShard and I think it makes sense that both strategies do that in the same method rather than different ones (preProcess vs indexShard).
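The "reuse the tsid if it was already provided, otherwise build it" pattern from the diff can be sketched in isolation. This is an illustration under simplifying assumptions: `TsidReuseSketch` is a hypothetical class, the tsid is modelled as a plain `byte[]` rather than a `BytesRef`, and the source is a `String` rather than an `IndexRequest`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; the real code works on IndexRequest and BytesRef.
final class TsidReuseSketch {
    // Counts how often the tsid had to be computed from the source.
    int buildCount = 0;

    // Stand-in for buildTsid: here the "tsid" is just the UTF-8 bytes of the source.
    byte[] buildTsid(String source) {
        buildCount++;
        return source.getBytes(StandardCharsets.UTF_8);
    }

    // Same shape as hashSource in the diff: reuse a provided tsid, otherwise build it.
    int hashSource(byte[] providedTsid, String source) {
        byte[] tsid = providedTsid;
        if (tsid == null) {
            tsid = buildTsid(source);
        }
        return Arrays.hashCode(tsid);
    }
}
```

The null check is what lets a caller that already holds a tsid skip the source parsing step while still producing the same hash.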

felixbarny · 2025-09-23T08:38:01Z

The failing tests revealed a bug: when applying a translog operation, we don't store the tsid that was created by the coordinating node. Therefore, it tries to cast settings.getIndexRouting() to an IndexRouting.ExtractFromSource.ForRoutingPath. This was masked previously by both strategies being implemented in the same class. The effect was that the _tsid (and thus the _id) is different than the one from the regular index operation, which is very bad.

Two approaches I see to fix this:

1. Store the tsid in the translog. Not quite sure if this works in all cases, for example when CCR is used and the follower index has a different cluster version.
2. Implement a version of RoutingFields that creates the tsid using index.dimensions during document parsing the same way it would during routing. This would kick in if index.dimensions is configured but the IndexRequest doesn't contain a tsid.

I think I'd prefer the first option but there may be dragons I'm not aware of.

One thing that would make the second option very difficult is that in the index.routing_path-based strategy, we hash the values after field parsing. For example, org.elasticsearch.index.mapper.RoutingFields#addIp(String fieldName, InetAddress value), whereas in the index.dimensions-based strategy, we use the raw JSON types where IP addresses are represented as strings. A lossless conversion to the representation within the source isn't always possible.
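To illustrate why the two representations can disagree, the sketch below compares two spellings of the same IPv6 address, once as raw strings and once after parsing. `IpRepresentationSketch` is a hypothetical helper built only on `java.net.InetAddress`, and it assumes that the parsed form is compared by its address bytes; it does not reproduce how either strategy actually hashes values.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Hypothetical helper contrasting the raw JSON string with the parsed address.
final class IpRepresentationSketch {
    // Parses an IP literal into its address bytes.
    // For literal addresses, InetAddress.getByName only validates the format.
    static byte[] parsedBytes(String ipLiteral) {
        try {
            return InetAddress.getByName(ipLiteral).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + ipLiteral, e);
        }
    }

    // Comparison after field parsing: different spellings of one address are equal.
    static boolean sameAfterParsing(String a, String b) {
        return Arrays.equals(parsedBytes(a), parsedBytes(b));
    }

    // Comparison on the raw source value: spelling differences matter.
    static boolean sameAsRawString(String a, String b) {
        return a.equals(b);
    }
}
```

For example, `2001:db8::1` and `2001:0db8:0000:0000:0000:0000:0000:0001` are equal after parsing but differ as raw strings, and the original spelling cannot be recovered from the parsed address.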

@henningandersen WDYT?

Stack trace:

java.lang.ClassCastException: class org.elasticsearch.cluster.routing.IndexRouting$ExtractFromSource$ForIndexDimensions cannot be cast to class org.elasticsearch.cluster.routing.IndexRouting$ExtractFromSource$ForRoutingPath (org.elasticsearch.cluster.routing.IndexRouting$ExtractFromSource$ForIndexDimensions and org.elasticsearch.cluster.routing.IndexRouting$ExtractFromSource$ForRoutingPath are in unnamed module of loader 'app')	
	at org.elasticsearch.index.IndexMode$2.buildRoutingFields(IndexMode.java:227) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.mapper.RoutingFields.fromIndexSettings(RoutingFields.java:26) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.mapper.DocumentParserContext.<init>(DocumentParserContext.java:274) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.mapper.DocumentParser$RootDocumentParserContext.<init>(DocumentParser.java:1083) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:99) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:128) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:1092) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1019) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2109) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2096) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:655) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:600) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:592) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:688) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:675) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:86) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:319) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:331) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1067) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]	
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090) ~[?:?]	
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614) ~[?:?]	
	at java.lang.Thread.run(Thread.java:1474) ~[?:?]
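The ClassCastException in the stack trace above is the typical result of an unconditional cast that was only safe while a single class implemented both strategies. The following sketch contrasts a blind cast with a checked dispatch. All type names here are hypothetical stand-ins, and this is not the code of IndexMode.buildRoutingFields.

```java
// Hypothetical stand-ins for the two routing strategies.
interface RoutingStrategy {}

final class RoutingPathStrategy implements RoutingStrategy {}

final class IndexDimensionsStrategy implements RoutingStrategy {}

final class BuildRoutingFieldsSketch {
    // Blind cast: throws ClassCastException when handed the dimensions-based strategy.
    static String blindCast(RoutingStrategy strategy) {
        RoutingPathStrategy forRoutingPath = (RoutingPathStrategy) strategy;
        return "routing fields via " + forRoutingPath.getClass().getSimpleName();
    }

    // Checked dispatch: each strategy is handled explicitly, unknown ones fail loudly.
    static String checkedDispatch(RoutingStrategy strategy) {
        if (strategy instanceof RoutingPathStrategy) {
            return "routing_path";
        }
        if (strategy instanceof IndexDimensionsStrategy) {
            return "index_dimensions";
        }
        throw new IllegalStateException("unexpected routing strategy: " + strategy.getClass());
    }
}
```

Splitting the strategies into separate types turns a silent behavioural difference into an immediate failure, which is how the failing tests exposed the problem.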

felixbarny · 2025-09-23T09:49:22Z

In 9551184, I've tried another solution where the tsid is created from source during translog replay. The advantage is that we don't need to touch the translog itself, which would also require a cascade of changes to add the tsid to Engine.Index/ParsedDocument. The downside is that translog operations have to re-calculate the tsid. Maybe that's ok as we also do that for the index.routing_path-based strategy. But it does mean that translog replay doesn't fully benefit from the index.dimensions-based strategy.

elasticsearchmachine · 2025-09-23T11:42:09Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

felixbarny · 2025-09-23T12:55:36Z

In 9c680f0, I've reverted changes specific to the translog and create the tsid in DocumentParser.RootDocumentParserContext#RootDocumentParserContext if missing. This seems like the most generic and robust solution.

henningandersen

Just a stray comment.

henningandersen · 2025-09-24T08:03:36Z

server/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java

+                // the tsid is normally set on the coordinating node during shard routing and passed to the data node via the index request
+                // but when applying a translog operation, shard routing is not happening, and we have to create the tsid from source
+                tsid = forIndexDimensions.buildTsid(source.getXContentType(), source.source());
+            }


Ideally we would have

else { assert tsid.equals(forIndexDimensions.buildTsid(source.getXContentType(), source.source())); } (for when dimensions are in use), but I guess this is not always upheld.

When it comes to replaying translog operations (which include the id but not the tsid), the effect of this check is similar because the id is based on the tsid, and it also checks in the ForRoutingPath case:

elasticsearch/server/src/main/java/org/elasticsearch/index/mapper/TsidExtractingIdFieldMapper.java

Lines 86 to 97 in 40f3a1c

    if (context.sourceToParse().id() != null && false == context.sourceToParse().id().equals(id)) {
        throw new IllegalArgumentException(
            String.format(
                Locale.ROOT,
                "_id must be unset or set to [%s] but was [%s] because [%s] is in time_series mode",
                id,
                context.sourceToParse().id(),
                context.indexSettings().getIndexMetadata().getIndex().getName()
            )
        );
    }
    context.id(id);
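The reasoning that an _id check indirectly verifies the tsid can be illustrated with a small sketch. The id derivation below is hypothetical (a Base64 encoding of the tsid bytes plus the timestamp) and only stands in for whatever TsidExtractingIdFieldMapper really does; `checkId` mirrors the shape of the quoted check.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.Locale;

// Hypothetical sketch of an id that is derived from the tsid.
final class TsidIdCheckSketch {
    // Assumed derivation for illustration only: Base64 of tsid bytes followed by the timestamp.
    static String deriveId(byte[] tsid, long timestampMillis) {
        byte[] ts = Long.toString(timestampMillis).getBytes(StandardCharsets.UTF_8);
        byte[] combined = Arrays.copyOf(tsid, tsid.length + ts.length);
        System.arraycopy(ts, 0, combined, tsid.length, ts.length);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(combined);
    }

    // Same shape as the quoted check: a provided id must match the derived one.
    static String checkId(String providedId, String derivedId, String indexName) {
        if (providedId != null && false == providedId.equals(derivedId)) {
            throw new IllegalArgumentException(
                String.format(
                    Locale.ROOT,
                    "_id must be unset or set to [%s] but was [%s] because [%s] is in time_series mode",
                    derivedId,
                    providedId,
                    indexName
                )
            );
        }
        return derivedId;
    }
}
```

Because the id is a function of the tsid, a translog entry whose stored id no longer matches the id derived from a recomputed tsid is rejected, so a tsid mismatch during replay cannot go unnoticed.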

felixbarny · 2025-09-24T10:37:41Z

Henning and I have discussed a potential issue where the _id stored in the translog may differ when replaying the operation if a new dimension field gets added to the mappings in the meantime. Note that this isn't a new issue related to this change or to index.dimensions.

Thinking about it some more, maybe this can’t actually happen. IINM, we write the translog entry after the index operation, and after dynamic mapping updates have been processed. If that’s true, we can be sure that any fields present in the source of a translog entry are in the mappings already. Since you can’t make an existing field a dimension after the fact and because we route time series documents to backing indices based on the @timestamp, I think there can’t be differences in the dimension field mappings and thus the tsid when replaying the translog.

If there were any differences, we'd fail ingestion due to this check:

elasticsearch/server/src/main/java/org/elasticsearch/index/mapper/TsidExtractingIdFieldMapper.java

Lines 86 to 97 in 40f3a1c

    if (context.sourceToParse().id() != null && false == context.sourceToParse().id().equals(id)) {
        throw new IllegalArgumentException(
            String.format(
                Locale.ROOT,
                "_id must be unset or set to [%s] but was [%s] because [%s] is in time_series mode",
                id,
                context.sourceToParse().id(),
                context.indexSettings().getIndexMetadata().getIndex().getName()
            )
        );
    }
    context.id(id);

Unrelated to this PR or the new index.dimensions-based tsid creation strategy, we may want to think about including the tsid in the translog, similar to how we include the id. This would increase our confidence that there are no differences in how the tsid is created and would also improve the performance of recovery and CCR, as we wouldn't need to re-calculate the tsid.

Refactor IndexRouting.ExtractFromSource to be an abstract class

205466f

With implementations IndexRouting.ExtractFromSource.ForRoutingPath and IndexRouting.ExtractFromSource.ForIndexDimensions. This addresses review comments from elastic#132566.

felixbarny requested a review from henningandersen September 22, 2025 16:10

felixbarny added the >non-issue and :Distributed Indexing/CRUD labels Sep 22, 2025

elasticsearchmachine added the v9.2.0, external-contributor, and Team:Distributed Indexing labels Sep 22, 2025

felixbarny mentioned this pull request Sep 22, 2025

TSDB ingest performance: combine routing and tsdb hashing #132566

Merged

3 tasks

Merge branch 'main' into refactor-extract-from-source

aa15c57

henningandersen approved these changes Sep 22, 2025

Merge remote-tracking branch 'origin/main' into refactor-extract-from…

d22c530

…-source

felixbarny added 3 commits September 23, 2025 10:50

Streamline constructors

45b206a

Improve IndexMode.buildRoutingFields

cdfe72a

Create tsid during translog replay

9551184

felixbarny requested a review from henningandersen September 23, 2025 09:50

kkrik-es added the :StorageEngine/TSDB and Team:StorageEngine labels Sep 23, 2025

Create tsid in RootDocumentParserContext if missing

9c680f0

felixbarny added 2 commits September 23, 2025 17:26

Add test cases where tsid is created in DocumentParser

591709d

Merge remote-tracking branch 'origin/main' into refactor-extract-from…

18fadf3

…-source

felixbarny requested a review from kkrik-es September 23, 2025 15:27

felixbarny self-assigned this Sep 23, 2025

kkrik-es reviewed Sep 23, 2025

test/framework/src/main/java/org/elasticsearch/test/InternalTestCluster.java (outdated, resolved)

kkrik-es reviewed Sep 23, 2025

server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java (outdated, resolved)

kkrik-es reviewed Sep 23, 2025

server/src/main/java/org/elasticsearch/index/IndexMode.java (outdated, resolved)

felixbarny added 2 commits September 24, 2025 08:18

Address review comments

8b28540

Merge remote-tracking branch 'origin/main' into refactor-extract-from…

cf8ab97

…-source

elasticsearchmachine added the serverless-linked label Sep 24, 2025

henningandersen reviewed Sep 24, 2025

kkrik-es reviewed Sep 24, 2025

server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java (outdated, resolved)

kkrik-es reviewed Sep 24, 2025

server/src/main/java/org/elasticsearch/index/mapper/TsidExtractingIdFieldMapper.java (outdated, resolved)

kkrik-es approved these changes Sep 24, 2025

felixbarny and others added 3 commits September 24, 2025 15:57

Address review comments

58f8621

Merge remote-tracking branch 'origin/main' into refactor-extract-from…

2b5d93b

…-source

Merge branch 'main' into refactor-extract-from-source

2117dbe

felixbarny merged commit 41488ee into elastic:main Sep 25, 2025
34 checks passed

felixbarny deleted the refactor-extract-from-source branch September 25, 2025 06:01
