Hi,
I found some odd behavior with bulk requests. When the index has, for example, 3 shards, all documents go to the same shard.
When the index has 6 shards, all documents go to 2 shards.
When I set a custom routing value on the bulk request, documents are spread across all shards. I think this is a bug in how bulk requests are routed, but I don't know what causes it.
I tried to reproduce it from the Kibana console, without the client, but could not get the same behavior, so I think it is a client issue.
The code is:
List<BulkOperation> bulkOperations = new ArrayList<>();
for (MultimediaDocument document : documents) {
    // One index operation per document; no routing is set on the operation.
    BulkOperation operation = BulkOperation.of(builder -> builder
        .index(builder1 -> builder1
            .index("multimedia-phash")
            .id(document.getDocumentId())
            .document(document)));
    bulkOperations.add(operation);
}
try {
    BulkResponse response = client.bulk(builder -> builder
        .operations(bulkOperations)
        .timeout(Time.of(builderTime -> builderTime.time("5m")))
    );
} catch (IOException e) {
    e.printStackTrace();
}
And the Elasticsearch _cat/shards/multimedia-phash output:
index            shard prirep state   docs   store  node
multimedia-phash 5     p      STARTED 0      225b   node3
multimedia-phash 3     p      STARTED 0      225b   node1
multimedia-phash 1     p      STARTED 180037 85.5mb node2
multimedia-phash 4     p      STARTED 0      225b   node2
multimedia-phash 2     p      STARTED 0      225b   node3
multimedia-phash 0     p      STARTED 178963 85mb   node1
The code with the workaround is:
Random r = new Random();
List<BulkOperation> bulkOperations = new ArrayList<>();
for (MultimediaDocument document : documents) {
    BulkOperation operation = BulkOperation.of(builder -> builder
        .index(builder1 -> builder1
            .index("multimedia-phash")
            .id(document.getDocumentId())
            .document(document)));
    bulkOperations.add(operation);
}
try {
    BulkResponse response = client.bulk(builder -> builder
        .operations(bulkOperations)
        // Workaround: a random routing value set on the bulk request itself.
        .routing(String.valueOf(r.nextInt(1000)))
        .timeout(Time.of(builderTime -> builderTime.time("5m")))
    );
} catch (IOException e) {
    e.printStackTrace();
}
And the Elasticsearch _cat/shards/multimedia-phash output:
index            shard prirep state   docs  store  node
multimedia-phash 5     p      STARTED 23000 11mb   node3
multimedia-phash 3     p      STARTED 33000 12.6mb node1
multimedia-phash 1     p      STARTED 28000 10.6mb node2
multimedia-phash 4     p      STARTED 30000 11.7mb node2
multimedia-phash 2     p      STARTED 28000 10.6mb node3
multimedia-phash 0     p      STARTED 22000 8.4mb  node3
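A note on what the workaround appears to do: a routing value set on the bulk request acts as the default for every operation in that request, so each bulk call lands entirely on one shard and the spread comes from the random value changing between calls. That would be consistent with the round per-shard counts above. The sketch below only simulates this with plain Java. It follows the shard formula from the Elasticsearch routing documentation, but `String.hashCode()` stands in for the real Murmur3 hash, and the batch size of 1000, the 164 batches, and the 768 routing shards are assumptions for illustration, not values taken from the cluster.

```java
import java.util.Random;

class BulkRoutingSketch {

    // Shard selection as described in the Elasticsearch routing docs:
    //   routing_factor = num_routing_shards / num_primary_shards
    //   shard_num = (hash(_routing) % num_routing_shards) / routing_factor
    // String.hashCode() is a stand-in; Elasticsearch uses a Murmur3 hash.
    static int shardFor(String routing, int numRoutingShards, int numPrimaryShards) {
        int routingFactor = numRoutingShards / numPrimaryShards;
        return Math.floorMod(routing.hashCode(), numRoutingShards) / routingFactor;
    }

    // Simulates bulk requests of batchSize documents each, with one random
    // routing value per request (as in the workaround above).
    static int[] simulate(int batches, int batchSize,
                          int numRoutingShards, int numPrimaryShards, long seed) {
        Random r = new Random(seed);
        int[] docsPerShard = new int[numPrimaryShards];
        for (int b = 0; b < batches; b++) {
            String routing = String.valueOf(r.nextInt(1000));
            // Every document in this request shares the same routing value,
            // so the whole batch is counted against a single shard.
            int shard = shardFor(routing, numRoutingShards, numPrimaryShards);
            docsPerShard[shard] += batchSize;
        }
        return docsPerShard;
    }

    public static void main(String[] args) {
        // Assumed values: 164 batches of 1000 docs, 768 routing shards, 6 primaries.
        int[] counts = simulate(164, 1000, 768, 6, 42L);
        for (int shard = 0; shard < counts.length; shard++) {
            System.out.println("shard " + shard + ": " + counts[shard] + " docs");
        }
    }
}
```

In this simulation every per-shard count is a multiple of the batch size. If that is what happens on the real cluster, documents indexed this way would also need the same routing value to be fetched by ID, so it is only a stopgap.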
Versions:
Elasticsearch: 8.0.0
co.elastic.clients.elasticsearch-java: 8.0.0
If you need any more info, please ask me.
Thanks in advance,
Adrian.