Description
Is your feature request related to a problem? Please describe
We observed high latencies on some of our search requests that use terms queries with large term lists (thousands of terms) when queries hit many shards. Hot threads showed significant CPU time spent in the transport serialization path:
Node: data-eu-south-2b-1-19 (HsYQ-G0FRmam5NrgnKQoqw)
Thread: opensearch[data-eu-south-2b-1-19][transport_worker][T#15]
CPU: 100.3% (1s out of 1s)
Stack trace:
java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3383)
java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3483)
org.opensearch.core.common.io.stream.Writeable$WriteableRegistry.getCustomClassFromInstance(Writeable.java:110)
org.opensearch.core.common.io.stream.StreamOutput.getGenericType(StreamOutput.java:791)
org.opensearch.core.common.io.stream.StreamOutput.writeGenericValue(StreamOutput.java:837)
org.opensearch.core.common.io.stream.StreamOutput.lambda$static$10(StreamOutput.java:703)
org.opensearch.core.common.io.stream.StreamOutput.writeGenericValue(StreamOutput.java:839)
org.opensearch.index.query.TermsQueryBuilder.doWriteTo(TermsQueryBuilder.java:215)
...
The cost multiplies with the number of shards since the query is serialized once per shard.
Describe the solution you'd like
TermsQueryBuilder already has an optimization in convert() that compacts homogeneous term lists into efficient representations:
- All numbers → backed by long[]
- All strings → backed by a single BytesReference + int[] offsets
TermsQueryBuilder.java#L332-L387
However, this optimization isn't utilized during serialization. The doWriteTo() method calls out.writeGenericValue(values), which triggers the generic List writer. For each element, writeGenericValue calls getGenericType() → WriteableRegistry.getCustomClassFromInstance(), which iterates over all registered custom classes in a ConcurrentHashMap.
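To make the cost model concrete, here is a rough sketch of how the per-element scan adds up. It is not the OpenSearch code: `RegistryScanSketch`, `scan`, and `totalExamined` are hypothetical names, and the registry shape (an integer id mapped to a class) is an assumption based only on the `ValueIterator` frames in the stack trace above.

```java
import java.util.List;
import java.util.Map;

// Illustrative model only; names and registry layout are assumptions, not OpenSearch APIs.
final class RegistryScanSketch {

    /** Linear scan over the registered classes; returns how many entries were examined. */
    static int scan(Map<Integer, Class<?>> registry, Object value) {
        int examined = 0;
        for (Class<?> candidate : registry.values()) {
            examined++;
            if (candidate.isInstance(value)) {
                break; // a registered custom class matched, stop scanning
            }
        }
        return examined;
    }

    /** The query is serialized once per shard, so every term is scanned once per shard. */
    static long totalExamined(Map<Integer, Class<?>> registry, List<?> terms, int shards) {
        long total = 0;
        for (int shard = 0; shard < shards; shard++) {
            for (Object term : terms) {
                total += scan(registry, term);
            }
        }
        return total;
    }
}
```

In this model, plain string or numeric terms match no registered custom class, so every element pays for a full pass over the registry, and the total grows as terms × registered classes × shards.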
We can optimize this by adding specialized serialization in TermsQueryBuilder that detects the compact list representations and uses bulk serialization:
- FORMAT_LONG (1): marker byte + long[] array (bulk write)
- FORMAT_STRING (2): marker byte + BytesReference + int[] offsets (bulk write)
- FORMAT_GENERIC (0): marker byte + writeGenericValue (existing fallback)
This requires detecting the compact AbstractList implementations created by convert() and serializing their backing data directly rather than element-by-element.
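A minimal sketch of the marker-byte scheme, to show the shape of the idea rather than the actual patch. Everything here is an assumption for illustration: plain `java.io` streams stand in for StreamOutput/StreamInput, `LongTerms` and `StringTerms` are hypothetical stand-ins for the compact lists built by convert(), the offsets are modeled as one end offset per term, and the generic branch is a simplified placeholder for writeGenericValue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not OpenSearch classes or its wire format.
final class TermsWireSketch {
    static final byte FORMAT_GENERIC = 0;
    static final byte FORMAT_LONG = 1;
    static final byte FORMAT_STRING = 2;

    /** All-numeric terms backed by a single long[]. */
    static final class LongTerms extends AbstractList<Object> {
        final long[] values;
        LongTerms(long[] values) { this.values = values; }
        @Override public Object get(int i) { return values[i]; }
        @Override public int size() { return values.length; }
    }

    /** All-string terms stored as concatenated UTF-8 bytes plus one end offset per term. */
    static final class StringTerms extends AbstractList<Object> {
        final byte[] bytes;
        final int[] offsets;
        StringTerms(byte[] bytes, int[] offsets) { this.bytes = bytes; this.offsets = offsets; }
        static StringTerms of(String... terms) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            int[] offsets = new int[terms.length];
            for (int i = 0; i < terms.length; i++) {
                byte[] utf8 = terms[i].getBytes(StandardCharsets.UTF_8);
                buf.write(utf8, 0, utf8.length);
                offsets[i] = buf.size();
            }
            return new StringTerms(buf.toByteArray(), offsets);
        }
        @Override public Object get(int i) {
            int start = i == 0 ? 0 : offsets[i - 1];
            return new String(bytes, start, offsets[i] - start, StandardCharsets.UTF_8);
        }
        @Override public int size() { return offsets.length; }
    }

    /** Writes a marker byte, then the backing data in bulk when the list is compact. */
    static byte[] write(List<?> terms) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            if (terms instanceof LongTerms) {
                long[] values = ((LongTerms) terms).values;
                out.writeByte(FORMAT_LONG);
                out.writeInt(values.length);
                for (long v : values) out.writeLong(v);
            } else if (terms instanceof StringTerms) {
                StringTerms s = (StringTerms) terms;
                out.writeByte(FORMAT_STRING);
                out.writeInt(s.bytes.length);
                out.write(s.bytes);
                out.writeInt(s.offsets.length);
                for (int o : s.offsets) out.writeInt(o);
            } else {
                // Placeholder for the generic path: a type tag per element.
                out.writeByte(FORMAT_GENERIC);
                out.writeInt(terms.size());
                for (Object t : terms) {
                    if (t instanceof Long) { out.writeByte(0); out.writeLong((Long) t); }
                    else { out.writeByte(1); out.writeUTF(String.valueOf(t)); }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Reads the marker byte and rebuilds the matching list representation. */
    static List<Object> read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte format = in.readByte();
            if (format == FORMAT_LONG) {
                long[] values = new long[in.readInt()];
                for (int i = 0; i < values.length; i++) values[i] = in.readLong();
                return new LongTerms(values);
            } else if (format == FORMAT_STRING) {
                byte[] bytes = new byte[in.readInt()];
                in.readFully(bytes);
                int[] offsets = new int[in.readInt()];
                for (int i = 0; i < offsets.length; i++) offsets[i] = in.readInt();
                return new StringTerms(bytes, offsets);
            } else if (format == FORMAT_GENERIC) {
                int n = in.readInt();
                List<Object> terms = new ArrayList<>(n);
                for (int i = 0; i < n; i++) {
                    if (in.readByte() == 0) terms.add(in.readLong()); else terms.add(in.readUTF());
                }
                return terms;
            }
            throw new IOException("unknown terms format: " + format);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The compact branches decide the format once per list, so no per-element type lookup is needed. A real change would also have to stay wire-compatible with nodes that only understand the existing format, which this sketch does not address.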
We tried this internally on our clusters (for some very specific call patterns with large term lists and many shards in scope):
| Metric | Before | After | Improvement |
|---|---|---|---|
| P50 | 3378ms | 2124ms | 37% |
| P90 | 5486ms | 3112ms | 43% |
| P99 | 6594ms | 3473ms | 47% |
Looking for input on this approach. I can open a pull request if it sounds good.
Related component
Search:Performance
Describe alternatives you've considered
No response
Additional context
No response