[BUG] Include/Exclude on terms aggregation can cause IndexOutOfBoundsException #19636

@harshavamsi

Describe the bug

When a terms aggregation uses include/exclude, we do an ordinal lookup on each prefix. If there is no exact match, the lookup returns a negative value and we recover the insertion point with startOrd = -1 - startOrd. When the prefix sorts after every term in the field, that insertion point can equal the number of ordinals, so seeking to it goes out of bounds.

We should check startOrd against the length before we seek:

                if (startOrd >= length) {
                    continue;
                }
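
To illustrate why the check is needed, here is a minimal, self-contained sketch that does not use the OpenSearch or Lucene classes. It assumes the lookup follows the `Arrays.binarySearch` convention of returning `-(insertionPoint) - 1` for a missing term, and it uses a sorted `String[]` as a stand-in for the terms dictionary (Java string order matches byte order only for ASCII terms like these). The class and method names are hypothetical.

```java
import java.util.Arrays;

public class PrefixOrdinalBounds {

    // Hypothetical stand-in for the ordinal lookup: returns the index of the
    // term, or -(insertionPoint) - 1 when the term is absent.
    public static long lookupTerm(String[] sortedTerms, String term) {
        return Arrays.binarySearch(sortedTerms, term);
    }

    // Returns the first ordinal that could match the prefix, or -1 when the
    // prefix sorts after every term and there is nothing to seek to.
    public static long firstCandidateOrd(String[] sortedTerms, String prefix) {
        long length = sortedTerms.length;
        long startOrd = lookupTerm(sortedTerms, prefix);
        if (startOrd < 0) {
            // No exact match: convert to the insertion point, which may equal length.
            startOrd = -1 - startOrd;
        }
        if (startOrd >= length) {
            // The guard proposed in this issue; the real loop would `continue` here.
            return -1;
        }
        return startOrd;
    }

    public static void main(String[] args) {
        String[] terms = { "aaa", "bbb", "ccc" };
        System.out.println(firstCandidateOrd(terms, "zzz")); // prints -1
        System.out.println(firstCandidateOrd(terms, "bb"));  // prints 1
    }
}
```

Without the guard, the `"zzz"` case would hand ordinal 3 to a three-term dictionary, which is the out-of-bounds seek seen in the stack trace below.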

Sample stack trace

 #[org.opensearch.core.common.io.stream.NotSerializableExceptionWrapper]#All shards failed for phase: [query]
NotSerializableExceptionWrapper[index_out_of_bounds_exception: null]
    at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$TermsDict.seekExact(Lucene90DocValuesProducer.java:1182)
    at org.opensearch.index.fielddata.ordinals.GlobalOrdinalMapping.lookupOrd(GlobalOrdinalMapping.java:106)
    at org.opensearch.search.aggregations.bucket.terms.IncludeExclude$PrefixBackedOrdinalsFilter.process(IncludeExclude.java:444)
    at org.opensearch.search.aggregations.bucket.terms.IncludeExclude$PrefixBackedOrdinalsFilter.acceptedGlobalOrdinals(IncludeExclude.java:485)
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.<init>(GlobalOrdinalsStringTermsAggregator.java:138)
    at org.opensearch.search.aggregations.bucket.terms.TermsAggregatorFactory$ExecutionMode$2.create(TermsAggregatorFactory.java:502)
    at org.opensearch.search.aggregations.bucket.terms.TermsAggregatorFactory$1.build(TermsAggregatorFactory.java:140)
    at org.opensearch.search.aggregations.bucket.terms.TermsAggregatorFactory.doCreateInternal(TermsAggregatorFactory.java:310)
    at org.opensearch.search.aggregations.support.ValuesSourceAggregatorFactory.createInternal(ValuesSourceAggregatorFactory.java:76)
    at org.opensearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:103)
    at org.opensearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:315)
    at org.opensearch.search.aggregations.AggregatorFactories.createTopLevelNonGlobalAggregators(AggregatorFactories.java:301)
    at org.opensearch.search.aggregations.AggregationCollectorManager.newCollector(AggregationCollectorManager.java:45)
    at org.opensearch.search.aggregations.NonGlobalAggCollectorManager.<init>(NonGlobalAggCollectorManager.java:32)
    at org.opensearch.search.aggregations.ConcurrentAggregationProcessor.preProcess(ConcurrentAggregationProcessor.java:41)
    at org.opensearch.neuralsearch.search.query.HybridAggregationProcessor.preProcess(HybridAggregationProcessor.java:32)
    at org.opensearch.search.query.QueryPhase.execute(QueryPhase.java:154)
    at org.opensearch.indices.IndicesService.lambda$loadIntoContext$24(IndicesService.java:1938)
    at org.opensearch.indices.IndicesService.lambda$cacheShardLevelResult$25(IndicesService.java:1999)
    at org.opensearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:379)
    at org.opensearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:362)
    at org.opensearch.cache.common.tier.TieredSpilloverCache$TieredSpilloverCacheSegment.compute(TieredSpilloverCache.java:385)
    at org.opensearch.cache.common.tier.TieredSpilloverCache$TieredSpilloverCacheSegment.computeIfAbsent(TieredSpilloverCache.java:307)
    at org.opensearch.cache.common.tier.TieredSpilloverCache.computeIfAbsent(TieredSpilloverCache.java:647)
    at org.opensearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:323)
    at org.opensearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:2005)
    at org.opensearch.indices.IndicesService.loadIntoContext(IndicesService.java:1936)
    at org.opensearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:657)
    at org.opensearch.search.SearchService.executeQueryPhase(SearchService.java:723)
    at org.opensearch.search.SearchService$2.lambda$onResponse$0(SearchService.java:692)
    at org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:74)
    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:89)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at org.opensearch.threadpool.TaskAwareRunnable.doRun(TaskAwareRunnable.java:78)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:59)
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.lang.Thread.run(Thread.java:1583)

To Reproduce

Sample test

    public void testPrefixFilterWithNonExistentExcludePrefixBeyondRange() throws IOException {
        // Test with an exclude pattern where the prefix doesn't exist
        IncludeExclude includeExclude = new IncludeExclude(null, "zzz.*");

        OrdinalsFilter ordinalsFilter = includeExclude.convertToOrdinalsFilter(DocValueFormat.RAW);

        BytesRef[] bytesRefs = toBytesRefArray("aaa", "bbb", "ccc");

        SortedSetDocValues sortedSetDocValues = new AbstractSortedSetDocValues() {
            @Override
            public boolean advanceExact(int target) {
                return false;
            }

            @Override
            public long nextOrd() {
                return 0;
            }

            @Override
            public int docValueCount() {
                return 1;
            }

            @Override
            public BytesRef lookupOrd(long ord) {
                if (ord < 0 || ord >= bytesRefs.length) {
                    throw new IndexOutOfBoundsException("ord=" + ord + " is out of bounds [0," + bytesRefs.length + ")");
                }
                int ordIndex = Math.toIntExact(ord);
                return bytesRefs[ordIndex];
            }

            @Override
            public long getValueCount() {
                return bytesRefs.length;
            }
        };

        // This should throw IndexOutOfBoundsException due to the bug
        IndexOutOfBoundsException exception = assertThrows(
            IndexOutOfBoundsException.class,
            () -> ordinalsFilter.acceptedGlobalOrdinals(sortedSetDocValues)
        );
        assertTrue(
            "Exception message should indicate out of bounds access",
            exception.getMessage().contains("out of bounds")
        );
    }

Expected behavior

A prefix that sorts beyond the last ordinal should be skipped, and the filter should return without throwing an IndexOutOfBoundsException.
