ESQL: Make field fusion generic #137382
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| pr: 137382 | ||
| summary: Make field fusion generic | ||
| area: ES|QL | ||
| type: enhancement | ||
| issues: [] | 
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the "Elastic License | ||
| * 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side | ||
| * Public License v 1"; you may not use this file except in compliance with, at | ||
| * your election, the "Elastic License 2.0", the "GNU Affero General Public | ||
| * License v3.0 only", or the "Server Side Public License, v 1". | ||
| */ | ||
| 
     | 
||
| package org.elasticsearch.index.mapper.blockloader; | ||
| 
     | 
||
| import org.elasticsearch.index.mapper.MappedFieldType; | ||
| 
     | 
||
| /** | ||
| * Configuration needed to transform loaded values into blocks. | ||
| * {@link MappedFieldType}s will find me in | ||
| * {@link MappedFieldType.BlockLoaderContext#blockLoaderFunctionConfig()} and | ||
| * use this configuration to choose the appropriate implementation for | ||
| * transforming loaded values into blocks. | ||
| */ | ||
| public interface BlockLoaderFunctionConfig { | ||
| record Named(String name, Warnings warnings) implements BlockLoaderFunctionConfig { | ||
| @Override | ||
| public int hashCode() { | ||
| return name.hashCode(); | ||
| } | ||
| 
     | 
||
| @Override | ||
| public boolean equals(Object o) { | ||
| if (o == null || getClass() != o.getClass()) { | ||
| return false; | ||
| } | ||
| Named named = (Named) o; | ||
| return name.equals(named.name); | ||
| } | ||
| } | ||
| } | 
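
The `Named` record above overrides `equals` and `hashCode` so that only `name` takes part in equality and the `warnings` component is ignored. The following is a minimal, self-contained sketch of that behavior. The `Warnings` interface here is a hypothetical stand-in (the real one lives in `org.elasticsearch.index.mapper.blockloader`), and the class name `NamedConfigSketch` is invented for illustration.

```java
public class NamedConfigSketch {
    /** Hypothetical stand-in for the Warnings interface referenced in the diff. */
    interface Warnings {
        void registerException(Class<? extends Exception> exceptionClass, String message);
    }

    interface BlockLoaderFunctionConfig {
        /** Mirrors the record in the diff: equality is based on the name only. */
        record Named(String name, Warnings warnings) implements BlockLoaderFunctionConfig {
            @Override
            public int hashCode() {
                return name.hashCode();
            }

            @Override
            public boolean equals(Object o) {
                if (o == null || getClass() != o.getClass()) {
                    return false;
                }
                return name.equals(((Named) o).name);
            }
        }
    }

    /** Two configs with the same name but different Warnings instances compare equal. */
    static boolean sameConfigDespiteDifferentWarnings() {
        Warnings first = (cls, msg) -> {};
        Warnings second = (cls, msg) -> {};
        var a = new BlockLoaderFunctionConfig.Named("LENGTH", first);
        var b = new BlockLoaderFunctionConfig.Named("LENGTH", second);
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(sameConfigDespiteDifferentWarnings());
    }
}
```

Without the overrides, the generated record `equals` would also compare the `warnings` component, so two otherwise identical configs built with different warning sinks would not match.
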
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the Elastic License | ||
| * 2.0; you may not use this file except in compliance with the Elastic License | ||
| * 2.0. | ||
| */ | ||
| 
     | 
||
| package org.elasticsearch.xpack.esql.expression.function; | ||
| 
     | 
||
| import org.elasticsearch.compute.operator.DriverContext; | ||
| import org.elasticsearch.compute.operator.Warnings; | ||
| import org.elasticsearch.xpack.esql.core.tree.Source; | ||
| 
     | 
||
| public class BlockLoaderWarnings implements org.elasticsearch.index.mapper.blockloader.Warnings { | ||
| 
         Review comment: I guess these are warnings that can be created when using the BlockLoader function config to load values. Should we add that to the javadoc? Reply: 👍  | 
||
| private final DriverContext.WarningsMode warningsMode; | ||
| private final Source source; | ||
| private Warnings delegate; | ||
| 
     | 
||
| public BlockLoaderWarnings(DriverContext.WarningsMode warningsMode, Source source) { | ||
| this.warningsMode = warningsMode; | ||
| this.source = source; | ||
| } | ||
| 
     | 
||
| @Override | ||
| public void registerException(Class<? extends Exception> exceptionClass, String message) { | ||
| if (delegate == null) { | ||
| delegate = Warnings.createOnlyWarnings( | ||
| warningsMode, | ||
| source.source().getLineNumber(), | ||
| source.source().getColumnNumber(), | ||
| source.text() | ||
| ); | ||
| } | ||
| delegate.registerException(exceptionClass, message); | ||
| } | ||
| } | ||
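
`BlockLoaderWarnings` above builds its delegate only on the first call to `registerException`, so nothing is allocated for loads that never warn. Here is a small sketch of that lazy-delegate pattern in isolation. It assumes a hypothetical `Warnings` interface and replaces the `Warnings.createOnlyWarnings(...)` call with a generic `Supplier`; none of these names are the actual Elasticsearch types.

```java
import java.util.function.Supplier;

public class LazyWarningsSketch {
    /** Hypothetical stand-in for the block loader Warnings interface. */
    interface Warnings {
        void registerException(Class<? extends Exception> exceptionClass, String message);
    }

    /** Creates the real sink only when the first exception is registered. */
    static final class LazyWarnings implements Warnings {
        private final Supplier<Warnings> factory;
        private Warnings delegate;

        LazyWarnings(Supplier<Warnings> factory) {
            this.factory = factory;
        }

        @Override
        public void registerException(Class<? extends Exception> exceptionClass, String message) {
            if (delegate == null) {
                delegate = factory.get();
            }
            delegate.registerException(exceptionClass, message);
        }
    }

    /** Returns {delegates created before any warning, delegates created after two warnings}. */
    static int[] countDelegateCreations() {
        int[] created = { 0 };
        LazyWarnings warnings = new LazyWarnings(() -> {
            created[0]++;
            return (cls, msg) -> {};
        });
        int before = created[0];
        warnings.registerException(IllegalArgumentException.class, "first");
        warnings.registerException(IllegalArgumentException.class, "second");
        return new int[] { before, created[0] };
    }

    public static void main(String[] args) {
        int[] counts = countDelegateCreations();
        System.out.println(counts[0] + " then " + counts[1]);
    }
}
```

Like the class in the diff, this sketch does no synchronization, so it assumes a single thread registers warnings on a given instance.
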
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| 
          
            
          
           | 
    @@ -7,17 +7,50 @@ | |
| 
     | 
||
| package org.elasticsearch.xpack.esql.expression.function.blockloader; | ||
| 
     | 
||
| import org.elasticsearch.index.mapper.MappedFieldType; | ||
| import org.elasticsearch.compute.data.Block; | ||
| import org.elasticsearch.core.Nullable; | ||
| import org.elasticsearch.index.mapper.blockloader.BlockLoaderFunctionConfig; | ||
| import org.elasticsearch.xpack.esql.core.expression.Expression; | ||
| import org.elasticsearch.xpack.esql.core.expression.FieldAttribute; | ||
| import org.elasticsearch.xpack.esql.stats.SearchStats; | ||
| 
     | 
||
| /** | ||
| * {@link org.elasticsearch.xpack.esql.core.expression.Expression}s that can be implemented as part of value loading implement this | ||
| * interface to provide the {@link MappedFieldType.BlockLoaderFunctionConfig} that will be used to load and | ||
| * transform the value of the field. | ||
| * {@link Expression} that can be "fused" into value loading. Most of the time | ||
| * we load values into {@link Block}s and then run the expressions on them, but | ||
| * sometimes it's worth short-circuiting this process and running the expression | ||
| * in the tight loop we use for loading: | ||
| * <ul> | ||
| * <li> | ||
| * {@code V_COSINE(vector, [constant_vector])} - vector is ~512 floats | ||
| * and V_COSINE is one double. | ||
| * </li> | ||
| * <li> | ||
| * {@code ST_CENTROID(shape)} - shapes can be quite large. Centroids | ||
| * are just one point. | ||
| * </li> | ||
| * <li> | ||
| * {@code LENGTH(string)} - strings can be quite long, but string length | ||
| * is always an int. For more fun, {@code keyword}s are usually stored | ||
| * using a dictionary, and it's <strong>fairly</strong> easy to optimize | ||
| * running {@code LENGTH} once per dictionary entry. | ||
| * </li> | ||
| * <li> | ||
| * {@code MV_COUNT(anything)} - counts are always integers. | ||
| * </li> | ||
| * </ul> | ||
| */ | ||
| public interface BlockLoaderExpression { | ||
| /** | ||
| * The field and loading configuration that replaces this expression, effectively | ||
| * "fusing" the expression into the load. Or null if the fusion isn't possible. | ||
| */ | ||
| @Nullable | ||
| Fuse tryFuse(SearchStats stats); | ||
                
       | 
||
| 
     | 
||
| /** | ||
| * Returns the configuration that will be used to load the value of the field and transform it | ||
| * Fused load configuration. | ||
| * @param field the field whose load we're fusing into | ||
| * @param config the fusion configuration | ||
| */ | ||
| MappedFieldType.BlockLoaderFunctionConfig getBlockLoaderFunctionConfig(); | ||
| record Fuse(FieldAttribute field, BlockLoaderFunctionConfig config) {} | ||
| } | ||
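
To make the `tryFuse` contract above concrete, here is a rough sketch of how an expression might implement it. Everything in it is simplified and hypothetical: `FieldAttribute`, `SearchStats`, `Named`, and `Length` are minimal stand-ins defined inside the sketch, and the doc-values check is an assumed fusion condition, not one taken from the PR.

```java
public class FuseSketch {
    /** Hypothetical, simplified stand-ins for the ES|QL types named in the diff. */
    record FieldAttribute(String name) {}

    interface BlockLoaderFunctionConfig {}

    record Named(String name) implements BlockLoaderFunctionConfig {}

    interface SearchStats {
        /** Assumed capability check; the real SearchStats API may differ. */
        boolean hasDocValues(String fieldName);
    }

    /** Same shape as the Fuse record in the diff. */
    record Fuse(FieldAttribute field, BlockLoaderFunctionConfig config) {}

    interface BlockLoaderExpression {
        /** Returns the fused load, or null if fusion isn't possible. */
        Fuse tryFuse(SearchStats stats);
    }

    /** Hypothetical LENGTH expression over a single argument. */
    record Length(Object argument) implements BlockLoaderExpression {
        @Override
        public Fuse tryFuse(SearchStats stats) {
            // Fuse only when the argument is a plain field that the stats say we can load from.
            if (argument instanceof FieldAttribute field && stats.hasDocValues(field.name())) {
                return new Fuse(field, new Named("LENGTH"));
            }
            return null;
        }
    }

    public static void main(String[] args) {
        SearchStats stats = name -> true;
        System.out.println(new Length(new FieldAttribute("kwd")).tryFuse(stats));
        System.out.println(new Length("not a field").tryFuse(stats));
    }
}
```

Returning `null` rather than throwing lets a caller fall back to the ordinary path of loading the block first and evaluating the expression afterwards.
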
Review comment:

This bothers me. I needed this because without it we'd try to push this:
to the index. Now, we might be able to do that with a specialized Lucene query, but we don't have one of those. Without this change, what happens instead is:
- `LENGTH(kwd)` becomes `$$kwd$length$hash$`.
- `$$kwd$length$hash$ < 10` is seen as pushable.

This tells us we can't push it. But it's kind of picky. If `SearchStats` took `EsField` it could check this easily enough. That might be a good solution to this.

Reply:

The `MultiTypeEsField` is created with `aggregatable=false`, so that predicates on it don't get pushed down incorrectly. Adding `pushable` should also work.
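
The guard discussed in this thread can be sketched as follows: a predicate is pushed to the index only if the field it compares is marked as eligible. This is an illustration under stated assumptions, not the PR's code. `EsField`, `LessThan`, and `isPushable` are hypothetical, heavily simplified types, and the field name `size` is invented. The `aggregatable` flag plays the role the reply describes for `MultiTypeEsField`.

```java
public class PushdownGuardSketch {
    /** Hypothetical, simplified field descriptor; the real EsField carries more state. */
    record EsField(String name, boolean aggregatable) {}

    /** Hypothetical comparison of a field against an integer literal, e.g. field < 10. */
    record LessThan(EsField field, int literal) {}

    /** In this sketch, only predicates over aggregatable fields are considered pushable. */
    static boolean isPushable(LessThan predicate) {
        return predicate.field().aggregatable();
    }

    public static void main(String[] args) {
        EsField indexed = new EsField("size", true);
        // Synthetic field standing in for a fused LENGTH(kwd); marked non-aggregatable.
        EsField fused = new EsField("$$kwd$length$hash$", false);
        System.out.println(isPushable(new LessThan(indexed, 10)));
        System.out.println(isPushable(new LessThan(fused, 10)));
    }
}
```

A dedicated `pushable` flag, as the reply suggests, would express the same intent without overloading `aggregatable`.
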