Commit c340ba5

ESQL: Make LOOKUP more left-joiny (#118889) (#119475) (#120232)
* ESQL: Compute infrastructure for LEFT JOIN (#118889)

  This adds some infrastructure that we can use to run LOOKUP JOIN using real LEFT JOIN semantics. Right now, if LOOKUP JOIN matches many rows in the `lookup` index, we merge all of the values into a multivalued field, so the number of rows emitted from LOOKUP JOIN is the same as the number of rows that come into LOOKUP JOIN. This change builds the infrastructure to emit one row per match, mostly reusing the infrastructure from ENRICH.

* ESQL: Make LOOKUP more left-joiny (#119475)

  This makes `LOOKUP` return multiple rows if there are multiple matches. This is the way SQL works, so it's *probably* what folks will expect. Even if it isn't, it allows for more optimizations. This change doesn't optimize anything itself; it just changes the behavior. But there are optimizations you can do *later* that are transparent when we have *this* behavior, but not with the old behavior.

  Example:
  ```
  - 2 | [German, German, German] | [Austria, Germany, Switzerland]
  + 2 | German | [Austria, Germany]
  + 2 | German | Switzerland
  + 2 | German | null
  ```

  Relates: #118781
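The difference between the old "merge every match into one multivalued field" behavior and LEFT JOIN semantics can be sketched with plain Java collections. This is only an illustration under stated assumptions: `LeftJoinSketch`, `mergeMatches`, and `leftJoin` are hypothetical names, the lookup index is modeled as a `Map`, and none of this is the ES|QL compute code touched by the commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the Elasticsearch code base.
final class LeftJoinSketch {
    /** Old behavior: one output row per input row, with every match merged into a single multivalued cell. */
    static List<List<Object>> mergeMatches(List<Integer> keys, Map<Integer, List<String>> lookup) {
        List<List<Object>> out = new ArrayList<>();
        for (Integer key : keys) {
            // A missing key yields a null cell; several matches stay together as one list.
            out.add(Arrays.<Object>asList(key, lookup.get(key)));
        }
        return out;
    }

    /** LEFT JOIN behavior: one output row per match, or one null-padded row when nothing matches. */
    static List<List<Object>> leftJoin(List<Integer> keys, Map<Integer, List<String>> lookup) {
        List<List<Object>> out = new ArrayList<>();
        for (Integer key : keys) {
            List<String> matches = lookup.get(key);
            if (matches == null || matches.isEmpty()) {
                // No match: keep the left row and pad the right side with null.
                out.add(Arrays.<Object>asList(key, null));
                continue;
            }
            for (String match : matches) {
                // One emitted row per matching lookup row.
                out.add(Arrays.<Object>asList(key, match));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> lookup = Map.of(2, List.of("Austria", "Germany", "Switzerland"));
        List<Integer> keys = List.of(1, 2);
        System.out.println(mergeMatches(keys, lookup));
        System.out.println(leftJoin(keys, lookup));
    }
}
```

With `mergeMatches` the output always has as many rows as the input; with `leftJoin` the row count grows with the number of matches, which is the behavior change the commit message describes.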
1 parent 106bc7b commit c340ba5

42 files changed (+1732 -303 lines changed)

server/src/main/java/org/elasticsearch/action/admin/cluster/node/capabilities/NodeCapability.java

Lines changed: 5 additions & 0 deletions
@@ -41,4 +41,9 @@ public void writeTo(StreamOutput out) throws IOException {
 
         out.writeBoolean(supported);
     }
+
+    @Override
+    public String toString() {
+        return "NodeCapability{supported=" + supported + '}';
+    }
 }

x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/BooleanArrayBlock.java

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/BytesRefArrayBlock.java

Lines changed: 1 addition & 1 deletion

x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/DoubleArrayBlock.java

Lines changed: 1 addition & 1 deletion

x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/FloatArrayBlock.java

Lines changed: 1 addition & 1 deletion

x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/IntArrayBlock.java

Lines changed: 1 addition & 1 deletion

x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/LongArrayBlock.java

Lines changed: 1 addition & 1 deletion

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/Block.java

Lines changed: 37 additions & 1 deletion
@@ -212,10 +212,46 @@ default boolean mvSortedAscending() {
     /**
      * Expand multivalued fields into one row per value. Returns the same block if there aren't any multivalued
      * fields to expand. The returned block needs to be closed by the caller to release the block's resources.
-     * TODO: pass BlockFactory
      */
     Block expand();
 
+    /**
+     * Build a {@link Block} with a {@code null} inserted {@code before} each
+     * listed position.
+     * <p>
+     *     Note: {@code before} must be non-decreasing.
+     * </p>
+     */
+    default Block insertNulls(IntVector before) {
+        // TODO remove default and scatter to implementation where it can be a lot more efficient
+        int myCount = getPositionCount();
+        int beforeCount = before.getPositionCount();
+        try (Builder builder = elementType().newBlockBuilder(myCount + beforeCount, blockFactory())) {
+            int beforeP = 0;
+            int nextNull = before.getInt(beforeP);
+            for (int mainP = 0; mainP < myCount; mainP++) {
+                while (mainP == nextNull) {
+                    builder.appendNull();
+                    beforeP++;
+                    if (beforeP >= beforeCount) {
+                        builder.copyFrom(this, mainP, myCount);
+                        return builder.build();
+                    }
+                    nextNull = before.getInt(beforeP);
+                }
+                // This line right below this is the super inefficient one.
+                builder.copyFrom(this, mainP, mainP + 1);
+            }
+            assert nextNull == myCount;
+            while (beforeP < beforeCount) {
+                nextNull = before.getInt(beforeP++);
+                assert nextNull == myCount;
+                builder.appendNull();
+            }
+            return builder.build();
+        }
+    }
+
     /**
      * Builds {@link Block}s. Typically, you use one of it's direct supinterfaces like {@link IntBlock.Builder}.
      * This is {@link Releasable} and should be released after building the block or if building the block fails.
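
The contract of `insertNulls` (a `null` inserted before each listed position, with `before` non-decreasing) may be easier to follow on plain lists. The sketch below is an assumption-laden illustration, not the `Block` implementation: `InsertNullsSketch` is a hypothetical class, it uses `java.util.List` and an `int[]` in place of `Block` and `IntVector`, and it checks the bounds of `before` ahead of every read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the insertNulls contract on plain collections.
final class InsertNullsSketch {
    /**
     * Returns a copy of {@code values} with a null inserted before each position
     * listed in {@code before}. Entries equal to {@code values.size()} append a
     * null at the end. {@code before} must be non-decreasing.
     */
    static <T> List<T> insertNulls(List<T> values, int[] before) {
        List<T> out = new ArrayList<>(values.size() + before.length);
        int b = 0; // index of the next unconsumed entry in `before`
        for (int p = 0; p < values.size(); p++) {
            // A position may be listed several times: emit one null per listing.
            while (b < before.length && before[b] == p) {
                out.add(null);
                b++;
            }
            out.add(values.get(p));
        }
        // Whatever is left must point just past the last value.
        while (b < before.length) {
            if (before[b] != values.size()) {
                throw new IllegalArgumentException("before must be non-decreasing and within [0, size]");
            }
            out.add(null);
            b++;
        }
        return out;
    }
}
```

For instance, calling `insertNulls(List.of("a", "b", "c"), new int[] {0, 2, 2, 3})` yields a seven-element list with nulls at indices 0, 3, 4, and 6. The one-value-at-a-time copy in the loop mirrors the step that the diff's own comment flags as inefficient.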

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/OrdinalBytesRefBlock.java

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -246,4 +246,9 @@ public OrdinalBytesRefBlock expand() {
246246
public long ramBytesUsed() {
247247
return ordinals.ramBytesUsed() + bytes.ramBytesUsed();
248248
}
249+
250+
@Override
251+
public String toString() {
252+
return getClass().getSimpleName() + "[ordinals=" + ordinals + ", bytes=" + bytes + "]";
253+
}
249254
}

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/X-ArrayBlock.java.st

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -149,7 +149,7 @@ $endif$
149149
int valueCount = getValueCount(pos);
150150
int first = getFirstValueIndex(pos);
151151
if (valueCount == 1) {
152-
builder.append$Type$(get$Type$(getFirstValueIndex(pos)$if(BytesRef)$, scratch$endif$));
152+
builder.append$Type$(get$Type$(first$if(BytesRef)$, scratch$endif$));
153153
} else {
154154
builder.beginPositionEntry();
155155
for (int c = 0; c < valueCount; c++) {
