ESQL: Keep ordinals in conversion functions #125357
Conversation
Make the conversion functions that process `BytesRef`s into `BytesRef`s
keep the `OrdinalBytesRefVector`s when processing. Let's use `TO_LOWER`
as an example. First, the performance numbers:
```
(operation)      Score    Error  ->   Score    Error  Units
to_lower        30.662  ± 6.163  ->  30.048  ± 0.479  ns/op
to_lower_ords   30.773  ± 0.370  ->   0.025  ± 0.001  ns/op
to_upper        33.552  ± 0.529  ->  35.775  ± 1.799  ns/op
to_upper_ords   35.791  ± 0.658  ->   0.027  ± 0.001  ns/op
```
The test has 8192 positions containing alternating `foo` and `bar`.
Running `TO_LOWER` via ordinals is super duper fast: no longer
`O(positions)`, now `O(unique_values)`.
Let's paint some pictures! `OrdinalBytesRefVector` is a lookup table.
Like this:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
| FOO | 0 |
| BAR | 1 |
| BAZ | 2 |
+-------+ 1 |
| 1 |
| 0 |
+----------+
```
That lookup table is one block. When you read it you look up the
`ordinal` and match it to the `bytes`. Previously `TO_LOWER` would
process each value one at a time and make:
```
bytes
-----
foo
bar
baz
bar
bar
foo
```
So it'd run `TO_LOWER` once per `ordinal` entry and produce a plain,
non-lookup table. With this change `TO_LOWER` will now make:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
| foo | 0 |
| bar | 1 |
| baz | 2 |
+-------+ 1 |
| 1 |
| 0 |
+----------+
```
We don't even have to copy the `ordinals` - we can reuse those from the
input and just bump the reference count. That's why this goes from
`O(positions)` to `O(unique_values)`.
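Here's a minimal sketch of the new behavior using the toy type above - again an illustration, not the generated evaluator code. The conversion runs once per dictionary entry and the ordinals are shared with the input, which is the toy equivalent of bumping the reference count:
```
import java.util.List;
import java.util.function.UnaryOperator;

class ToyOrdinalEval {
    // Apply fn to the dictionary only; every position keeps its existing ordinal.
    static ToyOrdinalVector eval(ToyOrdinalVector in, UnaryOperator<String> fn) {
        List<String> converted = in.bytes.stream().map(fn).toList();
        // Reusing in.ordinals directly stands in for "bump the reference count".
        return new ToyOrdinalVector(converted, in.ordinals);
    }
}
```
So `ToyOrdinalEval.eval(v, String::toLowerCase)` does `O(unique_values)` work no matter how many positions reference each value.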
Pinging @elastic/es-analytical-engine (Team:Analytics)
Hi @nik9000, I've created a changelog YAML for you.
LGTM!
```
Map<BytesRef, Integer> dedupe = new HashMap<>();
BytesRefBlock bytesRefBlock = (BytesRefBlock) block;
try (
    IntBlock.Builder ordinals = block.blockFactory().newIntBlockBuilder(block.getPositionCount());
    BytesRefVector.Builder bytes = block.blockFactory().newBytesRefVectorBuilder(block.getPositionCount())
) {
    BytesRef scratch = new BytesRef();
    for (int p = 0; p < block.getPositionCount(); p++) {
        int first = block.getFirstValueIndex(p);
        int count = block.getValueCount(p);
        if (count == 0) {
            // No values at this position: the ordinals block gets a null.
            ordinals.appendNull();
            continue;
        }
        if (count == 1) {
            // Single value: dedupe it and append its ordinal directly.
            BytesRef v = bytesRefBlock.getBytesRef(first, scratch);
            ordinals.appendInt(dedupe(dedupe, bytes, v));
            continue;
        }
        // Multi-valued position: wrap its ordinals in a position entry.
        int end = first + count;
        ordinals.beginPositionEntry();
        for (int i = first; i < end; i++) {
            BytesRef v = bytesRefBlock.getBytesRef(i, scratch);
            ordinals.appendInt(dedupe(dedupe, bytes, v));
        }
        ordinals.endPositionEntry();
    }
    blocks[b] = new OrdinalBytesRefBlock(ordinals.build(), bytes.build());
    bytesRefBlock.decRef();
}
}
```
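The `dedupe(dedupe, bytes, v)` helper isn't part of this hunk. A plausible sketch of what it does, assuming it hands out the next free ordinal for unseen values and appends them to the dictionary builder:
```
// Hypothetical helper matching the call sites above: returns the ordinal for v,
// appending v to the dictionary builder the first time it is seen.
static int dedupe(Map<BytesRef, Integer> dedupe, BytesRefVector.Builder bytes, BytesRef v) {
    Integer ord = dedupe.get(v);
    if (ord == null) {
        ord = dedupe.size();
        dedupe.put(BytesRef.deepCopyOf(v), ord); // copy, since v may point at shared scratch bytes
        bytes.appendBytesRef(v);
    }
    return ord;
}
```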
I wonder if we should move this to BlockUtils or BlockTestUtils preemptively? A BlockUtils.toOrdinals(BytesRefBlock). Or to BytesRefBlock.
Maybe not now, but I'm not sure we'll remember that this code is here if we need it again in the future.
BlockTestUtils seems like a good place for it.
```
) {
    this.declarationType = (TypeElement) processFunction.getEnclosingElement();
    this.processFunction = new EvaluatorImplementer.ProcessFunction(types, processFunction, warnExceptions);
    this.canProcessOrdinals = warnExceptions.isEmpty()
```
About this warnExceptions.isEmpty() check: I would guess that creating a new IntBlock to insert the nulls, instead of reusing the existing ordinals, would still be better than what we do now? Maybe for a continuation?
Right - that's basically what we'd do. It's a little tricky to know where to put it, because you'd process the bytes one by one without knowing which positions link to them. I think we'd want a separate path if there are warnExceptions.
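For what it's worth, here's a rough sketch of that continuation idea using the toy types from the description above - an assumption about one possible shape, not a plan: convert the dictionary once, remember which entries failed, then build fresh per-position ordinals with nulls wherever the referenced entry failed.
```
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class ToyOrdinalEvalWithWarnings {
    // -1 stands in for a null position in this sketch.
    static ToyOrdinalVector evalOrNull(ToyOrdinalVector in, UnaryOperator<String> fn) {
        List<String> converted = new ArrayList<>();
        boolean[] failed = new boolean[in.bytes.size()];
        for (int ord = 0; ord < in.bytes.size(); ord++) {
            try {
                converted.add(fn.apply(in.bytes.get(ord)));
            } catch (RuntimeException e) {
                converted.add(null); // real code would also record a warning here
                failed[ord] = true;
            }
        }
        // A new ordinals array instead of the reused one: nulls (-1) go wherever
        // the referenced dictionary entry failed.
        int[] ordinals = new int[in.ordinals.length];
        for (int p = 0; p < in.ordinals.length; p++) {
            ordinals[p] = failed[in.ordinals[p]] ? -1 : in.ordinals[p];
        }
        return new ToyOrdinalVector(converted, ordinals);
    }
}
```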