Skip to content

Conversation

@nik9000
Copy link
Member

@nik9000 nik9000 commented Mar 21, 2025

Make the conversion functions that process BytesRefs into BytesRefs keep the OrdinalBytesRefVectors when processing. Let's use TO_LOWER as an example. First, the performance numbers:

  (operation)  Mode   Score   Error ->  Score    Error Units
     to_lower  30.662 ± 6.163 -> 30.048 ±  0.479 ns/op
to_lower_ords  30.773 ± 0.370 ->  0.025 ±  0.001 ns/op
     to_upper  33.552 ± 0.529 -> 35.775 ±  1.799 ns/op
to_upper_ords  35.791 ± 0.658 ->  0.027 ±  0.001 ns/op

The test has a 8192 positions containing alternating foo and bar. Running TO_LOWER via ordinals is super duper faster. No longer O(positions) and now O(unique_values).

Let's paint some pictures! OrdinalBytesRefVector is a lookup table. Like this:

+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  FOO  | 0        |
|  BAR  | 1        |
|  BAZ  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+

That lookup table is one block. When you read it you look up the ordinal and match it to the bytes. Previously TO_LOWER would process each value one at a time and make:

bytes
-----
 foo
 bar
 baz
 bar
 bar
 foo

So it'd run TO_LOWER once per ordinal and it'd make an ordinal non-lookup table. With this change TO_LOWER will now make:

+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  foo  | 0        |
|  bar  | 1        |
|  baz  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+

We don't even have to copy the ordinals - we can reuse those from the input and just bump the reference count. That's why this goes from O(positions) to O(unique_values).

Make the conversion functions that process `BytesRef`s into `BytesRefs`
keep the `OrdinalBytesRefVector`s when processing. Let's use `TO_LOWER`
as an example. First, the performance numbers:
```
  (operation)  Mode   Score   Error ->  Score    Error Units
     to_lower  30.662 ± 6.163 -> 30.048 ±  0.479 ns/op
to_lower_ords  30.773 ± 0.370 ->  0.025 ±  0.001 ns/op
     to_upper  33.552 ± 0.529 -> 35.775 ±  1.799 ns/op
to_upper_ords  35.791 ± 0.658 ->  0.027 ±  0.001 ns/op
```
The test has a 8192 positions containing alternating `foo` and `bar`.
Running `TO_LOWER` via ordinals is super duper faster. No longer
`O(positions)` and now `O(unique_values)`.

Let's paint some pictures! `OrdinalBytesRefVector` is a lookup table.
Like this:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  FOO  | 0        |
|  BAR  | 1        |
|  BAZ  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```

That lookup table is one block. When you read it you look up the
`ordinal` and match it to the `bytes`. Previously `TO_LOWER` would
process each value one at a time and make:
```
bytes
-----
 foo
 bar
 baz
 bar
 bar
 foo
```

So it'd run `TO_LOWER` once per `ordinal` and it'd make an ordinal
non-lookup table. With this change `TO_LOWER` will now make:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  foo  | 0        |
|  bar  | 1        |
|  baz  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```
We don't even have to copy the `ordinals` - we can reuse those from the
input and just bump the reference count. That's why this goes from
`O(positions)` to `O(unique_values)`.
@nik9000 nik9000 requested a review from ivancea March 21, 2025 01:14
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 21, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

Copy link
Contributor

@ivancea ivancea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines 589 to 619
Map<BytesRef, Integer> dedupe = new HashMap<>();
BytesRefBlock bytesRefBlock = (BytesRefBlock) block;
try (
IntBlock.Builder ordinals = block.blockFactory().newIntBlockBuilder(block.getPositionCount());
BytesRefVector.Builder bytes = block.blockFactory().newBytesRefVectorBuilder(block.getPositionCount())
) {
BytesRef scratch = new BytesRef();
for (int p = 0; p < block.getPositionCount(); p++) {
int first = block.getFirstValueIndex(p);
int count = block.getValueCount(p);
if (count == 0) {
ordinals.appendNull();
continue;
}
if (count == 1) {
BytesRef v = bytesRefBlock.getBytesRef(first, scratch);
ordinals.appendInt(dedupe(dedupe, bytes, v));
continue;
}
int end = first + count;
ordinals.beginPositionEntry();
for (int i = first; i < end; i++) {
BytesRef v = bytesRefBlock.getBytesRef(i, scratch);
ordinals.appendInt(dedupe(dedupe, bytes, v));
}
ordinals.endPositionEntry();
}
blocks[b] = new OrdinalBytesRefBlock(ordinals.build(), bytes.build());
bytesRefBlock.decRef();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should move this to BlockUtils or BlockTestUtils preemptively? A BlockUtils.toOrdinals(BytesRefBlock). Or to BytesRefBlock.
Maybe not now, but I'm not sure if we'll remember that this code is here in the future, if we need this again

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BlockTestUtils seems like a good place for it.

) {
this.declarationType = (TypeElement) processFunction.getEnclosingElement();
this.processFunction = new EvaluatorImplementer.ProcessFunction(types, processFunction, warnExceptions);
this.canProcessOrdinals = warnExceptions.isEmpty()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About this warnExceptions.isEmpty() check, I would guess that creating a new IntBlock to insert the nulls instead of reusing the existing ordinals, would still be better than what we do now? Maybe for a continuation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - that's basically what we'd do. It's a little tricky to know where to put it. Because you'd process the bytes one by one and not know which one links to them. I think we'd want a separate path if there are warnExceptions.

@nik9000 nik9000 enabled auto-merge (squash) March 21, 2025 15:48
@nik9000 nik9000 merged commit c5e7684 into elastic:main Mar 21, 2025
16 of 17 checks passed
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
Make the conversion functions that process `BytesRef`s into `BytesRefs`
keep the `OrdinalBytesRefVector`s when processing. Let's use `TO_LOWER`
as an example. First, the performance numbers:
```
  (operation)  Mode   Score   Error ->  Score    Error Units
     to_lower  30.662 ± 6.163 -> 30.048 ±  0.479 ns/op
to_lower_ords  30.773 ± 0.370 ->  0.025 ±  0.001 ns/op
     to_upper  33.552 ± 0.529 -> 35.775 ±  1.799 ns/op
to_upper_ords  35.791 ± 0.658 ->  0.027 ±  0.001 ns/op
```
The test has a 8192 positions containing alternating `foo` and `bar`.
Running `TO_LOWER` via ordinals is super duper faster. No longer
`O(positions)` and now `O(unique_values)`.

Let's paint some pictures! `OrdinalBytesRefVector` is a lookup table.
Like this:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  FOO  | 0        |
|  BAR  | 1        |
|  BAZ  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```

That lookup table is one block. When you read it you look up the
`ordinal` and match it to the `bytes`. Previously `TO_LOWER` would
process each value one at a time and make:
```
bytes
-----
 foo
 bar
 baz
 bar
 bar
 foo
```

So it'd run `TO_LOWER` once per `ordinal` and it'd make an ordinal
non-lookup table. With this change `TO_LOWER` will now make:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  foo  | 0        |
|  bar  | 1        |
|  baz  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```
We don't even have to copy the `ordinals` - we can reuse those from the
input and just bump the reference count. That's why this goes from
`O(positions)` to `O(unique_values)`.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request May 8, 2025
Make the conversion functions that process `BytesRef`s into `BytesRefs`
keep the `OrdinalBytesRefVector`s when processing. Let's use `TO_LOWER`
as an example. First, the performance numbers:
```
  (operation)  Mode   Score   Error ->  Score    Error Units
     to_lower  30.662 ± 6.163 -> 30.048 ±  0.479 ns/op
to_lower_ords  30.773 ± 0.370 ->  0.025 ±  0.001 ns/op
     to_upper  33.552 ± 0.529 -> 35.775 ±  1.799 ns/op
to_upper_ords  35.791 ± 0.658 ->  0.027 ±  0.001 ns/op
```
The test has a 8192 positions containing alternating `foo` and `bar`.
Running `TO_LOWER` via ordinals is super duper faster. No longer
`O(positions)` and now `O(unique_values)`.

Let's paint some pictures! `OrdinalBytesRefVector` is a lookup table.
Like this:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  FOO  | 0        |
|  BAR  | 1        |
|  BAZ  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```

That lookup table is one block. When you read it you look up the
`ordinal` and match it to the `bytes`. Previously `TO_LOWER` would
process each value one at a time and make:
```
bytes
-----
 foo
 bar
 baz
 bar
 bar
 foo
```

So it'd run `TO_LOWER` once per `ordinal` and it'd make an ordinal
non-lookup table. With this change `TO_LOWER` will now make:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  foo  | 0        |
|  bar  | 1        |
|  baz  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```
We don't even have to copy the `ordinals` - we can reuse those from the
input and just bump the reference count. That's why this goes from
`O(positions)` to `O(unique_values)`.
elasticsearchmachine pushed a commit that referenced this pull request May 8, 2025
Make the conversion functions that process `BytesRef`s into `BytesRefs`
keep the `OrdinalBytesRefVector`s when processing. Let's use `TO_LOWER`
as an example. First, the performance numbers:
```
  (operation)  Mode   Score   Error ->  Score    Error Units
     to_lower  30.662 ± 6.163 -> 30.048 ±  0.479 ns/op
to_lower_ords  30.773 ± 0.370 ->  0.025 ±  0.001 ns/op
     to_upper  33.552 ± 0.529 -> 35.775 ±  1.799 ns/op
to_upper_ords  35.791 ± 0.658 ->  0.027 ±  0.001 ns/op
```
The test has a 8192 positions containing alternating `foo` and `bar`.
Running `TO_LOWER` via ordinals is super duper faster. No longer
`O(positions)` and now `O(unique_values)`.

Let's paint some pictures! `OrdinalBytesRefVector` is a lookup table.
Like this:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  FOO  | 0        |
|  BAR  | 1        |
|  BAZ  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```

That lookup table is one block. When you read it you look up the
`ordinal` and match it to the `bytes`. Previously `TO_LOWER` would
process each value one at a time and make:
```
bytes
-----
 foo
 bar
 baz
 bar
 bar
 foo
```

So it'd run `TO_LOWER` once per `ordinal` and it'd make an ordinal
non-lookup table. With this change `TO_LOWER` will now make:
```
+-------+----------+
| bytes | ordinals |
| ----- | -------- |
|  foo  | 0        |
|  bar  | 1        |
|  baz  | 2        |
+-------+ 1        |
        | 1        |
        | 0        |
        +----------+
```
We don't even have to copy the `ordinals` - we can reuse those from the
input and just bump the reference count. That's why this goes from
`O(positions)` to `O(unique_values)`.
@nik9000 nik9000 added the v8.19.0 label May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants