@nik9000 (Member) commented Jan 23, 2025

This expands the heap attack tests for LOOKUP. Now there are three flavors:

  1. LOOKUP a single geo_point - about 30 bytes or so.
  2. LOOKUP a 1mb string.
  3. LOOKUP no fields - just JOIN to alter cardinality.

Fetching a geo_point is fine with up to about 500 repeated docs. Past that it circuit breaks, which works out to about 256mb of buffered results. That's sensible on our 512mb heap and likely to work ok for most folks. We'll flip to a streaming method eventually and this won't be a problem any more. But for now, we buffer.
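To make the arithmetic behind that figure easier to reason about, here is a rough sketch of the estimate. It assumes the buffered payload is simply incoming rows times matches per row times bytes per joined value, and it ignores per-block and per-object overhead. The class name, method names, and the row count in `main` are hypothetical and not taken from the tests.

```java
/** Back-of-the-envelope estimate of what a buffered LOOKUP holds in memory. */
final class LookupBufferEstimate {
    /**
     * Estimated payload bytes: every incoming row is repeated once per match,
     * and each repeat carries one joined value. Overhead is ignored (assumption).
     */
    static long bufferedBytes(long incomingRows, long matchesPerRow, long bytesPerValue) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(Math.multiplyExact(incomingRows, matchesPerRow), bytesPerValue);
    }

    /** True if the estimate stays within the given budget. */
    static boolean fits(long estimatedBytes, long budgetBytes) {
        return estimatedBytes <= budgetBytes;
    }

    public static void main(String[] args) {
        long budget = 256L * 1024 * 1024; // the ~256mb figure from the description
        long incomingRows = 10_000;       // hypothetical row count, not from the PR
        long estimate = bufferedBytes(incomingRows, 500, 30); // 500 matches, ~30 byte geo_point
        System.out.println(estimate + " bytes, fits=" + fits(estimate, budget));
    }
}
```

The same function shows why the 1mb flavor is so much harder: swapping 30 bytes for 1mb per value multiplies the estimate by roughly 35,000.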

The no-lookup-fields flavor is fine with about 7500 matches per incoming row. That's quite a lot, really.

The 1mb string is trouble! We circuit break properly, which is great and safe, but if you join 1mb worth of columns in LOOKUP you are going to need a bigger heap than our test has. Again, we'll move from buffering these results to streaming them and it'll work better, but for now we buffer.
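As an illustration of what "buffer, but circuit break before running out of heap" means, here is a toy sketch. It is not Elasticsearch's circuit breaker API: `ToyBreaker` and `BufferingJoinSketch` are hypothetical names, and the limit and sizes are made up. The point is only that bytes are reserved before each buffered copy, so a large join fails with a clear error rather than an out-of-memory crash.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy byte-accounting breaker; NOT Elasticsearch's real circuit breaker. */
final class ToyBreaker {
    private final long limitBytes;
    private long usedBytes;

    ToyBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserve bytes before allocating; refuse if the limit would be exceeded. */
    void reserve(long bytes) {
        if (usedBytes + bytes > limitBytes) {
            throw new IllegalStateException(
                "would use " + (usedBytes + bytes) + " bytes, limit is " + limitBytes);
        }
        usedBytes += bytes;
    }

    void release(long bytes) {
        usedBytes -= bytes;
    }

    long used() {
        return usedBytes;
    }
}

final class BufferingJoinSketch {
    /**
     * Buffers one copy of the joined value per match, reserving before each copy.
     * Returns how many copies were buffered; throws if the breaker trips.
     */
    static int bufferMatches(ToyBreaker breaker, int matches, int valueBytes) {
        List<byte[]> buffered = new ArrayList<>();
        try {
            for (int i = 0; i < matches; i++) {
                breaker.reserve(valueBytes);
                buffered.add(new byte[valueBytes]);
            }
            return buffered.size();
        } finally {
            // Give back whatever was reserved, whether we finished or tripped.
            breaker.release((long) buffered.size() * valueBytes);
        }
    }

    public static void main(String[] args) {
        ToyBreaker breaker = new ToyBreaker(10_000); // made-up limit
        System.out.println(bufferMatches(breaker, 5, 1_000));
        try {
            bufferMatches(breaker, 20, 1_000);
        } catch (IllegalStateException e) {
            System.out.println("tripped: " + e.getMessage());
        }
    }
}
```

A streaming approach would hand each result on as it is produced instead of holding every copy in `buffered`, which is why it would need far less heap.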

@nik9000 requested a review from ivancea January 23, 2025 20:21
@elasticsearchmachine added the Team:Analytics label Jan 23, 2025
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-analytical-engine (Team:Analytics)

@nik9000 added the >test label Jan 29, 2025
@ivancea (Contributor) left a comment

LGTM

@nik9000 added the auto-backport, v9.0.0, and v8.18.0 labels Jan 31, 2025
@nik9000 enabled auto-merge (squash) January 31, 2025 17:05
@nik9000 (Member, Author) commented Jan 31, 2025

Thanks @ivancea!

@nik9000 merged commit d9da7c9 into elastic:main Jan 31, 2025
15 of 17 checks passed
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 31, 2025
* ESQL: Expand HeapAttack for LOOKUP

* updates
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 31, 2025
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 31, 2025
@elasticsearchmachine (Collaborator)

💚 Backport successful

Branches: 9.0, 8.18, 8.x

elasticsearchmachine pushed a commit that referenced this pull request Jan 31, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jan 31, 2025
elasticsearchmachine pushed a commit that referenced this pull request Feb 3, 2025
Labels

:Analytics/ES|QL (AKA ESQL), auto-backport (Automatically create backport pull requests when merged), Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), >test (Issues or PRs that are addressing/adding tests), v8.18.0, v8.19.0, v9.0.0, v9.1.0

3 participants