Commit 720ecfe

dnhatn and MattAlp authored
Dynamically grow hash in linear counting in HLL (elastic#142047) (elastic#142347)
This change dynamically grows the hash inside each linear counting instance independently to address high memory usage in count_distinct, especially in cases with a large number of groups (1M+) that each contain a small number of distinct values (<3000). Instead of eagerly allocating 16K for each grouping, this dynamic approach serves as an immediate optimization. The long-term plan, however, is to create a new, separate HLLState for ES|QL that is more friendly to Block/Vector and ES|QL aggregations.

Closes elastic#41847
Relates elastic#142333

---------

Co-authored-by: Matt <matthew.alp@elastic.co>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
(cherry picked from commit a0b6bb6)
Co-authored-by: Matt <matthew.alp@elastic.co>
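To make the idea in the commit message concrete, here is a minimal sketch of an open-addressing hash set that starts small and doubles on demand instead of being allocated at full size up front. It is not the code from this commit: the class name `GrowingIntHashSketch`, the initial capacity of 16 slots, the 75% load factor, and the mixing function are all assumptions chosen for illustration.

```java
// Hypothetical sketch only; not the Elasticsearch implementation.
final class GrowingIntHashSketch {
    private static final int INITIAL_CAPACITY = 16; // assumed starting size (power of two)
    private static final int EMPTY_SLOT = 0;        // 0 marks an unused slot

    private int[] table = new int[INITIAL_CAPACITY];
    private int size;        // number of distinct non-zero values stored
    private boolean hasZero; // the value 0 is tracked separately from the table

    /** Adds a value and returns true if it was not already present. */
    boolean add(int value) {
        if (value == EMPTY_SLOT) {
            boolean added = hasZero == false;
            hasZero = true;
            return added;
        }
        // Grow before inserting if one more entry would exceed 75% occupancy (assumed load factor).
        if ((size + 1) * 4L > table.length * 3L) {
            grow();
        }
        if (insert(table, value)) {
            size++;
            return true;
        }
        return false;
    }

    /** Number of distinct values added so far. */
    int cardinality() {
        return size + (hasZero ? 1 : 0);
    }

    /** Current number of slots; this is what grows over time. */
    int capacity() {
        return table.length;
    }

    // Doubles the table and re-inserts every stored value.
    private void grow() {
        int[] bigger = new int[table.length * 2];
        for (int stored : table) {
            if (stored != EMPTY_SLOT) {
                insert(bigger, stored);
            }
        }
        table = bigger;
    }

    // Linear probing; returns false if the value is already in the table.
    private static boolean insert(int[] slots, int value) {
        int mask = slots.length - 1;
        int index = mix(value) & mask;
        while (slots[index] != EMPTY_SLOT) {
            if (slots[index] == value) {
                return false;
            }
            index = (index + 1) & mask;
        }
        slots[index] = value;
        return true;
    }

    // Simple multiplicative scrambling so consecutive values spread across slots.
    private static int mix(int value) {
        int h = value * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        GrowingIntHashSketch set = new GrowingIntHashSketch();
        for (int i = 1; i <= 1000; i++) {
            set.add(i);
        }
        System.out.println("cardinality=" + set.cardinality() + " capacity=" + set.capacity());
    }
}
```

With many groups that each hold only a few distinct values, a structure like this keeps each group's table close to the size it actually needs, which is the memory effect the commit message describes.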
1 parent 2347778 commit 720ecfe

22 files changed: +415 -216 lines changed

docs/changelog/142047.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+area: ES|QL
+issues:
+ - 41847
+pr: 142047
+summary: Dynamically grow hash in linear counting in HLL
+type: bug

server/src/main/java/org/elasticsearch/search/aggregations/metrics/AbstractLinearCounting.java

Lines changed: 19 additions & 0 deletions
@@ -96,5 +96,24 @@ public interface HashesIterator {
          * @return the current value of the counter.
          */
         int value();
+
+        HashesIterator EMPTY = new EmptyIterator();
+    }
+
+    private static class EmptyIterator implements HashesIterator {
+        @Override
+        public int size() {
+            return 0;
+        }
+
+        @Override
+        public boolean next() {
+            return false;
+        }
+
+        @Override
+        public int value() {
+            throw new IllegalStateException("empty iterator");
+        }
     }
 }
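The hunk above adds a shared `EMPTY` iterator. A rough sketch of how a caller might benefit is shown below. The three-method interface mirrors the diff, but the enclosing class `EmptyIteratorSketch` and the helpers `countValues` and `of` are hypothetical stand-ins, not part of `AbstractLinearCounting`.

```java
// Stand-in types for illustration; only size/next/value and EMPTY come from the diff.
final class EmptyIteratorSketch {
    interface HashesIterator {
        int size();

        boolean next();

        int value();

        HashesIterator EMPTY = new EmptyIterator();
    }

    private static class EmptyIterator implements HashesIterator {
        @Override
        public int size() {
            return 0;
        }

        @Override
        public boolean next() {
            return false;
        }

        @Override
        public int value() {
            throw new IllegalStateException("empty iterator");
        }
    }

    // Hypothetical array-backed iterator, used here only to contrast with EMPTY.
    static HashesIterator of(int... values) {
        return new HashesIterator() {
            private int position = -1;

            @Override
            public int size() {
                return values.length;
            }

            @Override
            public boolean next() {
                return ++position < values.length;
            }

            @Override
            public int value() {
                return values[position];
            }
        };
    }

    // Hypothetical consumer: the same loop handles empty and non-empty iterators.
    static int countValues(HashesIterator iterator) {
        int count = 0;
        while (iterator.next()) {
            iterator.value();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countValues(HashesIterator.EMPTY));
        System.out.println(countValues(of(7, 11, 13)));
    }
}
```

Because `next()` on the empty iterator always returns `false`, a well-behaved consumer never reaches `value()`, so the `IllegalStateException` only surfaces when the iteration protocol is misused.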
