---
title: 'Memory limit exceeded for query'
description: 'Troubleshooting memory limit exceeded errors for a query'
date: 2025-07-25
tags: ['Errors and Exceptions']
keywords: ['OOM', 'memory limit exceeded']
---

{frontMatter.description}
{/* truncate */}

import Image from '@theme/IdealImage';
import joins from '@site/static/images/knowledgebase/memory-limit-exceeded-for-query.png';

## Memory limit exceeded for query {#troubleshooting-out-of-memory-issues}

To a new user, ClickHouse can often seem like magic - every query is super fast,
even on the largest datasets and most ambitious queries. Invariably, though,
real-world usage tests even the limits of ClickHouse. Queries can exceed memory
for a number of reasons. Most commonly, we see large joins or
aggregations on high-cardinality fields. If performance is critical and these
queries are required, we often recommend users simply scale up - something
ClickHouse Cloud does automatically and effortlessly to ensure your queries
remain responsive. We appreciate, however, that in self-managed scenarios
this is sometimes not trivial, and perhaps optimal performance is not even required.
Users in this case have a few options.

### Aggregations {#aggregations}

For memory-intensive aggregation or sorting scenarios, users can use the settings
[`max_bytes_before_external_group_by`](/operations/settings/settings#max_bytes_before_external_group_by)
and [`max_bytes_before_external_sort`](/operations/settings/settings#max_bytes_before_external_sort) respectively.
The former is discussed extensively [here](/sql-reference/statements/select/group-by/#group-by-in-external-memory).

In summary, this ensures any aggregation can "spill" out to disk if a memory
threshold is exceeded. This will inevitably impact query performance but helps
ensure queries do not OOM. The latter, sorting setting helps address similar
issues with memory-intensive sorts. This can be particularly important in
distributed environments, where a coordinating node receives sorted responses
from child shards. In this case, the coordinating server can be asked to sort a
dataset larger than its available memory. With [`max_bytes_before_external_sort`](/operations/settings/settings#max_bytes_before_external_sort),
sorting can be allowed to spill over to disk. This setting is also helpful
where the user has an `ORDER BY` after a `GROUP BY` with a `LIMIT`,
especially when the query is distributed.

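As an illustrative sketch, these settings can be applied per query; the table, columns, and 10 GB threshold below are hypothetical and should be tuned to your workload:

```sql
-- Hypothetical example: allow a large aggregation to spill to disk
-- once its in-memory state exceeds ~10 GB.
SELECT
    user_id,
    count() AS events
FROM events
GROUP BY user_id
SETTINGS max_bytes_before_external_group_by = 10000000000;

-- Similarly, allow a memory-intensive sort to spill to disk:
SELECT *
FROM events
ORDER BY timestamp
SETTINGS max_bytes_before_external_sort = 10000000000;
```

Both settings can also be set at the session or profile level if spilling should apply to all queries rather than individual ones.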
### Joins {#joins}

For joins, users can select different `JOIN` algorithms, which can help
lower the required memory. By default, joins use the hash join, which is
the most feature-complete and often offers the best performance.
This algorithm loads the right-hand table of the `JOIN` into an in-memory hash
table, against which the left-hand table is then evaluated. To minimize memory,
users should thus place the smaller table on the right side. This approach still
has limitations in memory-bound cases, however. In these cases, the `partial_merge`
join can be enabled via the [`join_algorithm`](/operations/settings/settings#join_algorithm)
setting. This derivative of the [sort-merge algorithm](https://en.wikipedia.org/wiki/Sort-merge_join)
first sorts the right table into blocks and creates a min-max index for them.
It then sorts parts of the left table by the join key and joins them over the
right table. The min-max index is used to skip unneeded right-table blocks.
This is less memory-intensive, at the expense of performance. Taking this concept
further, the `full_sorting_merge` algorithm allows a `JOIN` to be performed when
the right-hand side is very large, does not fit into memory, and lookups against
it are impossible, e.g. because it is a complex subquery. In this case, both the
right and left sides are sorted on disk if they do not fit in memory, allowing
large tables to be joined.

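As a sketch, the algorithm can be selected per query; the `users` and `purchases` tables here are hypothetical:

```sql
-- With the default hash join, the right-hand table (purchases) is
-- loaded into an in-memory hash table, so place the smaller table
-- on the right. If that still exceeds memory, switch algorithms:
SELECT u.name, p.amount
FROM users AS u
INNER JOIN purchases AS p ON u.id = p.user_id
SETTINGS join_algorithm = 'partial_merge';

-- For a right-hand side too large for in-memory lookups,
-- use SETTINGS join_algorithm = 'full_sorting_merge' instead.
```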
<Image img={joins} size="md" alt="Joins algorithms"/>

Since 20.3, ClickHouse has supported an `auto` value for the `join_algorithm` setting.
This instructs ClickHouse to apply an adaptive join approach, in which the hash-join
algorithm is preferred until memory limits are violated, at which point the
`partial_merge` algorithm is attempted. Finally, concerning joins, we encourage
readers to be aware of the behavior of distributed joins and how to minimize
their memory consumption. More information can be found [here](/sql-reference/operators/in#distributed-subqueries).

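Enabling this adaptive behavior is a one-line session setting:

```sql
-- Start with a hash join and fall back to partial_merge
-- if memory limits are hit:
SET join_algorithm = 'auto';
```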