
Commit ba31126

Merge branch 'main' into styling_improvements

2 parents b578074 + 7044979

26 files changed (+360 -148 lines)

docs/cloud/manage/backups/configurable-backups.md

Lines changed: 4 additions & 0 deletions
@@ -24,6 +24,10 @@ ClickHouse Cloud allows you to configure the schedule for your backups for **Sca
 The custom schedule will override the default backup policy in ClickHouse Cloud for your given service.
 :::
 
+:::note
+In some rare scenarios, the backup scheduler will not respect the **Start Time** specified for backups. Specifically, this happens if a successful backup was triggered less than 24 hours before the currently scheduled backup, which can occur due to the retry mechanism we have in place for backups. In such instances, the scheduler will skip the backup for the current day and retry it the next day at the scheduled time.
+:::
+
 To configure the backup schedule for a service, go to the **Settings** tab in the console and click on **Change backup configuration**.
 
 <Image img={backup_settings} size="lg" alt="Configure backup settings" border/>

docs/cloud/manage/billing.md

Lines changed: 21 additions & 7 deletions
@@ -223,6 +223,10 @@ ClickHouse Cloud supports the following billing options:
 - Direct-sales annual / multi-year (through pre-paid "ClickHouse Credits", in USD, with additional payment options).
 - Through the AWS, GCP, and Azure marketplaces (either pay-as-you-go (PAYG) or commit to a contract with ClickHouse Cloud through the marketplace).
 
+:::note
+ClickHouse Cloud credits for PAYG are invoiced in \$0.01 units, allowing us to charge customers for partial ClickHouse credits based on their usage. This differs from committed spend ClickHouse credits, which are purchased in advance in whole \$1 units.
+:::
+
 ### How long is the billing cycle? {#how-long-is-the-billing-cycle}
 
 Billing follows a monthly billing cycle and the start date is tracked as the date when the ClickHouse Cloud organization was created.
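To make the rounding behavior in the note concrete, here is a small illustrative calculation in plain ClickHouse SQL. The 3.716-credit usage figure is a made-up example, and the "whole credits" interpretation for committed spend is an assumption based on the note above, not an official billing formula:

```sql
-- Hypothetical usage figure for illustration only.
SELECT
    round(3.716, 2) AS payg_credits_invoiced,   -- PAYG is billed in $0.01 units -> 3.72
    ceil(3.716)     AS committed_credits_used   -- assuming whole $1 credits must cover it -> 4
```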
@@ -454,12 +458,12 @@ This section outlines the pricing model of ClickPipes for streaming and object s
 
 #### What does the ClickPipes pricing structure look like? {#what-does-the-clickpipes-pricing-structure-look-like}
 
-It consists of two dimensions
+It consists of two dimensions:
 
-- **Compute**: Price per unit per hour
+- **Compute**: Price **per unit per hour**.
   Compute represents the cost of running the ClickPipes replica pods whether they actively ingest data or not.
   It applies to all ClickPipes types.
-- **Ingested data**: per GB pricing
+- **Ingested data**: Price **per GB**.
   The ingested data rate applies to all streaming ClickPipes
   (Kafka, Confluent, Amazon MSK, Amazon Kinesis, Redpanda, WarpStream, Azure Event Hubs)
   for the data transferred via the replica pods. The ingested data size (GB) is charged based on bytes received from the source (uncompressed or compressed).
@@ -472,17 +476,27 @@ For this reason, it uses dedicated compute replicas.
 
 #### What is the default number of replicas and their size? {#what-is-the-default-number-of-replicas-and-their-size}
 
-Each ClickPipe defaults to 1 replica that is provided with 2 GiB of RAM and 0.5 vCPU.
-This corresponds to **0.25** ClickHouse compute units (1 unit = 8 GiB RAM, 2 vCPUs).
+Each ClickPipe defaults to 1 replica that is provided with 512 MiB of RAM and 0.125 vCPU (XS).
+This corresponds to **0.0625** ClickHouse compute units (1 unit = 8 GiB RAM, 2 vCPUs).
 
 #### What are the ClickPipes public prices? {#what-are-the-clickpipes-public-prices}
 
-- Compute: \$0.20 per unit per hour (\$0.05 per replica per hour)
+- Compute: \$0.20 per unit per hour (\$0.0125 per replica per hour for the default replica size)
 - Ingested data: \$0.04 per GB
 
+The price for the Compute dimension depends on the **number** and **size** of replica(s) in a ClickPipe. The default replica size can be adjusted using vertical scaling, and each replica size is priced as follows:
+
+| Replica Size               | Compute Units | RAM     | vCPU  | Price per Hour |
+|----------------------------|---------------|---------|-------|----------------|
+| Extra Small (XS) (default) | 0.0625        | 512 MiB | 0.125 | $0.0125        |
+| Small (S)                  | 0.125         | 1 GiB   | 0.25  | $0.025         |
+| Medium (M)                 | 0.25          | 2 GiB   | 0.5   | $0.05          |
+| Large (L)                  | 0.5           | 4 GiB   | 1.0   | $0.10          |
+| Extra Large (XL)           | 1.0           | 8 GiB   | 2.0   | $0.20          |
+
 #### How does it look in an illustrative example? {#how-does-it-look-in-an-illustrative-example}
 
-The following examples assume a single replica unless explicitly mentioned.
+The following examples assume a single M-sized replica, unless explicitly mentioned.
 
 <table><thead>
 <tr>
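As a quick sanity check of the prices added above, the following illustrative calculation combines both pricing dimensions. It is runnable as plain ClickHouse SQL; the one-month, 1 TB workload is an assumed example, not a quote:

```sql
-- Illustrative only: one M-sized replica (0.25 compute units) running for a
-- 30-day month (720 hours) while ingesting 1 TB. Rates come from the table above.
SELECT
    round(0.25 * 0.20 * 720, 2) AS compute_usd,  -- 0.25 units * $0.20/unit/hour * 720 h = $36.00
    round(1000 * 0.04, 2)       AS ingest_usd,   -- 1000 GB * $0.04/GB = $40.00
    compute_usd + ingest_usd    AS total_usd     -- $76.00
```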

docs/cloud/manage/jan2025_faq/_snippets/_clickpipes_faq.md

Lines changed: 4 additions & 4 deletions
@@ -52,9 +52,9 @@ This corresponds to **0.25** ClickHouse compute units (1 unit = 8 GiB RAM, 2 vCP
 
 <summary>Can ClickPipes replicas be scaled?</summary>
 
-ClickPipes for streaming can be scaled horizontally
-by adding more replicas each with a base unit of **0.25** ClickHouse compute units.
-Vertical scaling is also available on demand for specific use cases (adding more CPU and RAM per replica).
+Yes, ClickPipes for streaming can be scaled both horizontally and vertically.
+Horizontal scaling adds more replicas to increase throughput, while vertical scaling increases the resources (CPU and RAM) allocated to each replica to handle more intensive workloads.
+This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
 
 </details>
 
@@ -142,4 +142,4 @@ The philosophy behind ClickPipes pricing is
 to cover the operating costs of the platform while offering an easy and reliable way to move data to ClickHouse Cloud.
 From that angle, our market analysis revealed that we are positioned competitively.
 
-</details>
+</details>

docs/cloud/reference/supported-regions.md

Lines changed: 2 additions & 1 deletion
@@ -55,7 +55,8 @@ import EnterprisePlanFeatureBadge from '@theme/badges/EnterprisePlanFeatureBadge
 
 **Private Region:**
 
-JapanEast
+- JapanEast
+
 :::note
 Need to deploy to a region not currently listed? [Submit a request](https://clickhouse.com/pricing?modal=open).
 :::

docs/integrations/data-ingestion/clickpipes/kafka/04_best_practices.md

Lines changed: 18 additions & 2 deletions
@@ -121,12 +121,28 @@ ClickPipes does not provide any guarantees concerning latency. If you have speci
 
 ### Scaling {#scaling}
 
-ClickPipes for Kafka is designed to scale horizontally. By default, we create a consumer group with one consumer.
-This can be changed with the scaling controls in the ClickPipe details view.
+ClickPipes for Kafka is designed to scale horizontally and vertically. By default, we create a consumer group with one consumer. This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
 
 ClickPipes provides high-availability with an availability zone distributed architecture.
 This requires scaling to at least two consumers.
 
 Regardless of the number of running consumers, fault tolerance is available by design.
 If a consumer or its underlying infrastructure fails,
 the ClickPipe will automatically restart the consumer and continue processing messages.
+
+### Benchmarks {#benchmarks}
+
+Below are some informal benchmarks for ClickPipes for Kafka that can be used to get a general idea of the baseline performance. It's important to know that many factors can impact performance, including message size, data types, and data format. Your mileage may vary, and what we show here is not a guarantee of actual performance.
+
+Benchmark details:
+
+- We used production ClickHouse Cloud services with enough resources to ensure that throughput was not bottlenecked by the insert processing on the ClickHouse side.
+- The ClickHouse Cloud service, the Kafka cluster (Confluent Cloud), and the ClickPipe were all running in the same region (`us-east-2`).
+- The ClickPipe was configured with a single L-sized replica (4 GiB of RAM and 1 vCPU).
+- The sample data included nested data with a mix of `UUID`, `String`, and `Int` datatypes. Other datatypes, such as `Float`, `Decimal`, and `DateTime`, may be less performant.
+- There was no appreciable difference in performance using compressed and uncompressed data.
+
+| Replica Size | Message Size | Data Format | Throughput |
+|--------------|--------------|-------------|------------|
+| Large (L)    | 1.6 KB       | JSON        | 63 MB/s    |
+| Large (L)    | 1.6 KB       | Avro        | 99 MB/s    |
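For intuition, the throughput figures above can be converted into approximate message rates. This back-of-the-envelope conversion assumes the quoted ~1.6 KB average message size, so the results are estimates, not measurements:

```sql
-- Rough conversion of the benchmark MB/s figures into messages per second.
SELECT
    round(63 * 1024 / 1.6) AS json_msgs_per_sec,  -- ~40320 messages/s at 63 MB/s
    round(99 * 1024 / 1.6) AS avro_msgs_per_sec   -- ~63360 messages/s at 99 MB/s
```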

docs/integrations/data-ingestion/clickpipes/kafka/05_faq.md

Lines changed: 9 additions & 0 deletions
@@ -54,6 +54,15 @@ No, the ClickPipes for Kafka is designed for reading data from Kafka topics, not
 Yes, if the brokers are part of the same quorum they can be configured together delimited with `,`.
 </details>
 
+<details>
+
+<summary>Can ClickPipes replicas be scaled?</summary>
+
+Yes, ClickPipes for streaming can be scaled both horizontally and vertically.
+Horizontal scaling adds more replicas to increase throughput, while vertical scaling increases the resources (CPU and RAM) allocated to each replica to handle more intensive workloads.
+This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
+</details>
+
 ### Upstash {#upstash}
 
 <details>

docs/integrations/data-ingestion/clickpipes/kinesis.md

Lines changed: 2 additions & 3 deletions
@@ -159,10 +159,9 @@ If you have specific low-latency requirements, please [contact us](https://click
 
 ### Scaling {#scaling}
 
-ClickPipes for Kinesis is designed to scale horizontally. By default, we create a consumer group with one consumer.
-This can be changed with the scaling controls in the ClickPipe details view.
+ClickPipes for Kinesis is designed to scale both horizontally and vertically. By default, we create a consumer group with one consumer. This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
 
-ClickPipes provides a high-availability with an availability zone distributed architecture.
+ClickPipes provides high-availability with an availability zone distributed architecture.
 This requires scaling to at least two consumers.
 
 Regardless of the number of running consumers, fault tolerance is available by design.

docs/use-cases/AI_ML/MCP/07_janai.md

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
---
slug: /use-cases/AI/MCP/janai
sidebar_label: 'Integrate Jan.ai'
title: 'Set Up ClickHouse MCP Server with Jan.ai'
pagination_prev: null
pagination_next: null
description: 'This guide explains how to set up Jan.ai with a ClickHouse MCP server.'
keywords: ['AI', 'Jan.ai', 'MCP']
show_related_blogs: true
---

import {CardHorizontal} from '@clickhouse/click-ui/bundled'
import Link from '@docusaurus/Link';
import Image from '@theme/IdealImage';

import OpenAIModels from '@site/static/images/use-cases/AI_ML/MCP/0_janai_openai.png';
import MCPServers from '@site/static/images/use-cases/AI_ML/MCP/1_janai_mcp_servers.png';
import MCPServersList from '@site/static/images/use-cases/AI_ML/MCP/2_janai_mcp_servers_list.png';
import MCPForm from '@site/static/images/use-cases/AI_ML/MCP/3_janai_add_mcp_server.png';
import MCPEnabled from '@site/static/images/use-cases/AI_ML/MCP/4_janai_toggle.png';
import MCPTool from '@site/static/images/use-cases/AI_ML/MCP/5_jani_tools.png';
import Question from '@site/static/images/use-cases/AI_ML/MCP/6_janai_question.png';
import MCPToolConfirm from '@site/static/images/use-cases/AI_ML/MCP/7_janai_tool_confirmation.png';
import ToolsCalled from '@site/static/images/use-cases/AI_ML/MCP/8_janai_tools_called.png';
import ToolsCalledExpanded from '@site/static/images/use-cases/AI_ML/MCP/9_janai_tools_called_expanded.png';
import Result from '@site/static/images/use-cases/AI_ML/MCP/10_janai_result.png';

# Using ClickHouse MCP server with Jan.ai

> This guide explains how to use the ClickHouse MCP Server with [Jan.ai](https://jan.ai/docs).

<VerticalStepper headerLevel="h2">

## Install Jan.ai {#install-janai}

Jan.ai is an open-source ChatGPT alternative that runs 100% offline.
You can download Jan.ai for [Mac](https://jan.ai/docs/desktop/mac), [Windows](https://jan.ai/docs/desktop/windows), or [Linux](https://jan.ai/docs/desktop/linux).

It's a native app, so once it's downloaded, you can launch it.

## Add an LLM to Jan.ai {#add-llm-to-janai}

We can enable models via the settings menu.

To enable OpenAI, we need to provide an API key, as shown below:

<Image img={OpenAIModels} alt="Enable OpenAI models" size="md"/>

## Enable MCP Servers {#enable-mcp-servers}

At the time of writing, MCP Servers are an experimental feature in Jan.ai.
We can enable them by toggling experimental features:

<Image img={MCPServers} alt="Enable MCP servers" size="md"/>

Once that toggle is pressed, we'll see `MCP Servers` in the left menu.

## Configure ClickHouse MCP Server {#configure-clickhouse-mcp-server}

If we click on the `MCP Servers` menu, we'll see a list of MCP servers that we can connect to:

<Image img={MCPServersList} alt="MCP servers list" size="md"/>

These servers are all disabled by default, but we can enable them by clicking the toggle.

To install the ClickHouse MCP Server, we need to click on the `+` icon and then populate the form with the following:

<Image img={MCPForm} alt="Add MCP server" size="md"/>

Once we've done that, we'll need to toggle the ClickHouse MCP Server on if it's not already enabled:

<Image img={MCPEnabled} alt="Enable MCP server" size="md"/>

The ClickHouse MCP Server's tools will now be visible in the chat dialog:

<Image img={MCPTool} alt="ClickHouse MCP Server tools" size="md"/>

## Chat with the ClickHouse MCP Server in Jan.ai {#chat-to-clickhouse-mcp-server}

It's time to have a conversation about some data stored in ClickHouse!
Let's ask a question:

<Image img={Question} alt="Question" size="md"/>

Jan.ai will ask for confirmation before calling a tool:

<Image img={MCPToolConfirm} alt="Tool confirmation" size="md"/>

It will then show us the list of tool calls that were made:

<Image img={ToolsCalled} alt="Tools called" size="md"/>

If we click on the tool call, we can see the details of the call:

<Image img={ToolsCalledExpanded} alt="Tools called expanded" size="md"/>

And then underneath, we have our result:

<Image img={Result} alt="Result" size="md"/>

</VerticalStepper>
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
---
title: 'Memory limit exceeded for query'
description: 'Troubleshooting memory limit exceeded errors for a query'
date: 2025-07-25
tags: ['Errors and Exceptions']
keywords: ['OOM', 'memory limit exceeded']
---

{frontMatter.description}
{/* truncate */}

import Image from '@theme/IdealImage';
import joins from '@site/static/images/knowledgebase/memory-limit-exceeded-for-query.png';

## Memory limit exceeded for query {#troubleshooting-out-of-memory-issues}

To a new user, ClickHouse can often seem like magic - every query is super fast,
even on the largest datasets and most ambitious queries. Invariably though,
real-world usage tests even the limits of ClickHouse. Queries exceeding memory
limits can have a number of causes. Most commonly, we see large joins or
aggregations on high-cardinality fields. If performance is critical, and these
queries are required, we often recommend users simply scale up - something
ClickHouse Cloud does automatically and effortlessly to ensure your queries
remain responsive. We appreciate, however, that in self-managed scenarios,
this is sometimes not trivial, and perhaps optimal performance is not even required.
Users, in this case, have a few options.

### Aggregations {#aggregations}

For memory-intensive aggregation or sorting scenarios, users can use the settings
[`max_bytes_before_external_group_by`](/operations/settings/settings#max_bytes_before_external_group_by)
and [`max_bytes_before_external_sort`](/operations/settings/settings#max_bytes_ratio_before_external_sort) respectively.
The former is discussed extensively [here](/sql-reference/statements/select/group-by/#group-by-in-external-memory).

In summary, this ensures any aggregations can "spill" out to disk if a memory
threshold is exceeded. This will invariably impact query performance but will
help ensure queries do not OOM. The latter sorting setting helps address similar
issues with memory-intensive sorts. This can be particularly important in
distributed environments where a coordinating node receives sorted responses
from child shards. In this case, the coordinating server can be asked to sort a
dataset larger than its available memory. With [`max_bytes_before_external_sort`](/operations/settings/settings#max_bytes_ratio_before_external_sort),
sorting can be allowed to spill over to disk. This setting is also helpful for
cases where the user has an `ORDER BY` after a `GROUP BY` with a `LIMIT`,
especially in cases where the query is distributed.
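A minimal sketch of applying these settings follows; the 10 GB threshold and the `events`/`user_id` names are illustrative assumptions, not recommendations:

```sql
-- Illustrative values: allow GROUP BY and ORDER BY state to spill to disk
-- once it exceeds ~10 GB in memory, trading speed for OOM safety.
SET max_bytes_before_external_group_by = 10000000000;
SET max_bytes_before_external_sort = 10000000000;

-- 'events' and 'user_id' are hypothetical names for this example.
SELECT user_id, count() AS c
FROM events
GROUP BY user_id
ORDER BY c DESC
LIMIT 100;
```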
### Joins {#joins}

For joins, users can select different `JOIN` algorithms, which can assist in
lowering the required memory. By default, joins use the hash join, which offers
the most completeness with respect to features and often the best performance.
This algorithm loads the right-hand table of the `JOIN` into an in-memory hash
table, against which the left-hand table is then evaluated. To minimize memory,
users should thus place the smaller table on the right side. This approach still
has limitations in memory-bound cases, however. In these cases, the `partial_merge`
join can be enabled via the [`join_algorithm`](/operations/settings/settings#join_algorithm)
setting. This derivative of the [sort-merge algorithm](https://en.wikipedia.org/wiki/Sort-merge_join)
first sorts the right table into blocks and creates a min-max index for them.
It then sorts parts of the left table by the join key and joins them over the
right table. The min-max index is used to skip unneeded right table blocks.
This is less memory-intensive at the expense of performance. Taking this concept
further, the `full_sorting_merge` algorithm allows a `JOIN` to be performed when
the right-hand side is very large and doesn't fit into memory and lookups are
impossible, e.g. a complex subquery. In this case, both the right and left sides
are sorted on disk if they do not fit in memory, allowing large tables to be
joined.

<Image img={joins} size="md" alt="Joins algorithms"/>

Since 20.3, ClickHouse has supported an `auto` value for the `join_algorithm` setting.
This instructs ClickHouse to apply an adaptive join approach, where the hash-join
algorithm is preferred until memory limits are violated, at which point the
`partial_merge` algorithm is attempted. Finally, concerning joins, we encourage
readers to be aware of the behavior of distributed joins and how to minimize
their memory consumption. More information can be found [here](/sql-reference/operators/in#distributed-subqueries).
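A minimal sketch of switching join algorithms, under the assumption that the hash table for the right-hand side would not fit in memory (the `pageviews`/`users` table names are hypothetical):

```sql
-- Illustrative only: prefer a memory-frugal join over the default hash join.
SET join_algorithm = 'partial_merge';  -- or 'full_sorting_merge', or 'auto' (20.3+)

-- Keep the smaller table on the right-hand side of the JOIN.
SELECT p.url, u.country
FROM pageviews AS p
INNER JOIN users AS u ON p.user_id = u.id;
```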
plugins/floating-pages-exceptions.txt

Lines changed: 0 additions & 1 deletion
@@ -10,4 +10,3 @@
 integrations/language-clients/java/client-v1
 integrations/language-clients/java/jdbc-v1
 integrations/data-ingestion/clickpipes/postgres/maintenance.md
-operations/settings/tcp-connection-limits.md
