
Commit ba31126

Merge branch 'main' into styling_improvements

2 parents b578074 + 7044979

26 files changed (+360 -148 lines)

docs/cloud/manage/backups/configurable-backups.md

Lines changed: 4 additions & 0 deletions
@@ -24,6 +24,10 @@ ClickHouse Cloud allows you to configure the schedule for your backups for **Sca
 The custom schedule will override the default backup policy in ClickHouse Cloud for your given service.
 :::
 
+:::note
+In some rare scenarios, the backup scheduler will not respect the **Start Time** specified for backups. Specifically, this happens if a successful backup was triggered less than 24 hours before the currently scheduled backup, which can occur due to the retry mechanism we have in place for backups. In such instances, the scheduler will skip the backup for the current day and retry it the next day at the scheduled time.
+:::
+
 To configure the backup schedule for a service, go to the **Settings** tab in the console and click on **Change backup configuration**.
 
 <Image img={backup_settings} size="lg" alt="Configure backup settings" border/>

docs/cloud/manage/billing.md

Lines changed: 21 additions & 7 deletions
@@ -223,6 +223,10 @@ ClickHouse Cloud supports the following billing options:
 - Direct-sales annual / multi-year (through pre-paid "ClickHouse Credits", in USD, with additional payment options).
 - Through the AWS, GCP, and Azure marketplaces (either pay-as-you-go (PAYG) or commit to a contract with ClickHouse Cloud through the marketplace).
 
+:::note
+ClickHouse Cloud credits for PAYG are invoiced in \$0.01 units, allowing us to charge customers for partial ClickHouse credits based on their usage. This differs from committed spend ClickHouse credits, which are purchased in advance in whole \$1 units.
+:::
+
 ### How long is the billing cycle? {#how-long-is-the-billing-cycle}
 
 Billing follows a monthly billing cycle and the start date is tracked as the date when the ClickHouse Cloud organization was created.
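To make the rounding behavior in the note concrete, here is a small illustrative calculation in plain ClickHouse SQL. The 3.716-credit usage figure is a made-up example, and the "whole credits" interpretation for committed spend is an assumption based on the note above, not an official billing formula:

```sql
-- Hypothetical usage figure for illustration only.
SELECT
    round(3.716, 2) AS payg_credits_invoiced,   -- PAYG is billed in $0.01 units -> 3.72
    ceil(3.716)     AS committed_credits_used   -- assuming whole $1 credits must cover it -> 4
```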
@@ -454,12 +458,12 @@ This section outlines the pricing model of ClickPipes for streaming and object s
 
 #### What does the ClickPipes pricing structure look like? {#what-does-the-clickpipes-pricing-structure-look-like}
 
-It consists of two dimensions
+It consists of two dimensions:
 
-- **Compute**: Price per unit per hour
+- **Compute**: Price **per unit per hour**.
   Compute represents the cost of running the ClickPipes replica pods whether they actively ingest data or not.
   It applies to all ClickPipes types.
-- **Ingested data**: per GB pricing
+- **Ingested data**: Price **per GB**.
   The ingested data rate applies to all streaming ClickPipes
   (Kafka, Confluent, Amazon MSK, Amazon Kinesis, Redpanda, WarpStream, Azure Event Hubs)
   for the data transferred via the replica pods. The ingested data size (GB) is charged based on bytes received from the source (uncompressed or compressed).
@@ -472,17 +476,27 @@ For this reason, it uses dedicated compute replicas.
 
 #### What is the default number of replicas and their size? {#what-is-the-default-number-of-replicas-and-their-size}
 
-Each ClickPipe defaults to 1 replica that is provided with 2 GiB of RAM and 0.5 vCPU.
-This corresponds to **0.25** ClickHouse compute units (1 unit = 8 GiB RAM, 2 vCPUs).
+Each ClickPipe defaults to 1 replica that is provided with 512 MiB of RAM and 0.125 vCPU (XS).
+This corresponds to **0.0625** ClickHouse compute units (1 unit = 8 GiB RAM, 2 vCPUs).
 
 #### What are the ClickPipes public prices? {#what-are-the-clickpipes-public-prices}
 
-- Compute: \$0.20 per unit per hour (\$0.05 per replica per hour)
+- Compute: \$0.20 per unit per hour (\$0.0125 per replica per hour for the default replica size)
 - Ingested data: \$0.04 per GB
 
+The price for the Compute dimension depends on the **number** and **size** of replica(s) in a ClickPipe. The default replica size can be adjusted using vertical scaling, and each replica size is priced as follows:
+
+| Replica Size               | Compute Units | RAM     | vCPU  | Price per Hour |
+|----------------------------|---------------|---------|-------|----------------|
+| Extra Small (XS) (default) | 0.0625        | 512 MiB | 0.125 | $0.0125        |
+| Small (S)                  | 0.125         | 1 GiB   | 0.25  | $0.025         |
+| Medium (M)                 | 0.25          | 2 GiB   | 0.5   | $0.05          |
+| Large (L)                  | 0.5           | 4 GiB   | 1.0   | $0.10          |
+| Extra Large (XL)           | 1.0           | 8 GiB   | 2.0   | $0.20          |
+
 #### How does it look in an illustrative example? {#how-does-it-look-in-an-illustrative-example}
 
-The following examples assume a single replica unless explicitly mentioned.
+The following examples assume a single M-sized replica, unless explicitly mentioned.
 
 <table><thead>
 <tr>
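As a quick sanity check of the prices added above, the following illustrative calculation combines both pricing dimensions. It is runnable as plain ClickHouse SQL; the one-month, 1 TB workload is an assumed example, not a quote:

```sql
-- Illustrative only: one M-sized replica (0.25 compute units) running for a
-- 30-day month (720 hours) while ingesting 1 TB. Rates come from the table above.
SELECT
    round(0.25 * 0.20 * 720, 2) AS compute_usd,  -- 0.25 units * $0.20/unit/hour * 720 h = $36.00
    round(1000 * 0.04, 2)       AS ingest_usd,   -- 1000 GB * $0.04/GB = $40.00
    compute_usd + ingest_usd    AS total_usd     -- $76.00
```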

docs/cloud/manage/jan2025_faq/_snippets/_clickpipes_faq.md

Lines changed: 4 additions & 4 deletions
@@ -52,9 +52,9 @@ This corresponds to **0.25** ClickHouse compute units (1 unit = 8 GiB RAM, 2 vCP
 
 <summary>Can ClickPipes replicas be scaled?</summary>
 
-ClickPipes for streaming can be scaled horizontally
-by adding more replicas each with a base unit of **0.25** ClickHouse compute units.
-Vertical scaling is also available on demand for specific use cases (adding more CPU and RAM per replica).
+Yes, ClickPipes for streaming can be scaled both horizontally and vertically.
+Horizontal scaling adds more replicas to increase throughput, while vertical scaling increases the resources (CPU and RAM) allocated to each replica to handle more intensive workloads.
+This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
 
 </details>
 
@@ -142,4 +142,4 @@ The philosophy behind ClickPipes pricing is
 to cover the operating costs of the platform while offering an easy and reliable way to move data to ClickHouse Cloud.
 From that angle, our market analysis revealed that we are positioned competitively.
 
-</details>
+</details>

docs/cloud/reference/supported-regions.md

Lines changed: 2 additions & 1 deletion
@@ -55,7 +55,8 @@ import EnterprisePlanFeatureBadge from '@theme/badges/EnterprisePlanFeatureBadge
 
 **Private Region:**
 
-JapanEast
+- JapanEast
+
 :::note
 Need to deploy to a region not currently listed? [Submit a request](https://clickhouse.com/pricing?modal=open).
 :::

docs/integrations/data-ingestion/clickpipes/kafka/04_best_practices.md

Lines changed: 18 additions & 2 deletions
@@ -121,12 +121,28 @@ ClickPipes does not provide any guarantees concerning latency. If you have speci
 
 ### Scaling {#scaling}
 
-ClickPipes for Kafka is designed to scale horizontally. By default, we create a consumer group with one consumer.
-This can be changed with the scaling controls in the ClickPipe details view.
+ClickPipes for Kafka is designed to scale horizontally and vertically. By default, we create a consumer group with one consumer. This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
 
 ClickPipes provides high-availability with an availability zone distributed architecture.
 This requires scaling to at least two consumers.
 
 Regardless of the number of running consumers, fault tolerance is available by design.
 If a consumer or its underlying infrastructure fails,
 the ClickPipe will automatically restart the consumer and continue processing messages.
+
+### Benchmarks {#benchmarks}
+
+Below are some informal benchmarks for ClickPipes for Kafka that can be used to get a general idea of the baseline performance. It's important to know that many factors can impact performance, including message size, data types, and data format. Your mileage may vary, and what we show here is not a guarantee of actual performance.
+
+Benchmark details:
+
+- We used production ClickHouse Cloud services with enough resources to ensure that throughput was not bottlenecked by the insert processing on the ClickHouse side.
+- The ClickHouse Cloud service, the Kafka cluster (Confluent Cloud), and the ClickPipe were all running in the same region (`us-east-2`).
+- The ClickPipe was configured with a single L-sized replica (4 GiB of RAM and 1 vCPU).
+- The sample data included nested data with a mix of `UUID`, `String`, and `Int` datatypes. Other datatypes, such as `Float`, `Decimal`, and `DateTime`, may be less performant.
+- There was no appreciable difference in performance using compressed and uncompressed data.
+
+| Replica Size | Message Size | Data Format | Throughput |
+|--------------|--------------|-------------|------------|
+| Large (L)    | 1.6 KB       | JSON        | 63 MB/s    |
+| Large (L)    | 1.6 KB       | Avro        | 99 MB/s    |
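For intuition, the throughput figures above can be converted into approximate message rates. This back-of-the-envelope conversion assumes the quoted ~1.6 KB average message size, so the results are estimates, not measurements:

```sql
-- Rough conversion of the benchmark MB/s figures into messages per second.
SELECT
    round(63 * 1024 / 1.6) AS json_msgs_per_sec,  -- ~40320 messages/s at 63 MB/s
    round(99 * 1024 / 1.6) AS avro_msgs_per_sec   -- ~63360 messages/s at 99 MB/s
```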

docs/integrations/data-ingestion/clickpipes/kafka/05_faq.md

Lines changed: 9 additions & 0 deletions
@@ -54,6 +54,15 @@ No, the ClickPipes for Kafka is designed for reading data from Kafka topics, not
 Yes, if the brokers are part of the same quorum they can be configured together delimited with `,`.
 </details>
 
+<details>
+
+<summary>Can ClickPipes replicas be scaled?</summary>
+
+Yes, ClickPipes for streaming can be scaled both horizontally and vertically.
+Horizontal scaling adds more replicas to increase throughput, while vertical scaling increases the resources (CPU and RAM) allocated to each replica to handle more intensive workloads.
+This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
+</details>
+
 ### Upstash {#upstash}
 
 <details>

docs/integrations/data-ingestion/clickpipes/kinesis.md

Lines changed: 2 additions & 3 deletions
@@ -159,10 +159,9 @@ If you have specific low-latency requirements, please [contact us](https://click
 
 ### Scaling {#scaling}
 
-ClickPipes for Kinesis is designed to scale horizontally. By default, we create a consumer group with one consumer.
-This can be changed with the scaling controls in the ClickPipe details view.
+ClickPipes for Kinesis is designed to scale both horizontally and vertically. By default, we create a consumer group with one consumer. This can be configured during ClickPipe creation, or at any other point under **Settings** -> **Advanced Settings** -> **Scaling**.
 
-ClickPipes provides a high-availability with an availability zone distributed architecture.
+ClickPipes provides high-availability with an availability zone distributed architecture.
 This requires scaling to at least two consumers.
 
 Regardless of the number of running consumers, fault tolerance is available by design.

docs/use-cases/AI_ML/MCP/07_janai.md

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
---
slug: /use-cases/AI/MCP/janai
sidebar_label: 'Integrate Jan.ai'
title: 'Set Up ClickHouse MCP Server with Jan.ai'
pagination_prev: null
pagination_next: null
description: 'This guide explains how to set up Jan.ai with a ClickHouse MCP server.'
keywords: ['AI', 'Jan.ai', 'MCP']
show_related_blogs: true
---

import {CardHorizontal} from '@clickhouse/click-ui/bundled'
import Link from '@docusaurus/Link';
import Image from '@theme/IdealImage';

import OpenAIModels from '@site/static/images/use-cases/AI_ML/MCP/0_janai_openai.png';
import MCPServers from '@site/static/images/use-cases/AI_ML/MCP/1_janai_mcp_servers.png';
import MCPServersList from '@site/static/images/use-cases/AI_ML/MCP/2_janai_mcp_servers_list.png';
import MCPForm from '@site/static/images/use-cases/AI_ML/MCP/3_janai_add_mcp_server.png';
import MCPEnabled from '@site/static/images/use-cases/AI_ML/MCP/4_janai_toggle.png';
import MCPTool from '@site/static/images/use-cases/AI_ML/MCP/5_jani_tools.png';
import Question from '@site/static/images/use-cases/AI_ML/MCP/6_janai_question.png';
import MCPToolConfirm from '@site/static/images/use-cases/AI_ML/MCP/7_janai_tool_confirmation.png';
import ToolsCalled from '@site/static/images/use-cases/AI_ML/MCP/8_janai_tools_called.png';
import ToolsCalledExpanded from '@site/static/images/use-cases/AI_ML/MCP/9_janai_tools_called_expanded.png';
import Result from '@site/static/images/use-cases/AI_ML/MCP/10_janai_result.png';

# Using ClickHouse MCP server with Jan.ai

> This guide explains how to use the ClickHouse MCP Server with [Jan.ai](https://jan.ai/docs).

<VerticalStepper headerLevel="h2">

## Install Jan.ai {#install-janai}

Jan.ai is an open-source ChatGPT alternative that runs 100% offline.
You can download Jan.ai for [Mac](https://jan.ai/docs/desktop/mac), [Windows](https://jan.ai/docs/desktop/windows), or [Linux](https://jan.ai/docs/desktop/linux).

It's a native app, so once it's downloaded, you can launch it.

## Add an LLM to Jan.ai {#add-llm-to-janai}

We can enable models via the settings menu.

To enable OpenAI, we need to provide an API key, as shown below:

<Image img={OpenAIModels} alt="Enable OpenAI models" size="md"/>

## Enable MCP Servers {#enable-mcp-servers}

At the time of writing, MCP Servers are an experimental feature in Jan.ai.
We can enable them by toggling experimental features:

<Image img={MCPServers} alt="Enable MCP servers" size="md"/>

Once that toggle is pressed, we'll see `MCP Servers` in the left menu.

## Configure ClickHouse MCP Server {#configure-clickhouse-mcp-server}

If we click on the `MCP Servers` menu, we'll see a list of MCP servers that we can connect to:

<Image img={MCPServersList} alt="MCP servers list" size="md"/>

These servers are all disabled by default, but we can enable them by clicking the toggle.

To install the ClickHouse MCP Server, we need to click on the `+` icon and then populate the form with the following:

<Image img={MCPForm} alt="Add MCP server" size="md"/>

Once we've done that, we'll need to toggle the ClickHouse MCP Server on if it's not already enabled:

<Image img={MCPEnabled} alt="Enable MCP server" size="md"/>

The ClickHouse MCP Server's tools will now be visible in the chat dialog:

<Image img={MCPTool} alt="ClickHouse MCP Server tools" size="md"/>

## Chat with the ClickHouse MCP Server in Jan.ai {#chat-to-clickhouse-mcp-server}

It's time to have a conversation about some data stored in ClickHouse!
Let's ask a question:

<Image img={Question} alt="Question" size="md"/>

Jan.ai will ask for confirmation before calling a tool:

<Image img={MCPToolConfirm} alt="Tool confirmation" size="md"/>

It will then show us the list of tool calls that were made:

<Image img={ToolsCalled} alt="Tools called" size="md"/>

If we click on the tool call, we can see the details of the call:

<Image img={ToolsCalledExpanded} alt="Tools called expanded" size="md"/>

And then underneath, we have our result:

<Image img={Result} alt="Result" size="md"/>

</VerticalStepper>
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
---
title: 'Memory limit exceeded for query'
description: 'Troubleshooting memory limit exceeded errors for a query'
date: 2025-07-25
tags: ['Errors and Exceptions']
keywords: ['OOM', 'memory limit exceeded']
---

{frontMatter.description}
{/* truncate */}

import Image from '@theme/IdealImage';
import joins from '@site/static/images/knowledgebase/memory-limit-exceeded-for-query.png';

## Memory limit exceeded for query {#troubleshooting-out-of-memory-issues}

To a new user, ClickHouse can often seem like magic - every query is super fast,
even on the largest datasets and most ambitious queries. Invariably though,
real-world usage tests even the limits of ClickHouse. Queries exceeding memory
limits can have a number of causes. Most commonly, we see large joins or
aggregations on high-cardinality fields. If performance is critical, and these
queries are required, we often recommend users simply scale up - something
ClickHouse Cloud does automatically and effortlessly to ensure your queries
remain responsive. We appreciate, however, that in self-managed scenarios,
this is sometimes not trivial, and perhaps optimal performance is not even required.
Users, in this case, have a few options.

### Aggregations {#aggregations}

For memory-intensive aggregation or sorting scenarios, users can use the settings
[`max_bytes_before_external_group_by`](/operations/settings/settings#max_bytes_before_external_group_by)
and [`max_bytes_before_external_sort`](/operations/settings/settings#max_bytes_ratio_before_external_sort) respectively.
The former is discussed extensively [here](/sql-reference/statements/select/group-by/#group-by-in-external-memory).

In summary, this ensures any aggregations can "spill" out to disk if a memory
threshold is exceeded. This will invariably impact query performance but will
help ensure queries do not OOM. The latter sorting setting helps address similar
issues with memory-intensive sorts. This can be particularly important in
distributed environments where a coordinating node receives sorted responses
from child shards. In this case, the coordinating server can be asked to sort a
dataset larger than its available memory. With [`max_bytes_before_external_sort`](/operations/settings/settings#max_bytes_ratio_before_external_sort),
sorting can be allowed to spill over to disk. This setting is also helpful for
cases where the user has an `ORDER BY` after a `GROUP BY` with a `LIMIT`,
especially in cases where the query is distributed.
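A minimal sketch of applying these settings follows; the 10 GB threshold and the `events`/`user_id` names are illustrative assumptions, not recommendations:

```sql
-- Illustrative values: allow GROUP BY and ORDER BY state to spill to disk
-- once it exceeds ~10 GB in memory, trading speed for OOM safety.
SET max_bytes_before_external_group_by = 10000000000;
SET max_bytes_before_external_sort = 10000000000;

-- 'events' and 'user_id' are hypothetical names for this example.
SELECT user_id, count() AS c
FROM events
GROUP BY user_id
ORDER BY c DESC
LIMIT 100;
```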
### Joins {#joins}

For joins, users can select different `JOIN` algorithms, which can assist in
lowering the required memory. By default, joins use the hash join, which offers
the most completeness with respect to features and often the best performance.
This algorithm loads the right-hand table of the `JOIN` into an in-memory hash
table, against which the left-hand table is then evaluated. To minimize memory,
users should thus place the smaller table on the right side. This approach still
has limitations in memory-bound cases, however. In these cases, the `partial_merge`
join can be enabled via the [`join_algorithm`](/operations/settings/settings#join_algorithm)
setting. This derivative of the [sort-merge algorithm](https://en.wikipedia.org/wiki/Sort-merge_join)
first sorts the right table into blocks and creates a min-max index for them.
It then sorts parts of the left table by the join key and joins them over the
right table. The min-max index is used to skip unneeded right table blocks.
This is less memory-intensive at the expense of performance. Taking this concept
further, the `full_sorting_merge` algorithm allows a `JOIN` to be performed when
the right-hand side is very large and doesn't fit into memory and lookups are
impossible, e.g. a complex subquery. In this case, both the right and left sides
are sorted on disk if they do not fit in memory, allowing large tables to be
joined.

<Image img={joins} size="md" alt="Joins algorithms"/>

Since 20.3, ClickHouse has supported an `auto` value for the `join_algorithm` setting.
This instructs ClickHouse to apply an adaptive join approach, where the hash-join
algorithm is preferred until memory limits are violated, at which point the
`partial_merge` algorithm is attempted. Finally, concerning joins, we encourage
readers to be aware of the behavior of distributed joins and how to minimize
their memory consumption. More information can be found [here](/sql-reference/operators/in#distributed-subqueries).
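A minimal sketch of switching join algorithms, under the assumption that the hash table for the right-hand side would not fit in memory (the `pageviews`/`users` table names are hypothetical):

```sql
-- Illustrative only: prefer a memory-frugal join over the default hash join.
SET join_algorithm = 'partial_merge';  -- or 'full_sorting_merge', or 'auto' (20.3+)

-- Keep the smaller table on the right-hand side of the JOIN.
SELECT p.url, u.country
FROM pageviews AS p
INNER JOIN users AS u ON p.user_id = u.id;
```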
plugins/floating-pages-exceptions.txt

Lines changed: 0 additions & 1 deletion
@@ -10,4 +10,3 @@
 integrations/language-clients/java/client-v1
 integrations/language-clients/java/jdbc-v1
 integrations/data-ingestion/clickpipes/postgres/maintenance.md
-operations/settings/tcp-connection-limits.md
