
Commit fb83b7e

Merge branch 'main' of https://github.com/ClickHouse/clickhouse-docs into related_blog_component
2 parents: 05e2448 + 0abaddd

56 files changed: +284 / −1096 lines


README.md

Lines changed: 5 additions & 5 deletions

All screenshot paths move from `./images/` to `./static/images/contribute/`:

1. Each page in Clickhouse.com/docs has an **Edit this page** link at the top:

   ![The ClickHouse Docs website with the edit button highlighted.](./static/images/contribute/readme-edit-this-page.png)

   Click this button to edit this page in GitHub.

1. Once you're in GitHub, click the pencil icon to edit this page:

   ![README Pencil Icon](./static/images/contribute/readme-pencil-icon.png)

1. GitHub will _fork_ the repository for you. This creates a copy of the `clickhouse-docs` repository on your personal GitHub account.

1. Make your changes in the textbox. Once you're done, click **Commit changes**:

   ![README Commit Changes](./static/images/contribute/readme-commit-changes.png)

1. In the **Propose changes** popup, enter a descriptive title to explain the changes you just made. Keep this title to 10 words or less. If your changes are fairly complex and need further explanation, enter your comments into the **Extended description** field.

1. Make sure **Create a new branch** is selected, and click **Propose changes**:

   ![README Propose Changes](./static/images/contribute/readme-propose-changes.png)

1. A new page should open with a new pull request. Double-check that the title and description are accurate.

1. If you've spoken to someone on the docs team about your changes, tag them into the **Reviewers** section:

   ![README Create Pull Request](./static/images/contribute/readme-create-pull-request.png)

   If you haven't mentioned your changes to anyone yet, leave the **Reviewers** section blank.
File renamed without changes.

docs/cloud/bestpractices/usagelimits.md

Lines changed: 14 additions & 12 deletions

The flat list of guardrails becomes a table:

If you've run up against one of these guardrails, it's possible that you are implementing your use case in an unoptimized way. Contact our support team and we will gladly help you refine your use case to avoid exceeding the guardrails or look together at how we can increase them in a controlled manner.
:::

| Dimension | Limit |
|-----------|-------|
| **Databases** | 1000 |
| **Tables** | 5000 |
| **Columns** | ∼1000 (wide format is preferred to compact) |
| **Partitions** | 50k |
| **Parts** | 100k across the entire instance |
| **Part size** | 150 GB |
| **Services per organization** | 20 (soft) |
| **Services per warehouse** | 5 (soft) |
| **Low cardinality** | 10k or less |
| **Primary keys in a table** | 4-5 that sufficiently filter down the data |
| **Query concurrency** | 1000 |
| **Batch ingest** | anything > 1M rows will be split by the system into 1M-row blocks |

:::note
For Single Replica Services, the maximum number of databases is restricted to 100, and the maximum number of tables is restricted to 500. In addition, storage for Basic Tier Services is limited to 1 TB.
:::
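The batch-ingest guardrail can be pictured as a simple chunking loop. This is an illustrative Python sketch only (the actual splitting happens server-side in ClickHouse Cloud, not in client code):

```python
def split_batch(rows, block_size=1_000_000):
    """Yield successive blocks of at most `block_size` rows.

    Illustrative only: mirrors the documented behaviour that batch
    inserts larger than 1M rows are split into 1M-row blocks.
    """
    for start in range(0, len(rows), block_size):
        yield rows[start:start + block_size]

# A 2.5M-row batch is split into three blocks: 1M, 1M, and 0.5M rows.
blocks = list(split_batch(list(range(2_500_000))))
print([len(b) for b in blocks])  # -> [1000000, 1000000, 500000]
```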

docs/cloud/reference/byoc.md

Lines changed: 37 additions & 26 deletions

Two subnet screenshots are imported alongside the existing BYOC images:

import byoc_plb from '@site/static/images/cloud/reference/byoc-plb.png';
import byoc_security from '@site/static/images/cloud/reference/byoc-securitygroup.png';
import byoc_inbound from '@site/static/images/cloud/reference/byoc-inbound-rule.png';
import byoc_subnet_1 from '@site/static/images/cloud/reference/byoc-subnet-1.png';
import byoc_subnet_2 from '@site/static/images/cloud/reference/byoc-subnet-2.png';

## Overview {#overview}
Customers can initiate the onboarding process by reaching out to [us](https://clickhouse.com/cloud/bring-your-own-cloud). Customers need to have a dedicated AWS account and know the region they will use. At this time, we are allowing users to launch BYOC services only in the regions that we support for ClickHouse Cloud.

The account-preparation section is renamed and relaxed to allow shared accounts:

### Prepare an AWS Account {#prepare-an-aws-account}

We recommend preparing a dedicated AWS account for hosting the ClickHouse BYOC deployment to ensure better isolation; however, using a shared account and an existing VPC is also possible. See the details in *Setup BYOC Infrastructure* below.

With this account and the initial organization admin email, you can contact ClickHouse support.

### Apply CloudFormation Template {#apply-cloudformation-template}

After creating the CloudFormation stack, you will be prompted to set up the infrastructure:

- **The VPC CIDR range for BYOC**: By default, we use `10.0.0.0/16` for the BYOC VPC CIDR range. If you plan to use VPC peering with another account, ensure the CIDR ranges do not overlap. Allocate a proper CIDR range for BYOC, with a minimum size of `/22` to accommodate necessary workloads.
- **Availability Zones for BYOC VPC**: If you plan to use VPC peering, aligning availability zones between the source and BYOC accounts can help reduce cross-AZ traffic costs. In AWS, availability zone suffixes (`a, b, c`) may represent different physical zone IDs across accounts. See the [AWS guide](https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/use-consistent-availability-zones-in-vpcs-across-different-aws-accounts.html) for details.
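Before requesting a CIDR range, it can help to sanity-check sizing and overlap. A small sketch using Python's standard `ipaddress` module (illustrative only; not part of the BYOC tooling):

```python
import ipaddress

# Minimum recommended BYOC VPC range: a /22 provides 1024 addresses.
byoc = ipaddress.ip_network("10.0.0.0/22")
print(byoc.num_addresses)  # 1024

# Peering requires non-overlapping ranges; verify before allocating.
peer = ipaddress.ip_network("10.0.4.0/22")
print(byoc.overlaps(peer))  # False -> safe to peer

# A customer-managed subnet needs at least a /23 (512 addresses).
subnet = ipaddress.ip_network("10.0.0.0/23")
print(subnet.num_addresses)  # 512
```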
A new subsection documents bringing your own VPC:

#### Customer-managed VPC {#customer-managed-vpc}
By default, ClickHouse Cloud will provision a dedicated VPC for better isolation in your BYOC deployment. However, you can also use an existing VPC in your account. This requires specific configuration and must be coordinated through ClickHouse Support.

**Configure Your Existing VPC**
1. Allocate at least 3 private subnets across 3 different availability zones for ClickHouse Cloud to use.
2. Ensure each subnet has a minimum CIDR range of `/23` (e.g., 10.0.0.0/23) to provide sufficient IP addresses for the ClickHouse deployment.
3. Add the tag `kubernetes.io/role/internal-elb=1` to each subnet to enable proper load balancer configuration.

<br />

<Image img={byoc_subnet_1} size="lg" alt="BYOC VPC Subnet" background='black'/>

<br />

<Image img={byoc_subnet_2} size="lg" alt="BYOC VPC Subnet Tags" background='black'/>

<br />

**Contact ClickHouse Support**
Create a support ticket with the following information:

* Your AWS account ID
* The AWS region where you want to deploy the service
* Your VPC ID
* The private subnet IDs you've allocated for ClickHouse
* The availability zones these subnets are in
### Optional: Setup VPC Peering {#optional-setup-vpc-peering}

To create or delete VPC peering for ClickHouse BYOC, follow the steps:

...

#### Step 6 Edit Security Group to allow Peered VPC access {#step-6-edit-security-group-to-allow-peered-vpc-access}

The manual instructions (locating the private load balancer named like `infra-xx-xxx-ingress-private`, finding its associated security group named like `k8s-istioing-istioing-xxxxxxxxx`, and editing the inbound rules yourself) are replaced with:

In the ClickHouse BYOC account, you need to update the Security Group settings to allow traffic from your peered VPC. Please contact ClickHouse Support to request the addition of inbound rules that include the CIDR ranges of your peered VPC.

---
The ClickHouse service should now be accessible from the peered VPC.

docs/guides/best-practices/partitioningkey.md

Lines changed: 1 addition & 1 deletion

The misspelled import path `partionning_keys.md` is corrected (and the target extension updated to `.mdx`):

import Content from '@site/docs/best-practices/partitioning_keys.mdx';

<Content />

docs/guides/best-practices/sparse-primary-indexes.md

Lines changed: 1 addition & 1 deletion

The bare URL becomes a Markdown link:

### A concrete example {#a-concrete-example}

One concrete example is the plaintext paste service [https://pastila.nl](https://pastila.nl) that Alexey Milovidov developed and [blogged about](https://clickhouse.com/blog/building-a-paste-service-with-clickhouse/).

On every change to the text-area, the data is saved automatically into a ClickHouse table row (one row per change).

docs/integrations/data-ingestion/clickpipes/index.md

Lines changed: 2 additions & 1 deletion

An Azure Blob Storage logo import and a Private Beta connector row are added:

import Amazonkinesis from '@site/static/images/integrations/logos/amazon_kinesis_logo.svg';
import Gcssvg from '@site/static/images/integrations/logos/gcs.svg';
import DOsvg from '@site/static/images/integrations/logos/digitalocean.svg';
import ABSsvg from '@site/static/images/integrations/logos/azureblobstorage.svg';
import Postgressvg from '@site/static/images/integrations/logos/postgresql.svg';
import Mysqlsvg from '@site/static/images/integrations/logos/mysql.svg';
import redpanda_logo from '@site/static/images/integrations/logos/logo_redpanda.png';

| Amazon S3 | <S3svg class="image" alt="Amazon S3 logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
| Google Cloud Storage | <Gcssvg class="image" alt="Google Cloud Storage logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
| DigitalOcean Spaces | <DOsvg class="image" alt="Digital Ocean logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
| Azure Blob Storage | <ABSsvg class="image" alt="Azure Blob Storage logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Private Beta | Configure ClickPipes to ingest large volumes of data from object storage. |
| Amazon Kinesis | <Amazonkinesis class="image" alt="Amazon Kinesis logo" style={{width: '3rem', height: 'auto'}}/> | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Amazon Kinesis into ClickHouse Cloud. |
| Postgres | <Postgressvg class="image" alt="Postgres logo" style={{width: '3rem', height: 'auto'}}/> | DBMS | Public Beta | Configure ClickPipes and start ingesting data from Postgres into ClickHouse Cloud. |
| MySQL | <Mysqlsvg class="image" alt="MySQL logo" style={{width: '3rem', height: 'auto'}}/> | DBMS | Private Beta | Configure ClickPipes and start ingesting data from MySQL into ClickHouse Cloud. |

docs/integrations/data-ingestion/clickpipes/kafka.md

Lines changed: 24 additions & 1 deletion

The "Supported Data Types" section is split into subsections, adding experimental Variant and JSON support:

### Supported Data Types {#supported-data-types}

#### Standard types support {#standard-types-support}
The following standard ClickHouse data types are currently supported in ClickPipes:

- Base numeric types - \[U\]Int8/16/32/64 and Float32/64
- Large integer types - \[U\]Int128/256
- ...
- Map with keys and values using any of the above types (including Nullables)
- Tuple and Array with elements using any of the above types (including Nullables, one level depth only)

#### Variant type support (experimental) {#variant-type-support}
Variant type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will have to submit a support ticket to enable it on your service.

ClickPipes supports the Variant type in the following circumstances:
- Avro Unions. If your Avro schema contains a union with multiple non-null types, ClickPipes will infer the appropriate variant type. Variant types are not otherwise supported for Avro data.
- JSON fields. You can manually specify a Variant type (such as `Variant(String, Int64, DateTime)`) for any JSON field in the source data stream. Because of the way ClickPipes determines the correct variant subtype to use, only one integer or datetime type can be used in the Variant definition - for example, `Variant(Int64, UInt32)` is not supported.
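Why only one integer type may appear in a Variant can be sketched with a toy resolver. This is a hypothetical illustration, not ClickPipes' real (internal) inference logic:

```python
def pick_variant_subtype(value, subtypes):
    """Map a decoded JSON value onto one subtype of a Variant column.

    Hypothetical sketch: a bare JSON number such as 42 cannot
    disambiguate Int64 from UInt32, so a definition containing both
    integer types is rejected.
    """
    integer_types = {"Int64", "UInt32"}  # stand-ins for all integer types
    if isinstance(value, bool):
        raise ValueError("bool not handled in this sketch")
    if isinstance(value, int):
        candidates = [t for t in subtypes if t in integer_types]
    elif isinstance(value, str):
        candidates = [t for t in subtypes if t == "String"]
    else:
        candidates = []
    if len(candidates) != 1:
        raise ValueError(f"ambiguous or unsupported subtypes: {candidates}")
    return candidates[0]

print(pick_variant_subtype("hello", ["String", "Int64"]))  # String
print(pick_variant_subtype(42, ["String", "Int64"]))       # Int64

try:
    # Two integer types -> ambiguous, mirroring the documented restriction.
    pick_variant_subtype(42, ["Int64", "UInt32"])
except ValueError as exc:
    print("rejected:", exc)
```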
#### JSON type support (experimental) {#json-type-support}
JSON type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will have to submit a support ticket to enable it on your service.

ClickPipes supports the JSON type in the following circumstances:
- Avro Record types can always be assigned to a JSON column.
- Avro String and Bytes types can be assigned to a JSON column if the column actually holds JSON String objects.
- JSON fields that are always a JSON object can be assigned to a JSON destination column.

Note that you will have to manually change the destination column to the desired JSON type, including any fixed or skipped paths.

### Avro {#avro}
#### Supported Avro Data Types {#supported-avro-data-types}

docs/integrations/data-ingestion/clickpipes/kinesis.md

Lines changed: 16 additions & 0 deletions

The Kinesis data-types documentation gains the same Variant and JSON subsections:

## Supported Data Types {#supported-data-types}

### Standard types support {#standard-types-support}
The following ClickHouse data types are currently supported in ClickPipes:

- Base numeric types - \[U\]Int8/16/32/64 and Float32/64
- ...
- all ClickHouse LowCardinality types
- Map with keys and values using any of the above types (including Nullables)
- Tuple and Array with elements using any of the above types (including Nullables, one level depth only)

### Variant type support (experimental) {#variant-type-support}
Variant type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will have to submit a support ticket to enable it on your service.

You can manually specify a Variant type (such as `Variant(String, Int64, DateTime)`) for any JSON field in the source data stream. Because of the way ClickPipes determines the correct variant subtype to use, only one integer or datetime type can be used in the Variant definition - for example, `Variant(Int64, UInt32)` is not supported.

### JSON type support (experimental) {#json-type-support}
JSON type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will have to submit a support ticket to enable it on your service.

JSON fields that are always a JSON object can be assigned to a JSON destination column. You will have to manually change the destination column to the desired JSON type, including any fixed or skipped paths.
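The "always a JSON object" rule above can be checked against sample records before configuring the pipe. A sketch with hypothetical helper names (not ClickPipes code):

```python
import json

def field_is_always_object(raw_records, field):
    """True if `field` holds a JSON object in every sampled record.

    Illustrative only: mirrors the documented rule that a source field
    can map to a JSON destination column only when it is always an object.
    """
    return all(isinstance(json.loads(r).get(field), dict) for r in raw_records)

records = [
    '{"meta": {"a": 1}}',
    '{"meta": {"b": 2}}',
]
print(field_is_always_object(records, "meta"))  # True

records.append('{"meta": "not an object"}')
print(field_is_always_object(records, "meta"))  # False
```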
## Kinesis Virtual Columns {#kinesis-virtual-columns}
