Commit e3c8e6f

Merge branch 'main' of https://github.com/ClickHouse/clickhouse-docs into billing_non_payment_remediation
2 parents aaf68f5 + 2ca1b50 commit e3c8e6f

381 files changed: 6512 additions and 1890 deletions


.gitignore

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -62,6 +62,10 @@ docs/getting-started/index.md
6262
docs/data-modeling/projections/index.md
6363
docs/cloud/manage/jan2025_faq/index.md
6464
docs/chdb/guides/index.md
65+
docs/use-cases/AI_ML/index.md
66+
docs/use-cases/AI_ML/MCP/index.md
67+
docs/use-cases/AI_ML/MCP/ai_agent_libraries/index.md
68+
docs/integrations/data-ingestion/clickpipes/kafka/index.md
6569

6670
.vscode
6771
.aspell.en.prepl

README.md

Lines changed: 13 additions & 17 deletions
@@ -1,8 +1,8 @@
11
<div align=center>
22

3-
![Website](https://img.shields.io/website?up_message=AVAILABLE&down_message=DOWN&url=https%3A%2F%2Fclickhouse.com%2Fdocs&style=for-the-badge)
3+
[![Website](https://img.shields.io/website?up_message=AVAILABLE&down_message=DOWN&url=https%3A%2F%2Fclickhouse.com%2Fdocs&style=for-the-badge)](https://clickhouse.com)
44
[![CC BY-NC-SA 4.0 License](https://img.shields.io/badge/license-CC-blueviolet?style=for-the-badge)](http://creativecommons.org/licenses/by-nc-sa/4.0/)
5-
![Checks](https://img.shields.io/github/actions/workflow/status/clickhouse/clickhouse-docs/debug.yml?style=for-the-badge&label=Checks)
5+
[![Checks](https://img.shields.io/github/actions/workflow/status/clickhouse/clickhouse-docs/debug.yml?style=for-the-badge&label=Checks)](https://github.com/ClickHouse/clickhouse-docs/actions)
66

77
<picture align=center>
88
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/ClickHouse/clickhouse-docs/assets/9611008/4ef9c104-2d3f-4646-b186-507358d2fe28">
@@ -18,19 +18,19 @@
1818

1919
ClickHouse is blazing fast, but understanding ClickHouse and using it effectively is a journey. The documentation is your source for gaining the knowledge you need to be successful with your ClickHouse projects and applications. [Head over to clickhouse.com/docs to learn more →](https://clickhouse.com/)
2020

21-
## Table of contents {#table-of-contents}
21+
## Table of contents
2222

2323
- [About this repo](#about-this-repo)
2424
- [Run locally](#run-locally)
2525
- [Contributing](#contributing)
2626
- [Issues](#issues)
2727
- [License](#license)
2828

29-
## About this repo {#about-this-repo}
29+
## About this repo
3030

3131
This repository manages the documentation for [ClickHouse](https://clickhouse.com/docs). The content is built with [Docusaurus](https://docusaurus.io/) and hosted on [Vercel](https://vercel.com). Documentation content is written in Markdown and is held in the `/docs` directory.
3232

33-
## Run locally {#run-locally}
33+
## Run locally
3434

3535
You can run a copy of this website locally within a few steps. Some folks find this useful when contributing so they can see precisely what their changes will look like on the production site.
3636

@@ -116,23 +116,19 @@ To check spelling and markdown is correct locally run:
116116
yarn check-style
117117
```
118118
119-
### Notes {#notes}
119+
### Notes
120120
121121
Here are some things to keep in mind when building a local copy of the ClickHouse docs site.
122122
123-
#### Build-time {#build-time}
124-
125-
Due to the complex structure of this repo, the docs site can take some time to build locally. As a benchmark, it takes ~3 minutes to build on an M1 Macbook with 8GB RAM.
126-
127-
#### Redirects {#redirects}
123+
#### Redirects
128124
129125
Due to how the local server is built, redirects will not work. For example, visiting `clickhouse.com/docs` on the production site will lead you to `clickhouse.com/docs/intro`. However, on a local copy of the site, you will see a 404 page if you try to visit `localhost:8000/docs`.
130126
131-
## Contributing {#contributing}
127+
## Contributing
132128
133129
Want to help out? Contributions are always welcome! If you want to help out but aren't sure where to start, check out the [issues board](https://github.com/clickhouse/clickhouse-docs/issues).
134130

135-
### Pull requests {#pull-requests}
131+
### Pull requests
136132

137133
Please assign any pull request (PR) against an issue; this helps the docs team track who is working on what and what each PR is meant to address. If there isn't an issue for the specific _thing_ you want to work on, quickly create one and comment so that it can be assigned to you. One of the repository maintainers will add you as an assignee.
138134
@@ -153,7 +149,7 @@ yarn check-style
153149
For an overview of how reference documentation such as settings, system tables
154150
and functions are generated from the source code, see ["Generating documentation from source code"](/contribute/autogenerated-documentation-from-source.md)
155151
156-
### Tests and CI/CD {#tests-and-cicd}
152+
### Tests and CI/CD
157153
158154
There are five workflows that run against PRs in this repo:
159155
@@ -165,7 +161,7 @@ There are five workflows that run against PRs in this repo:
165161
| [Scheduled Vercel build](https://github.com/ClickHouse/clickhouse-docs/blob/main/.github/workflows/scheduled-vercel-build.yml) | Builds the site every day at 00:10 UTC and hosts the build on Vercel. |
166162
| [Trigger build](https://github.com/ClickHouse/clickhouse-docs/blob/main/.github/workflows/trigger-build.yml) | Uses the [peter-evans/repository-dispatch@v2](https://github.com/peter-evans/repository-dispatch) workflow to create a repository dispatch. |
167163
168-
### Quick contributions {#quick-contributions}
164+
### Quick contributions
169165
170166
Have you noticed a typo or found some wonky formatting? For small contributions like these, it's usually faster and easier to make your changes directly in GitHub. Here's a quick guide to show you how the GitHub editor works:
171167
@@ -200,10 +196,10 @@ Have you noticed a typo or found some wonky formatting? For small contributions
200196
201197
At this point, your pull request will be handed over to the docs team, who will review it and suggest or make changes where necessary.
202198
203-
## Issues {#issues}
199+
## Issues
204200
205201
Found a problem with the ClickHouse docs site? [Please raise an issue](https://github.com/clickhouse/clickhouse-docs/issues/new). Be as specific and descriptive as possible; screenshots help!
206202
207-
## License {#license}
203+
## License
208204
209205
This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).

contribute/style-guide.md

Lines changed: 20 additions & 1 deletion
@@ -340,7 +340,7 @@ When using URL parameters to control which version of documentation is displayed
340340
there are conventions to follow for reliable functionality.
341341
Here's how the `?v=v08` parameter relates to the snippet selection:
342342

343-
#### How It Works
343+
#### How it works
344344

345345
The URL parameter acts as a selector that matches against the `version` property
346346
in your component configuration. For example:
@@ -393,3 +393,22 @@ show_related_blogs: true
393393

394394
This will show it on the page, assuming there is a matching blog. If there is no
395395
match then it remains hidden.
396+
397+
## Vale
398+
399+
Vale is a command-line tool that brings code-like linting to prose.
400+
We have a number of rules set up to ensure that our documentation is
401+
consistent in style.
402+
403+
The style rules are located at `/styles/ClickHouse` and are largely based
404+
on the Google styleset, with some ClickHouse-specific adaptations.
405+
If you want to check only a specific rule locally, you
406+
can run:
407+
408+
```bash
409+
vale --filter='.Name == "ClickHouse.Headings"' docs/integrations
410+
```
411+
412+
This will run only the rule named `Headings` on
413+
the `docs/integrations` directory. You can also pass the path of a single
414+
Markdown file instead.

docs/_snippets/_GCS_authentication_and_bucket.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ import Image from '@theme/IdealImage';
1919

2020
<Image size="md" img={GCS_bucket_2} alt="Creating a GCS bucket in US East 4" border />
2121

22-
### Generate an Access key {#generate-an-access-key}
22+
### Generate an access key {#generate-an-access-key}
2323

2424
### Create a service account HMAC key and secret {#create-a-service-account-hmac-key-and-secret}
2525

docs/_snippets/_add_superset_detail.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ There are a few tasks to be done before running `docker compose`:
1313
The commands below are to be run from the top level of the GitHub repo, `superset`.
1414
:::
1515

16-
## Official ClickHouse Connect driver {#official-clickhouse-connect-driver}
16+
## Official ClickHouse connect driver {#official-clickhouse-connect-driver}
1717

1818
To make the ClickHouse Connect driver available in the Superset deployment add it to the local requirements file:
1919

docs/_snippets/_users-and-roles-common.md

Lines changed: 1 addition & 1 deletion
@@ -269,7 +269,7 @@ Roles are used to define groups of users for certain privileges instead of manag
269269
Verify that only the above two rows are returned, rows with the value `B` in `column1` should be excluded.
270270
:::
271271

272-
## Modifying Users and Roles {#modifying-users-and-roles}
272+
## Modifying users and roles {#modifying-users-and-roles}
273273

274274
Users can be assigned multiple roles for a combination of privileges needed. When using multiple roles, the system will combine the roles to determine privileges, the net effect will be that the role permissions will be cumulative.
275275

docs/about-us/beta-and-experimental-features.md

Lines changed: 3 additions & 3 deletions
@@ -14,19 +14,19 @@ Due to the uncertainty of when features are classified as generally available, w
1414

1515
The sections below explicitly describe the properties of **Beta** and **Experimental** features:
1616

17-
## Beta Features {#beta-features}
17+
## Beta features {#beta-features}
1818

1919
- Under active development to make them generally available (GA)
2020
- Main known issues can be tracked on GitHub
2121
- Functionality may change in the future
2222
- Possibly enabled in ClickHouse Cloud
2323
- The ClickHouse team supports beta features
2424

25-
The following features are considered Beta in ClickHouse Cloud and are available for use in ClickHouse Cloud Services, even though they may be currently under a ClickHouse SETTING named ```allow_experimental_*```:
25+
The features listed below are considered Beta in ClickHouse Cloud and are available for use in your ClickHouse Cloud Services.
2626

2727
Note: to use a recently introduced feature, make sure the ClickHouse [compatibility](/operations/settings/settings#compatibility) setting is set to a current version.
2828

29-
## Experimental Features {#experimental-features}
29+
## Experimental features {#experimental-features}
3030

3131
- May never become GA
3232
- May be removed

docs/about-us/distinctive-features.md

Lines changed: 16 additions & 16 deletions
@@ -7,81 +7,81 @@ title: 'Distinctive Features of ClickHouse'
77
keywords: ['compression', 'secondary-indexes','column-oriented']
88
---
99

10-
# Distinctive Features of ClickHouse
10+
# Distinctive features of ClickHouse
1111

12-
## True Column-Oriented Database Management System {#true-column-oriented-database-management-system}
12+
## True column-oriented database management system {#true-column-oriented-database-management-system}
1313

1414
In a real column-oriented DBMS, no extra data is stored with the values. This means that constant-length values must be supported to avoid storing their length "number" next to the values. For example, a billion UInt8-type values should consume around 1 GB uncompressed, or this strongly affects the CPU use. It is essential to store data compactly (without any "garbage") even when uncompressed since the speed of decompression (CPU usage) depends mainly on the volume of uncompressed data.
1515

1616
This is in contrast to systems that can store values of different columns separately, but that cannot effectively process analytical queries due to their optimization for other scenarios, such as HBase, Bigtable, Cassandra, and Hypertable. You would get throughput around a hundred thousand rows per second in these systems, but not hundreds of millions of rows per second.
1717

1818
Finally, ClickHouse is a database management system, not a single database. It allows creating tables and databases in runtime, loading data, and running queries without reconfiguring and restarting the server.
1919

20-
## Data Compression {#data-compression}
20+
## Data compression {#data-compression}
2121

2222
Some column-oriented DBMSs do not use data compression. However, data compression plays a key role in achieving excellent performance.
2323

2424
In addition to efficient general-purpose compression codecs with different trade-offs between disk space and CPU consumption, ClickHouse provides [specialized codecs](/sql-reference/statements/create/table.md#specialized-codecs) for specific kinds of data, which allow ClickHouse to compete with and outperform more niche databases, like time-series ones.
2525

26-
## Disk Storage of Data {#disk-storage-of-data}
26+
## Disk storage of data {#disk-storage-of-data}
2727

2828
Keeping data physically sorted by primary key makes it possible to extract data based on specific values or value ranges with low latency in less than a few dozen milliseconds. Some column-oriented DBMSs, such as SAP HANA and Google PowerDrill, can only work in RAM. This approach requires allocation of a larger hardware budget than necessary for real-time analysis.
2929

3030
ClickHouse is designed to work on regular hard drives, which means the cost per GB of data storage is low, but SSD and additional RAM are also fully used if available.
3131

32-
## Parallel Processing on Multiple Cores {#parallel-processing-on-multiple-cores}
32+
## Parallel processing on multiple cores {#parallel-processing-on-multiple-cores}
3333

3434
Large queries are parallelized naturally, taking all the necessary resources available on the current server.
3535

36-
## Distributed Processing on Multiple Servers {#distributed-processing-on-multiple-servers}
36+
## Distributed processing on multiple servers {#distributed-processing-on-multiple-servers}
3737

3838
Almost none of the columnar DBMSs mentioned above have support for distributed query processing.
3939

4040
In ClickHouse, data can reside on different shards. Each shard can be a group of replicas used for fault tolerance. All shards are used to run a query in parallel, transparently for the user.
4141

42-
## SQL Support {#sql-support}
42+
## SQL support {#sql-support}
4343

4444
ClickHouse supports [SQL language](/sql-reference/) that is mostly compatible with the ANSI SQL standard.
4545

4646
Supported queries include [GROUP BY](../sql-reference/statements/select/group-by.md), [ORDER BY](../sql-reference/statements/select/order-by.md), subqueries in [FROM](../sql-reference/statements/select/from.md), [JOIN](../sql-reference/statements/select/join.md) clause, [IN](../sql-reference/operators/in.md) operator, [window functions](../sql-reference/window-functions/index.md) and scalar subqueries.
4747

4848
Correlated (dependent) subqueries are not supported at the time of writing but might become available in the future.
4949

50-
## Vector Computation Engine {#vector-engine}
50+
## Vector computation engine {#vector-engine}
5151

5252
Data is not only stored by columns but is processed by vectors (parts of columns), which allows achieving high CPU efficiency.
5353

54-
## Real-Time Data Inserts {#real-time-data-updates}
54+
## Real-time data inserts {#real-time-data-updates}
5555

5656
ClickHouse supports tables with a primary key. To quickly perform queries on the range of the primary key, the data is sorted incrementally using the merge tree. Due to this, data can continually be added to the table. No locks are taken when new data is ingested.
5757

58-
## Primary Indexes {#primary-index}
58+
## Primary indexes {#primary-index}
5959

6060
Having data physically sorted by primary key makes it possible to extract data based on specific values or value ranges with low latency in less than a few dozen milliseconds.
6161

62-
## Secondary Indexes {#secondary-indexes}
62+
## Secondary indexes {#secondary-indexes}
6363

6464
Unlike other database management systems, secondary indexes in ClickHouse do not point to specific rows or row ranges. Instead, they allow the database to know in advance that all rows in some data parts would not match the query filtering conditions and do not read them at all, thus they are called [data skipping indexes](../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-data_skipping-indexes).
6565

66-
## Suitable for Online Queries {#suitable-for-online-queries}
66+
## Suitable for online queries {#suitable-for-online-queries}
6767

6868
Most OLAP database management systems do not aim for online queries with sub-second latencies. In alternative systems, report building time of tens of seconds or even minutes is often considered acceptable. Sometimes it takes even more time, which forces systems to prepare reports offline (in advance or by responding with "come back later").
6969

7070
In ClickHouse "low latency" means that queries can be processed without delay and without trying to prepare an answer in advance, right at the same moment as the user interface page is loading. In other words, online.
7171

72-
## Support for Approximated Calculations {#support-for-approximated-calculations}
72+
## Support for approximated calculations {#support-for-approximated-calculations}
7373

7474
ClickHouse provides various ways to trade accuracy for performance:
7575

7676
1. Aggregate functions for approximated calculation of the number of distinct values, medians, and quantiles.
7777
2. Running a query based on a part ([SAMPLE](../sql-reference/statements/select/sample.md)) of data and getting an approximated result. In this case, proportionally less data is retrieved from the disk.
7878
3. Running an aggregation for a limited number of random keys, instead of for all keys. Under certain conditions for key distribution in the data, this provides a reasonably accurate result while using fewer resources.
7979

80-
## Adaptive Join Algorithm {#adaptive-join-algorithm}
80+
## Adaptive join algorithm {#adaptive-join-algorithm}
8181

8282
ClickHouse adaptively chooses how to [JOIN](../sql-reference/statements/select/join.md) multiple tables, preferring the hash-join algorithm and falling back to the merge-join algorithm if there's more than one large table.
8383

84-
## Data Replication and Data Integrity Support {#data-replication-and-data-integrity-support}
84+
## Data replication and data integrity support {#data-replication-and-data-integrity-support}
8585

8686
ClickHouse uses asynchronous multi-master replication. After being written to any available replica, all the remaining replicas retrieve their copy in the background. The system maintains identical data on different replicas. Recovery after most failures is performed automatically, or semi-automatically in complex cases.
8787

@@ -91,7 +91,7 @@ For more information, see the section [Data replication](../engines/table-engine
9191

9292
ClickHouse implements user account management using SQL queries and allows for [role-based access control configuration](/guides/sre/user-management/index.md) similar to what can be found in ANSI SQL standard and popular relational database management systems.
9393

94-
## Features that Can Be Considered Disadvantages {#clickhouse-features-that-can-be-considered-disadvantages}
94+
## Features that can be considered disadvantages {#clickhouse-features-that-can-be-considered-disadvantages}
9595

9696
1. No full-fledged transactions.
9797
2. Lack of ability to modify or delete already inserted data with a high rate and low latency. There are batch deletes and updates available to clean up or modify data, for example, to comply with [GDPR](https://gdpr-info.eu).
