
Commit 832c561

Merge branch 'main' into ch-audit-splunk-integration-guide
2 parents ebefb71 + 8b33368


152 files changed: +6559 −5904 lines changed


clickhouseapi.js

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ function generateDocusaurusMarkdown(spec, groupedEndpoints, prefix) {
 
 markdownContent += `| Method | Path |\n`
 markdownContent += `| :----- | :--- |\n`
-markdownContent += `| ${method.toUpperCase()} | ${path} |\n\n`
+markdownContent += `| ${method.toUpperCase()} | \`${path}\` |\n\n`
 
 markdownContent += `### Request\n\n`;
 

copyClickhouseRepoDocs.sh

Lines changed: 2 additions & 1 deletion
@@ -1,7 +1,8 @@
-#! ./bin/bash
+#! /bin/bash
 
 SCRIPT_NAME=$(basename "$0")
 
+rm -rf ClickHouse
 echo "[$SCRIPT_NAME] Start tasks for copying docs from ClickHouse repo"
 
 # Clone ClickHouse repo

docs/en/cloud/bestpractices/avoidoptimizefinal.md

Lines changed: 24 additions & 2 deletions
@@ -2,10 +2,32 @@
 slug: /en/cloud/bestpractices/avoid-optimize-final
 sidebar_label: Avoid Optimize Final
 title: Avoid Optimize Final
+keywords: ['OPTIMIZE TABLE', 'FINAL', 'unscheduled merge']
 ---
 
-Using the [`OPTIMIZE TABLE ... FINAL`](/docs/en/sql-reference/statements/optimize/) query will initiate an unscheduled merge of data parts for the specific table into one data part. During this process, ClickHouse reads all the data parts, uncompresses, merges, compresses them into a single part, and then rewrites back into object store, causing huge CPU and IO consumption.
+Using the [`OPTIMIZE TABLE ... FINAL`](/docs/en/sql-reference/statements/optimize/) query initiates an unscheduled merge of data parts for a specific table into one single data part.
+During this process, ClickHouse performs the following steps:
+
+- Data parts are read.
+- The parts get uncompressed.
+- The parts get merged.
+- They are compressed into a single part.
+- The part is then written back into the object store.
+
+The operations described above are resource intensive, consuming significant CPU and disk I/O.
+It is important to note that using this optimization will force a rewrite of a part,
+even if merging to a single part has already occurred.
+
+Additionally, use of the `OPTIMIZE TABLE ... FINAL` query may disregard
+setting [`max_bytes_to_merge_at_max_space_in_pool`](https://clickhouse.com/docs/en/operations/settings/merge-tree-settings#max-bytes-to-merge-at-max-space-in-pool) which controls the maximum size of parts
+that ClickHouse will typically merge by itself in the background.
+
+The [`max_bytes_to_merge_at_max_space_in_pool`](https://clickhouse.com/docs/en/operations/settings/merge-tree-settings#max-bytes-to-merge-at-max-space-in-pool) setting is by default set to 150 GB.
+When running `OPTIMIZE TABLE ... FINAL`,
+the steps outlined above will be performed resulting in a single part after merge.
+This remaining single part could exceed the 150 GB specified by the default of this setting.
+This is another important consideration and reason why you should avoid use of this statement,
+since merging a large number of 150 GB parts into a single part could require a significant amount of time and/or memory.
 
-Note that this optimization rewrites the one part even if they are already merged into a single part. Also, it is important to note the scope of a "single part" - this indicates that the value of the setting [`max_bytes_to_merge_at_max_space_in_pool`](https://clickhouse.com/docs/en/operations/settings/merge-tree-settings#max-bytes-to-merge-at-max-space-in-pool) will be ignored. For example, [`max_bytes_to_merge_at_max_space_in_pool`](https://clickhouse.com/docs/en/operations/settings/merge-tree-settings#max-bytes-to-merge-at-max-space-in-pool) is by default set to 150 GB. When running OPTIMIZE TABLE ... FINAL, the remaining single part could exceed even this size. This is another important consideration and reason not to generally use this command, since merging a large number of 150 GB parts into a single part could require a significant amount of time and/or memory.
 
 
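For context on what the rewritten page describes, a minimal sketch of the statement in question and how to observe the resulting single part; the table name `example_table` and the `system.parts` query are illustrative additions, not part of the commit:

```sql
-- Illustrative only: forces an unscheduled merge of all active parts of the
-- table into a single part, with the CPU and I/O cost described above.
OPTIMIZE TABLE example_table FINAL;

-- Inspect the remaining active parts afterwards; the merged part may exceed
-- the 150 GB default of max_bytes_to_merge_at_max_space_in_pool.
SELECT name, rows, formatReadableSize(bytes_on_disk) AS size_on_disk
FROM system.parts
WHERE table = 'example_table' AND active;
```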

docs/en/cloud/reference/changelog.md

Lines changed: 1 addition & 1 deletion
@@ -957,7 +957,7 @@ Adds support for a subset of features in ClickHouse 23.1, for example:
 - New functions, including `age()`, `quantileInterpolatedWeighted()`, `quantilesInterpolatedWeighted()`
 - Ability to use structure from insertion table in `generateRandom` without arguments
 - Improved database creation and rename logic that allows the reuse of previous names
-- See the 23.1 release [webinar slides](https://presentations.clickhouse.com/release_23.1/#cover) and [23.1 release changelog](/docs/en/whats-new/changelog/index.md/#clickhouse-release-231) for more details
+- See the 23.1 release [webinar slides](https://presentations.clickhouse.com/release_23.1/#cover) and [23.1 release changelog](/docs/en/whats-new/changelog/index.md#clickhouse-release-231) for more details
 
 ### Integrations changes
 - [Kafka-Connect](/docs/en/integrations/data-ingestion/kafka/index.md): Added support for Amazon MSK

docs/en/cloud/reference/warehouses.md

Lines changed: 3 additions & 0 deletions
@@ -157,6 +157,9 @@ settings distributed_ddl_task_timeout=0
 6. **The original service should be new enough, or migrated**
 Unfortunately, not all existing services can share their storage with other services. During the last year, we released a few features that the service needs to support (like the Shared Merge Tree engine), so old services will mostly not be able to share their data with other services. This does not depend on ClickHouse version. The good news is that we can migrate the old service to the new engine, so it can support creating additional services. If you have a service for which you cannot enable compute-compute separation, please contact support to assist with the migration.
 
+7. **Single-node secondary services can be unavailable for up to 1 hour during upgrades**
+When creating a database service, you can select the number of replicas. When creating a secondary service, you can select to create a single-node service, which means that there will be no high availability for this particular service. Currently, when performing an upgrade of such a service, a usual rolling upgrade can not be performed, which means that the single-node service will be unavailable during the upgrade. Though usually an upgrade takes only a few minutes, in some cases, if there are long-running queries, it can take up to one hour. The single-node service will be unavailable during this time. Consider creating at least two nodes service if this is not acceptable - in this case, there will be no downtime at all. We are working on removing this limitation.
+
 ## Pricing
 
 Extra services created during the private preview are billed as usual. Compute prices are the same for all services in a warehouse (primary and secondary). Storage is billed only once - it is included in the first (original) service.

docs/en/data-compression/compression-modes.md

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ From [facebook benchmarks](https://facebook.github.io/zstd/#benchmarks):
 | mode | byte | Compression mode |
 | compressed_data | binary | Block of compressed data |
 
-![compression block diagram](../native-protocol/images/ch_compression_block.drawio.svg)
+![compression block diagram](./images/ch_compression_block.png)
 
 Header is (raw_size + data_size + mode), raw size consists of len(header + compressed_data).
 

docs/en/data-compression/images/ch_compression_block.png: new image file (10.6 KB)

docs/en/data-modeling/backfilling.md

Lines changed: 1 addition & 1 deletion
@@ -448,7 +448,7 @@ GROUP BY
 
 Here, we create a Null table, `pypi_v2,` to receive the rows that will be used to build our materialized view. Note how we limit the schema to only the columns we need. Our materialized view performs an aggregation over rows inserted into this table (one block at a time), sending the results to our target table, `pypi_downloads_per_day`.
 
-::note
+:::note
 We have used `pypi_downloads_per_day` as our target table here. For additional resiliency, users could create a duplicate table, `pypi_downloads_per_day_v2`, and use this as the target table of the view, as shown in previous examples. On completion of the insert, partitions in `pypi_downloads_per_day_v2` could, in turn, be moved to `pypi_downloads_per_day.` This would allow recovery in the case our insert fails due to memory issues or server interruptions i.e. we just truncate `pypi_downloads_per_day_v2`, tune settings, and retry.
 :::
 
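As a rough illustration of the Null-table pattern this hunk describes, here is a hedged sketch; the column list and the view name `pypi_downloads_per_day_mv` are placeholders, since the guide defines the real schema elsewhere:

```sql
-- Sketch only: the Null engine discards the rows it receives, so the table
-- acts purely as an insert target for the materialized view to observe.
CREATE TABLE pypi_v2
(
    `timestamp` DateTime,
    `project`   String
)
ENGINE = Null;

-- The view aggregates each inserted block and writes the result to the
-- target table pypi_downloads_per_day (or a duplicate such as
-- pypi_downloads_per_day_v2, as the note suggests).
CREATE MATERIALIZED VIEW pypi_downloads_per_day_mv TO pypi_downloads_per_day AS
SELECT
    toDate(timestamp) AS day,
    project,
    count() AS count
FROM pypi_v2
GROUP BY day, project;
```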

docs/en/guides/developer/understanding-query-execution-with-the-analyzer.md

Lines changed: 2 additions & 4 deletions
@@ -63,12 +63,10 @@ Each node has corresponding children and the overall tree represents the overall
 
 ## Analyzer
 
-<BetaBadge />
-
-ClickHouse currently has two architectures for the Analyzer. You can use the old architecture by setting: `allow_experimental_analyzer=0`. If you want to use the new architecture, you should set `allow_experimental_analyzer=1`. We are going to describe only the new architecture here, given the old one is going to be deprecated once the new analyzer is generally available.
+ClickHouse currently has two architectures for the Analyzer. You can use the old architecture by setting: `enable_analyzer=0`. The new architecture is enabled by default. We are going to describe only the new architecture here, given the old one is going to be deprecated once the new analyzer is generally available.
 
 :::note
-The new analyzer is in Beta. The new architecture should provide us with a better framework to improve ClickHouse's performance. However, given it is a fundamental component of the query processing steps, it also might have a negative impact on some queries. After moving to the new analyzer, you may see performance degradation, queries failing, or queries giving you an unexpected result. You can revert back to the old analyzer by changing the `allow_experimental_analyzer` setting at the query or user level. Please report any issues in GitHub.
+The new architecture should provide us with a better framework to improve ClickHouse's performance. However, given it is a fundamental component of the query processing steps, it also might have a negative impact on some queries and there are [known incompatibilities](/docs/en/operations/analyzer#known-incompatibilities). You can revert back to the old analyzer by changing the `enable_analyzer` setting at the query or user level.
 :::
 
 The analyzer is an important step of the query execution. It takes an AST and transforms it into a query tree. The main benefit of a query tree over an AST is that a lot of the components will be resolved, like the storage for instance. We also know from which table to read, aliases are also resolved, and the tree knows the different data types used. With all these benefits, the analyzer can apply optimizations. The way these optimizations work is via “passes”. Every pass is going to look for different optimizations. You can see all the passes [here](https://github.com/ClickHouse/ClickHouse/blob/76578ebf92af3be917cd2e0e17fea2965716d958/src/Analyzer/QueryTreePassManager.cpp#L249), let’s see it in practice with our previous query:
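To make the setting rename above concrete, a small hedged example of toggling the analyzer per query; the queries themselves are illustrative and not taken from the guide:

```sql
-- Run a single query with the old analyzer (the new one is on by default).
SELECT count()
FROM numbers(10)
SETTINGS enable_analyzer = 0;

-- With the new analyzer enabled, the resolved query tree can be inspected.
EXPLAIN QUERY TREE
SELECT count()
FROM numbers(10)
WHERE number < 5;
```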

docs/en/guides/inserting-data.md

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ With asynchronous inserts, data is inserted into a buffer first and then written
 <br />
 
 <img src={require('./images/postgres-inserts.png').default}
-  class="image"
+  className="image"
   alt="NEEDS ALT"
   style={{width: '600px'}}
 />
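The hunk context mentions asynchronous inserts; as a hedged aside, this is roughly what the buffered insert path looks like. The table `example_table` and the values are placeholders and not part of this change:

```sql
-- Illustrative: async_insert = 1 buffers the rows server-side and flushes them
-- in batches; wait_for_async_insert = 1 makes the client wait for the flush.
INSERT INTO example_table
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (1, 'a'), (2, 'b');
```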
