
Commit 552502b

merge antalya-25.8

2 parents 8f21b8c + c0c3fe9

File tree: 389 files changed, +17325 −1851 lines changed


.github/workflows/master.yml

Lines changed: 2 additions & 2 deletions
@@ -4178,7 +4178,7 @@ jobs:
     secrets: inherit
     with:
       runner_type: altinity-on-demand, altinity-regression-tester
-      commit: c07440a1ad14ffc5fc49ce90dff2f40c2e5f364d
+      commit: 00a50b5b8f12c9c603b9a3fa17dd2c5ea2012cac
       arch: release
       build_sha: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
       timeout_minutes: 300
@@ -4190,7 +4190,7 @@ jobs:
     secrets: inherit
     with:
       runner_type: altinity-on-demand, altinity-regression-tester-aarch64
-      commit: c07440a1ad14ffc5fc49ce90dff2f40c2e5f364d
+      commit: 00a50b5b8f12c9c603b9a3fa17dd2c5ea2012cac
       arch: aarch64
       build_sha: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
       timeout_minutes: 300

.github/workflows/pull_request.yml

Lines changed: 2 additions & 2 deletions
@@ -4134,7 +4134,7 @@ jobs:
     secrets: inherit
     with:
       runner_type: altinity-on-demand, altinity-regression-tester
-      commit: c07440a1ad14ffc5fc49ce90dff2f40c2e5f364d
+      commit: 00a50b5b8f12c9c603b9a3fa17dd2c5ea2012cac
       arch: release
       build_sha: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
       timeout_minutes: 300
@@ -4146,7 +4146,7 @@ jobs:
     secrets: inherit
     with:
       runner_type: altinity-on-demand, altinity-regression-tester-aarch64
-      commit: c07440a1ad14ffc5fc49ce90dff2f40c2e5f364d
+      commit: 00a50b5b8f12c9c603b9a3fa17dd2c5ea2012cac
       arch: aarch64
       build_sha: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
       timeout_minutes: 300

.gitmodules

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@
 	url = https://github.com/Thalhammer/jwt-cpp
 [submodule "contrib/zstd"]
 	path = contrib/zstd
-	url = https://github.com/facebook/zstd
+	url = https://github.com/ClickHouse/zstd.git
 [submodule "contrib/lz4"]
 	path = contrib/lz4
 	url = https://github.com/lz4/lz4
@@ -45,7 +45,7 @@
 	url = https://github.com/ClickHouse/arrow
 [submodule "contrib/thrift"]
 	path = contrib/thrift
-	url = https://github.com/apache/thrift
+	url = https://github.com/ClickHouse/thrift.git
 [submodule "contrib/libhdfs3"]
 	path = contrib/libhdfs3
 	url = https://github.com/ClickHouse/libhdfs3
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+1
+raw_blob String
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+-- Tags: no-parallel, no-fasttest, no-random-settings
+
+INSERT INTO FUNCTION s3(
+    s3_conn,
+    filename='03631',
+    format=Parquet,
+    partition_strategy='hive',
+    partition_columns_in_data_file=1) PARTITION BY (year, country) SELECT 'Brazil' as country, 2025 as year, 1 as id;
+
+-- distinct because minio isn't cleaned up
+SELECT count(distinct year) FROM s3(s3_conn, filename='03631/**.parquet', format=RawBLOB) SETTINGS use_hive_partitioning=1;
+
+DESCRIBE s3(s3_conn, filename='03631/**.parquet', format=RawBLOB) SETTINGS use_hive_partitioning=1;

ci/praktika/yaml_additional_templates.py

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ class AltinityWorkflowTemplates:
         echo "Workflow Run Report: [View Report]($REPORT_LINK)" >> $GITHUB_STEP_SUMMARY
     """
     # Additional jobs
-    REGRESSION_HASH = "c07440a1ad14ffc5fc49ce90dff2f40c2e5f364d"
+    REGRESSION_HASH = "00a50b5b8f12c9c603b9a3fa17dd2c5ea2012cac"
     ALTINITY_JOBS = {
         "GrypeScan": r"""
             GrypeScanServer:
Lines changed: 160 additions & 0 deletions
# ALTER TABLE EXPORT PART

## Overview

The `ALTER TABLE EXPORT PART` command exports individual MergeTree data parts to object storage (S3, Azure Blob Storage, etc.), typically in Parquet format.

**Key Characteristics:**
- **Experimental feature** - must be enabled via the `allow_experimental_export_merge_tree_part` setting
- **Asynchronous** - executes in the background, returns immediately
- **Ephemeral** - no automatic retry mechanism; manual retry required on failure
- **Idempotent** - safe to re-export the same part (skips by default if the file exists)
- **Preserves sort order** from the source table

## Syntax

```sql
ALTER TABLE [database.]table_name
EXPORT PART 'part_name'
TO TABLE [destination_database.]destination_table
SETTINGS allow_experimental_export_merge_tree_part = 1
[, setting_name = value, ...]
```

### Parameters

- **`table_name`**: The source MergeTree table containing the part to export
- **`part_name`**: The exact name of the data part to export (e.g., `'2020_1_1_0'`, `'all_1_1_0'`)
- **`destination_table`**: The target table for the export (typically an S3, Azure, or other object storage table)
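
Exact part names can be looked up in the standard `system.parts` table. A minimal sketch, not part of the original document, using the `mt_table` table from the example further below:

```sql
-- List the active parts of the source table so the exact part_name
-- can be passed to ALTER TABLE ... EXPORT PART.
SELECT name, partition, rows
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'mt_table'
  AND active;
```
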
## Requirements

Source and destination tables must be 100% compatible:

1. **Identical schemas** - same columns, types, and order
2. **Matching partition keys** - partition expressions must be identical

## Settings

### `allow_experimental_export_merge_tree_part` (Required)

- **Type**: `Bool`
- **Default**: `false`
- **Description**: Must be set to `true` to enable the experimental feature.

### `export_merge_tree_part_overwrite_file_if_exists` (Optional)

- **Type**: `Bool`
- **Default**: `false`
- **Description**: If set to `true`, an existing destination file is overwritten. Otherwise, the export fails with an exception.
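
As a sketch of how this setting combines with the required experimental setting (reusing the table and part names from the example below; the combination is an assumption based on the syntax above, not taken from the original document):

```sql
-- Re-run an export whose destination file already exists,
-- overwriting the file instead of failing with an exception.
ALTER TABLE mt_table EXPORT PART '2020_1_1_0' TO TABLE s3_table
SETTINGS allow_experimental_export_merge_tree_part = 1,
         export_merge_tree_part_overwrite_file_if_exists = 1;
```
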
## Examples

### Basic Export to S3

```sql
-- Create source and destination tables
CREATE TABLE mt_table (id UInt64, year UInt16)
ENGINE = MergeTree() PARTITION BY year ORDER BY tuple();

CREATE TABLE s3_table (id UInt64, year UInt16)
ENGINE = S3(s3_conn, filename='data', format=Parquet, partition_strategy='hive')
PARTITION BY year;

-- Insert and export
INSERT INTO mt_table VALUES (1, 2020), (2, 2020), (3, 2021);

ALTER TABLE mt_table EXPORT PART '2020_1_1_0' TO TABLE s3_table
SETTINGS allow_experimental_export_merge_tree_part = 1;

ALTER TABLE mt_table EXPORT PART '2021_2_2_0' TO TABLE s3_table
SETTINGS allow_experimental_export_merge_tree_part = 1;
```
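
As a hypothetical follow-up (not part of the original example), the destination table can be read back to verify that the exported rows arrived; since exports run asynchronously, the count may lag the `ALTER` statements:

```sql
-- Read the destination S3 table back to confirm the exported rows landed.
SELECT count() FROM s3_table;
```
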
## Monitoring

### Active Exports

Active exports can be found in the `system.exports` table. As of now, it only shows currently executing exports. It will not show pending or finished exports.

```sql
arthur :) select * from system.exports;

SELECT *
FROM system.exports

Query id: 2026718c-d249-4208-891b-a271f1f93407

Row 1:
──────
source_database: default
source_table: source_mt_table
destination_database: default
destination_table: destination_table
create_time: 2025-11-19 09:09:11
part_name: 20251016-365_1_1_0
destination_file_path: table_root/eventDate=2025-10-16/retention=365/20251016-365_1_1_0_17B2F6CD5D3C18E787C07AE3DAF16EB1.parquet
elapsed: 2.04845441
rows_read: 1138688 -- 1.14 million
total_rows_to_read: 550961374 -- 550.96 million
total_size_bytes_compressed: 37619147120 -- 37.62 billion
total_size_bytes_uncompressed: 138166213721 -- 138.17 billion
bytes_read_uncompressed: 316892925 -- 316.89 million
memory_usage: 596006095 -- 596.01 million
peak_memory_usage: 601239033 -- 601.24 million
```

### Export History

You can query succeeded or failed exports in `system.part_log`. For now, it only keeps track of completion events (either success or failure).

```sql
arthur :) select * from system.part_log where event_type='ExportPart' and table = 'replicated_source' order by event_time desc limit 1;

SELECT *
FROM system.part_log
WHERE (event_type = 'ExportPart') AND (`table` = 'replicated_source')
ORDER BY event_time DESC
LIMIT 1

Query id: ae1c1cd3-c20e-4f20-8b82-ed1f6af0237f

Row 1:
──────
hostname: arthur
query_id:
event_type: ExportPart
merge_reason: NotAMerge
merge_algorithm: Undecided
event_date: 2025-11-19
event_time: 2025-11-19 09:08:31
event_time_microseconds: 2025-11-19 09:08:31.974701
duration_ms: 4
database: default
table: replicated_source
table_uuid: 78471c67-24f4-4398-9df5-ad0a6c3daf41
part_name: 2021_0_0_0
partition_id: 2021
partition: 2021
part_type: Compact
disk_name: default
path_on_disk: year=2021/2021_0_0_0_78C704B133D41CB0EF64DD2A9ED3B6BA.parquet
rows: 1
size_in_bytes: 272
merged_from: ['2021_0_0_0']
bytes_uncompressed: 86
read_rows: 1
read_bytes: 6
peak_memory_usage: 22
error: 0
exception:
ProfileEvents: {}
```

### Profile Events

- `PartsExports` - Successful exports
- `PartsExportFailures` - Failed exports
- `PartsExportDuplicated` - Number of part exports that failed because the target file already exists
- `PartsExportTotalMilliseconds` - Total export time, in milliseconds
