|
| 1 | +# Antalya branch |
| 2 | + |
| 3 | +## Swarm |
| 4 | + |
| 5 | +### Differences from the upstream version |
| 6 | + |
| 7 | +#### `storage_type` argument in object storage functions |
| 8 | + |
| 9 | +Upstream ClickHouse provides a separate table function for reading Iceberg tables from each storage backend (`icebergLocal`, `icebergS3`, `icebergAzure`, `icebergHDFS`), their cluster variants, the `iceberg` function as a synonym for `icebergS3`, and the corresponding table engines (`IcebergLocal`, `IcebergS3`, `IcebergAzure`, `IcebergHDFS`). |
| 10 | + |
| 11 | +In the Antalya branch, the `iceberg` table function and the `Iceberg` table engine unify all variants into one by using a new named argument, `storage_type`, which can be one of `local`, `s3`, `azure`, or `hdfs`. |
| 12 | + |
| 13 | +Old syntax examples: |
| 14 | + |
| 15 | +```sql |
| 16 | +SELECT * FROM icebergS3('http://minio1:9000/root/table_data', 'minio', 'minio123', 'Parquet'); |
| 17 | +SELECT * FROM icebergAzureCluster('mycluster', 'http://azurite1:30000/devstoreaccount1', 'cont', '/table_data', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==', 'Parquet'); |
| 18 | +CREATE TABLE mytable ENGINE=IcebergHDFS('/table_data', 'Parquet'); |
| 19 | +``` |
| 20 | + |
| 21 | +New syntax examples: |
| 22 | + |
| 23 | +```sql |
| 24 | +SELECT * FROM iceberg(storage_type='s3', 'http://minio1:9000/root/table_data', 'minio', 'minio123', 'Parquet'); |
| 25 | +SELECT * FROM icebergCluster('mycluster', storage_type='azure', 'http://azurite1:30000/devstoreaccount1', 'cont', '/table_data', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==', 'Parquet'); |
| 26 | +CREATE TABLE mytable ENGINE=Iceberg('/table_data', 'Parquet', storage_type='hdfs'); |
| 27 | +``` |
| 28 | + |
| 29 | +If a named collection stores the access parameters, the `storage_type` field can be included in the same named collection: |
| 30 | + |
| 31 | +```xml |
| 32 | +<named_collections> |
| 33 | + <s3> |
| 34 | + <url>http://minio1:9001/root/</url> |
| 35 | + <access_key_id>minio</access_key_id> |
| 36 | + <secret_access_key>minio123</secret_access_key> |
| 37 | + <storage_type>s3</storage_type> |
| 38 | + </s3> |
| 39 | +</named_collections> |
| 40 | +``` |
| 41 | + |
| 42 | +```sql |
| 43 | +SELECT * FROM iceberg(s3, filename='table_data'); |
| 44 | +``` |
| 45 | + |
| 46 | +By default, `storage_type` is `'s3'`, which maintains backward compatibility with the upstream `iceberg` function. |
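| | + |
| | +As a minimal sketch of this default, the two queries below should be equivalent. They reuse the MinIO endpoint and credentials from the earlier examples, which are illustrative values rather than a tested setup: |
| | + |
| | +```sql |
| | +-- storage_type omitted: falls back to the default, 's3' |
| | +SELECT * FROM iceberg('http://minio1:9000/root/table_data', 'minio', 'minio123', 'Parquet'); |
| | +-- explicit form of the same query |
| | +SELECT * FROM iceberg(storage_type='s3', 'http://minio1:9000/root/table_data', 'minio', 'minio123', 'Parquet'); |
| | +``` |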
| 47 | + |
| 48 | + |
| 49 | +#### `object_storage_cluster` setting |
| 50 | + |
| 51 | +The new setting `object_storage_cluster` controls whether a single-node or cluster variant of table functions reading from object storage (e.g., `s3`, `azure`, `iceberg`, and their cluster variants like `s3Cluster`, `azureCluster`, `icebergCluster`) is used. |
| 52 | + |
| 53 | +Old syntax examples: |
| 54 | + |
| 55 | +```sql |
| 56 | +SELECT * FROM s3Cluster('myCluster', 'http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', 'minio123', 'CSV', |
| 57 | + 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))'); |
| 58 | +SELECT * FROM icebergAzureCluster('mycluster', 'http://azurite1:30000/devstoreaccount1', 'cont', '/table_data', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==', 'Parquet'); |
| 59 | +``` |
| 60 | + |
| 61 | +New syntax examples: |
| 62 | + |
| 63 | +```sql |
| 64 | +SELECT * FROM s3('http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', 'minio123', 'CSV', |
| 65 | + 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))') |
| 66 | + SETTINGS object_storage_cluster='myCluster'; |
| 67 | +SELECT * FROM icebergAzure('http://azurite1:30000/devstoreaccount1', 'cont', '/table_data', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==', 'Parquet') |
| 68 | + SETTINGS object_storage_cluster='mycluster'; |
| 69 | +``` |
| 70 | + |
| 71 | +This setting also applies to table engines and can be used with tables managed by Iceberg Catalog. |
| 72 | + |
| 73 | +Note: Upstream ClickHouse has introduced analogous settings, such as `parallel_replicas_for_cluster_engines` and `cluster_for_parallel_replicas`. Since version 25.10, these settings work with table engines. The `object_storage_cluster` setting may be deprecated in the future. |