Commit 6bc4eab

Merge pull request #3797 from ClickHouse/kp/clickpipes-azure-blob-storage
Add note of azure blob storage clickpipe private preview
2 parents 99c1859 + d68295f commit 6bc4eab

4 files changed, +26 -5 lines changed


docs/integrations/data-ingestion/clickpipes/index.md

Lines changed: 2 additions & 1 deletion

@@ -14,6 +14,7 @@ import S3svg from '@site/static/images/integrations/logos/amazon_s3_logo.svg';
 import Amazonkinesis from '@site/static/images/integrations/logos/amazon_kinesis_logo.svg';
 import Gcssvg from '@site/static/images/integrations/logos/gcs.svg';
 import DOsvg from '@site/static/images/integrations/logos/digitalocean.svg';
+import ABSsvg from '@site/static/images/integrations/logos/azureblobstorage.svg';
 import Postgressvg from '@site/static/images/integrations/logos/postgresql.svg';
 import Mysqlsvg from '@site/static/images/integrations/logos/mysql.svg';
 import redpanda_logo from '@site/static/images/integrations/logos/logo_redpanda.png';
@@ -42,7 +43,7 @@ import Image from '@theme/IdealImage';
 | Amazon S3 | <S3svg class="image" alt="Amazon S3 logo" style={{width: '3rem', height: 'auto'}}/> |Object Storage| Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
 | Google Cloud Storage | <Gcssvg class="image" alt="Google Cloud Storage logo" style={{width: '3rem', height: 'auto'}}/> |Object Storage| Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
 | DigitalOcean Spaces | <DOsvg class="image" alt="Digital Ocean logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Stable | Configure ClickPipes to ingest large volumes of data from object storage.
-
+| Azure Blob Storage | <ABSsvg class="image" alt="Azure Blob Storage logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Private Beta | Configure ClickPipes to ingest large volumes of data from object storage.
 | Amazon Kinesis | <Amazonkinesis class="image" alt="Amazon Kenesis logo" style={{width: '3rem', height: 'auto'}}/> |Streaming| Stable | Configure ClickPipes and start ingesting streaming data from Amazon Kinesis into ClickHouse cloud. |
 | Postgres | <Postgressvg class="image" alt="Postgres logo" style={{width: '3rem', height: 'auto'}}/> |DBMS| Public Beta | Configure ClickPipes and start ingesting data from Postgres into ClickHouse Cloud. |
 | MySQL | <Mysqlsvg class="image" alt="MySQL logo" style={{width: '3rem', height: 'auto'}}/> |DBMS| Private Beta | Configure ClickPipes and start ingesting data from MySQL into ClickHouse Cloud. |

docs/integrations/data-ingestion/clickpipes/object-storage.md

Lines changed: 8 additions & 4 deletions

@@ -8,6 +8,7 @@ title: 'Integrating Object Storage with ClickHouse Cloud'
 import S3svg from '@site/static/images/integrations/logos/amazon_s3_logo.svg';
 import Gcssvg from '@site/static/images/integrations/logos/gcs.svg';
 import DOsvg from '@site/static/images/integrations/logos/digitalocean.svg';
+import ABSsvg from '@site/static/images/integrations/logos/azureblobstorage.svg';
 import cp_step0 from '@site/static/images/integrations/data-ingestion/clickpipes/cp_step0.png';
 import cp_step1 from '@site/static/images/integrations/data-ingestion/clickpipes/cp_step1.png';
 import cp_step2_object_storage from '@site/static/images/integrations/data-ingestion/clickpipes/cp_step2_object_storage.png';
@@ -23,7 +24,7 @@ import cp_overview from '@site/static/images/integrations/data-ingestion/clickpi
 import Image from '@theme/IdealImage';

 # Integrating Object Storage with ClickHouse Cloud
-Object Storage ClickPipes provide a simple and resilient way to ingest data from Amazon S3, Google Cloud Storage, and DigitalOcean Spaces into ClickHouse Cloud. Both one-time and continuous ingestion are supported with exactly-once semantics.
+Object Storage ClickPipes provide a simple and resilient way to ingest data from Amazon S3, Google Cloud Storage, Azure Blob Storage, and DigitalOcean Spaces into ClickHouse Cloud. Both one-time and continuous ingestion are supported with exactly-once semantics.


 ## Prerequisite {#prerequisite}
@@ -95,6 +96,7 @@ Image
 | Amazon S3 |<S3svg class="image" alt="Amazon S3 logo" style={{width: '3rem', height: 'auto'}}/>|Object Storage| Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
 | Google Cloud Storage |<Gcssvg class="image" alt="Google Cloud Storage logo" style={{width: '3rem', height: 'auto'}}/>|Object Storage| Stable | Configure ClickPipes to ingest large volumes of data from object storage. |
 | DigitalOcean Spaces | <DOsvg class="image" alt="Digital Ocean logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Stable | Configure ClickPipes to ingest large volumes of data from object storage.
+| Azure Blob Storage | <ABSsvg class="image" alt="Azure Blob Storage logo" style={{width: '3rem', height: 'auto'}}/> | Object Storage | Private Beta | Configure ClickPipes to ingest large volumes of data from object storage.

 More connectors will get added to ClickPipes, you can find out more by [contacting us](https://clickhouse.com/company/contact?loc=clickpipes).

@@ -126,13 +128,13 @@ To increase the throughput on large ingest jobs, we recommend scaling the ClickH
 - There are limitations on the types of views that are supported. Please read the section on [exactly-once semantics](#exactly-once-semantics) and [view support](#view-support) for more information.
 - Role authentication is not available for S3 ClickPipes for ClickHouse Cloud instances deployed into GCP or Azure. It is only supported for AWS ClickHouse Cloud instances.
 - ClickPipes will only attempt to ingest objects at 10GB or smaller in size. If a file is greater than 10GB an error will be appended to the ClickPipes dedicated error table.
-- S3 / GCS ClickPipes **does not** share a listing syntax with the [S3 Table Function](/sql-reference/table-functions/s3).
+- S3 / GCS ClickPipes **does not** share a listing syntax with the [S3 Table Function](/sql-reference/table-functions/s3), nor Azure with the [AzureBlobStorage Table function](/sql-reference/table-functions/azureBlobStorage).
 - `?` — Substitutes any single character
 - `*` — Substitutes any number of any characters except / including empty string
 - `**` — Substitutes any number of any character include / including empty string

 :::note
-This is a valid path:
+This is a valid path (for S3):

 https://datasets-documentation.s3.eu-west-3.amazonaws.com/http/**.ndjson.gz

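The listing-glob semantics described in the bullets above (where `*` stops at `/` but `**` crosses it) can be sketched by translating the pattern into a regular expression. This is an illustrative Python sketch of those rules, not the actual ClickPipes implementation:

```python
import re

def clickpipes_glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a ClickPipes-style listing glob into a compiled regex.

    Per the documented rules:
      ?   matches any single character
      *   matches any run of characters except '/', including the empty string
      **  matches any run of characters including '/', including the empty string
    """
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == '*':
            if i + 1 < len(pattern) and pattern[i + 1] == '*':
                out.append('.*')        # ** crosses directory boundaries
                i += 2
                continue
            out.append('[^/]*')         # * stops at '/'
            i += 1
        elif c == '?':
            out.append('.')             # ? matches exactly one character
            i += 1
        else:
            out.append(re.escape(c))    # everything else matches literally
            i += 1
    return re.compile('^' + ''.join(out) + '$')

# The documented example path uses ** to descend into sub-prefixes:
print(bool(clickpipes_glob_to_regex('http/**.ndjson.gz')
           .match('http/2024/01/data.ndjson.gz')))  # True
print(bool(clickpipes_glob_to_regex('http/*.ndjson.gz')
           .match('http/2024/01/data.ndjson.gz')))  # False: * cannot span '/'
```

The difference between the two calls is exactly why the example in the note uses `**`: a single `*` would only match objects directly under the `http/` prefix.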
@@ -143,7 +145,7 @@ https://datasets-documentation.s3.eu-west-3.amazonaws.com/http/{documents-01,doc
 :::

 ## Continuous Ingest {#continuous-ingest}
-ClickPipes supports continuous ingestion from S3, GCS, and DigitalOcean Spaces. When enabled, ClickPipes will continuously ingest data from the specified path, it will poll for new files at a rate of once every 30 seconds. However, new files must be lexically greater than the last ingested file, meaning they must be named in a way that defines the ingestion order. For instance, files named `file1`, `file2`, `file3`, etc., will be ingested sequentially. If a new file is added with a name like `file0`, ClickPipes will not ingest it because it is not lexically greater than the last ingested file.
+ClickPipes supports continuous ingestion from S3, GCS, Azure Blob Storage, and DigitalOcean Spaces. When enabled, ClickPipes continuously ingests data from the specified path, and polls for new files at a rate of once every 30 seconds. However, new files must be lexically greater than the last ingested file. This means that they must be named in a way that defines the ingestion order. For instance, files named `file1`, `file2`, `file3`, etc., will be ingested sequentially. If a new file is added with a name like `file0`, ClickPipes will not ingest it because it is not lexically greater than the last ingested file.

 ## Archive table {#archive-table}
 ClickPipes will create a table next to your destination table with the postfix `s3_clickpipe_<clickpipe_id>_archive`. This table will contain a list of all the files that have been ingested by the ClickPipe. This table is used to track files during ingestion and can be used to verify files have been ingested. The archive table has a [TTL](/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-ttl) of 7 days.
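The lexical-ordering rule in the continuous-ingest paragraph above has a well-known pitfall: string comparison, not numeric comparison, decides what counts as "newer". A small sketch (the file names are hypothetical examples, not from the commit):

```python
# Continuous ingest only picks up files whose names sort lexically AFTER the
# last ingested file. Plain string comparison is used, so 'file10' < 'file2'.
last_ingested = "file2"
candidates = ["file0", "file3", "file10"]

picked = sorted(name for name in candidates if name > last_ingested)
print(picked)  # ['file3'] -- both 'file0' and 'file10' sort before 'file2'
```

This is why zero-padded names (`file0001`, `file0002`, ...) or timestamp-prefixed names are the safe choice: they keep lexical order aligned with creation order.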
@@ -167,6 +169,8 @@ The Service Account permissions attached to the HMAC credentials should be `stor
 ### DigitalOcean Spaces {#dospaces}
 Currently only protected buckets are supported for DigitalOcean spaces. You require an "Access Key" and a "Secret Key" to access the bucket and its files. You can read [this guide](https://docs.digitalocean.com/products/spaces/how-to/manage-access/) on how to create access keys.

+### Azure Blob Storage {#azureblobstorage}
+Currently only protected buckets are supported for Azure Blob Storage. Authentication is done via a connection string, which supports access keys and shared keys. For more information, read [this guide](https://learn.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string).

 ## F.A.Q. {#faq}
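The connection string mentioned in the new Azure Blob Storage section follows the standard Azure Storage `key=value;key=value` shape documented by Microsoft. A minimal sketch with placeholder account name and key (not real credentials, and not ClickPipes code):

```python
# Hypothetical Azure Storage connection string with placeholder credentials.
# The semicolon-separated key=value format is the standard Azure Storage shape.
conn_str = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=myaccount;"
    "AccountKey=bXlmYWtla2V5...;"
    "EndpointSuffix=core.windows.net"
)

# Split into its components to see what the pipe would be configured with.
parts = dict(kv.split("=", 1) for kv in conn_str.split(";"))
print(parts["AccountName"])  # myaccount
```

The `AccountKey` here is the access key referred to in the section; shared access signatures follow the same overall string format with different keys.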
Lines changed: 15 additions & 0 deletions

styles/ClickHouse/Headings.yml

Lines changed: 1 addition & 0 deletions

@@ -33,3 +33,4 @@ exceptions:
 - ReplacingMergeTree
 - AggregatingMergeTree
 - DigitalOcean Spaces
+- Azure Blob Storage
