Commit 3121202

Add ClickHouse Spark installation guide
1 parent e20ea3b commit 3121202

1 file changed

+44
-5
lines changed

docs/integrations/data-ingestion/aws-glue/index.md

Lines changed: 44 additions & 5 deletions
@@ -3,20 +3,59 @@ sidebar_label: 'Amazon Glue'
33
sidebar_position: 1
44
slug: /integrations/glue
55
description: 'Integrate ClickHouse and Amazon Glue'
6-
keywords: ['clickhouse', 'amazon', 'aws', 'glue', 'migrating', 'data']
7-
title: 'Integrating Amazon Glue with ClickHouse'
6+
keywords: ['clickhouse', 'amazon', 'aws', 'glue', 'migrating', 'data', 'spark']
7+
title: 'Integrating Amazon Glue with ClickHouse and Spark'
88
---
9-
9+
import Image from '@theme/IdealImage';
1010
import Tabs from '@theme/Tabs';
1111
import TabItem from '@theme/TabItem';
12+
import notebook_connections_config from '@site/static/images/integrations/data-ingestion/aws-glue/notebook-connections-config.png';
13+
import dependent_jars_path_option from '@site/static/images/integrations/data-ingestion/aws-glue/dependent_jars_path_option.png';
1214

13-
# Integrating Amazon Glue with ClickHouse
15+
# Integrating Amazon Glue with ClickHouse and Spark
1416

1517
[Amazon Glue](https://aws.amazon.com/glue/) is a fully managed, serverless data integration service provided by Amazon Web Services (AWS). It simplifies the process of discovering, preparing, and transforming data for analytics, machine learning, and application development.
1618

19+
## Installation
20+
21+
To integrate your Glue code with ClickHouse, use our official Spark connector in Glue in one of the following ways:
22+
- Installing the ClickHouse Glue connector from the AWS Marketplace (recommended).
23+
- Manually adding the Spark connector's JARs to your Glue job.
24+
25+
<Tabs>
26+
<TabItem value="AWS Marketplace" label="AWS Marketplace" default>
27+
28+
1. ### Subscribe to the Connector
29+
To access the connector in your account, subscribe to the ClickHouse AWS Glue Connector from AWS Marketplace.
30+
31+
2. ### Grant Required Permissions
32+
Ensure your Glue job’s IAM role has the necessary permissions, as described in the minimum privileges [guide](https://docs.aws.amazon.com/glue/latest/dg/getting-started-min-privs-job.html#getting-started-min-privs-connectors).
33+
34+
3. ### Activate the Connector & Create a Connection
35+
You can activate the connector and create a connection directly by clicking [this link](https://console.aws.amazon.com/gluestudio/home#/connector/add-connection?connectorName="ClickHouse%20AWS%20Glue%20Connector"&connectorType="Spark"&connectorUrl=https://709825985650.dkr.ecr.us-east-1.amazonaws.com/clickhouse/clickhouse-glue:0.1&connectorClassName="com.clickhouse.spark.ClickHouseCatalog"), which opens the Glue connector creation page with key fields pre-filled.
36+
37+
4. ### Set Up a Connection
38+
Create a new Glue connection using the connector, providing your ClickHouse JDBC URL and credentials.
39+
40+
5. ### Use in Glue Job
41+
In your Glue job, select the `Job details` tab and expand the `Advanced properties` section. Under `Connections`, select the connection you just created. The connector automatically injects the required JARs into the job runtime.
42+
43+
<Image img={notebook_connections_config} size='md' alt='Glue Notebook connections config' />
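
The connection in step 4 asks for a ClickHouse JDBC URL. As a rough sketch of what that value can look like, the helper below assembles one. The function name, the host, and the defaults (port `8443` with `ssl=true`, database `default`) are assumptions chosen for illustration, not values taken from this guide; adjust them to your own service.

```python
from urllib.parse import quote, urlencode


def build_clickhouse_jdbc_url(host, port=8443, database="default", ssl=True):
    """Assemble a URL shaped like jdbc:clickhouse://host:port/database?ssl=true.

    Hypothetical helper: the defaults are assumptions, not requirements
    of the Glue connector.
    """
    if not host:
        raise ValueError("host must not be empty")
    # Percent-encode the database name so unusual characters stay valid in the path.
    url = f"jdbc:clickhouse://{host}:{port}/{quote(database, safe='')}"
    params = {"ssl": "true"} if ssl else {}
    if params:
        url += "?" + urlencode(params)
    return url


# Placeholder host, used only for illustration.
print(build_clickhouse_jdbc_url("my-service.example.com"))
```

Paste the resulting string into the connection's JDBC URL field, and supply the username and password separately in the connection form rather than embedding them in the URL.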
44+
45+
</TabItem>
46+
<TabItem value="Manual Installation" label="Manual Installation">
47+
To add the required JARs manually, follow these steps:
48+
1. Upload the following JARs to an S3 bucket: `clickhouse-jdbc-0.6.X-all.jar` and `clickhouse-spark-runtime-3.X_2.X-0.8.X.jar`.
49+
2. Make sure the Glue job's IAM role has read access to this bucket.
50+
3. Under the `Job details` tab, scroll down, expand the `Advanced properties` drop-down, and enter the JAR paths in `Dependent JARs path`:
51+
52+
<Image img={dependent_jars_path_option} size='md' alt='Glue Notebook JAR path options' />
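
The value for `Dependent JARs path` in step 3 can be built from the uploaded file names. The sketch below assumes the field accepts a comma-separated list of full S3 URIs. The bucket name, the prefix, and the helper itself are hypothetical, and the `X` version placeholders are kept exactly as written above, so replace them with the real file names you uploaded.

```python
def build_dependent_jars_path(bucket, prefix, jar_names):
    """Join one S3 URI per JAR into a single comma-separated string.

    Hypothetical helper; assumes the Glue field takes full S3 URIs
    separated by commas.
    """
    if not jar_names:
        raise ValueError("at least one JAR is required")
    clean_prefix = prefix.strip("/")
    base = f"s3://{bucket}/{clean_prefix}" if clean_prefix else f"s3://{bucket}"
    return ",".join(f"{base}/{name}" for name in jar_names)


# File names as listed in step 1; the X placeholders are not real versions.
JARS = [
    "clickhouse-jdbc-0.6.X-all.jar",
    "clickhouse-spark-runtime-3.X_2.X-0.8.X.jar",
]

# Placeholder bucket and prefix, used only for illustration.
print(build_dependent_jars_path("my-glue-artifacts", "jars/", JARS))
```

Generating the string this way avoids stray spaces or a missing `s3://` scheme when several JARs are listed.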
53+
54+
</TabItem>
55+
</Tabs>
1756

18-
Although there is no Glue ClickHouse connector available yet, the official JDBC connector can be leveraged to connect and integrate with ClickHouse:
1957

58+
## Example
2059
<Tabs>
2160
<TabItem value="Java" label="Java" default>
2261
