Commit 7651fc2

updates (#1604)
1 parent e675c9c commit 7651fc2

4 files changed: +145 -122 lines changed

docs/en/guides/40-load-data/02-load-db/addax.md

Lines changed: 11 additions & 1 deletion

````diff
@@ -2,4 +2,14 @@
 title: Addax
 ---
 
-See [Addax](/guides/migrate/mysql#addax).
+[Addax](https://github.com/wgzhao/Addax), originally derived from Alibaba's [DataX](https://github.com/alibaba/DataX), is a versatile open-source ETL (Extract, Transform, Load) tool. It excels at seamlessly transferring data between diverse RDBMS (Relational Database Management Systems) and NoSQL databases, making it an optimal solution for efficient data migration.
+
+For information about the system requirements, download, and deployment steps for Addax, refer to Addax's [Getting Started Guide](https://github.com/wgzhao/Addax#getting-started).
+
+### DatabendReader & DatabendWriter
+
+DatabendReader and DatabendWriter are integrated plugins of Addax, allowing seamless integration with Databend. The DatabendReader plugin enables reading data from Databend. Because Databend is compatible with the MySQL client protocol, you can also use the [MySQLReader](https://wgzhao.github.io/Addax/develop/reader/mysqlreader/) plugin to retrieve data from Databend. For more information about DatabendReader, see https://wgzhao.github.io/Addax/develop/reader/databendreader/.
+
+### Tutorials
+
+- [Migrating from MySQL with Addax](/tutorials/migrate/migrating-from-mysql-with-addax)
````

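To make the DatabendReader & DatabendWriter plugins above concrete: an Addax job is a single JSON file that pairs a reader with a writer. The sketch below follows the classic DataX-style job layout; all connection values are placeholders, and exact parameter names should be verified against the Addax plugin docs for your version.

```json
{
  "job": {
    "setting": {
      "speed": { "channel": 1 }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "mysql_user",
            "password": "mysql_password",
            "column": ["id", "name", "created_at"],
            "connection": [
              {
                "table": ["orders"],
                "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/source_db"]
              }
            ]
          }
        },
        "writer": {
          "name": "databendwriter",
          "parameter": {
            "username": "databend_user",
            "password": "databend_password",
            "column": ["id", "name", "created_at"],
            "connection": [
              {
                "table": ["orders"],
                "jdbcUrl": ["jdbc:databend://127.0.0.1:8000/target_db"]
              }
            ]
          }
        }
      }
    ]
  }
}
```

A job like this is typically launched with `bin/addax.sh path/to/job.json` (script path assumed from the Addax distribution layout).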
docs/en/guides/40-load-data/02-load-db/datax.md

Lines changed: 19 additions & 1 deletion

````diff
@@ -2,4 +2,22 @@
 title: DataX
 ---
 
-See [DataX](/guides/migrate/mysql#datax).
+[DataX](https://github.com/alibaba/DataX) is an open-source data integration tool developed by Alibaba. It is designed to efficiently and reliably transfer data between various data storage systems and platforms, such as relational databases, big data platforms, and cloud storage services. DataX supports a wide range of data sources and sinks, including but not limited to MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, HBase, and MongoDB.
+
+:::tip
+[Apache DolphinScheduler](https://dolphinscheduler.apache.org/) has added support for Databend as a data source. This enhancement enables you to leverage DolphinScheduler for managing DataX tasks and effortlessly load data from MySQL to Databend.
+:::
+
+For information about the system requirements, download, and deployment steps for DataX, refer to DataX's [Quick Start Guide](https://github.com/alibaba/DataX/blob/master/userGuid.md).
+
+### DatabendWriter
+
+DatabendWriter is an integrated plugin of DataX: it comes pre-installed and does not require any manual installation. It acts as a seamless connector that enables the effortless transfer of data from other databases to Databend. With DatabendWriter, you can leverage the capabilities of DataX to efficiently load data from various databases into Databend.
+
+DatabendWriter supports two operational modes: INSERT (default) and REPLACE. In INSERT mode, new data is added while conflicts with existing records are prevented to maintain data integrity. In REPLACE mode, existing records are replaced with newer data in case of conflicts, prioritizing data consistency.
+
+For more information about DatabendWriter and its functionalities, see https://github.com/alibaba/DataX/blob/master/databendwriter/doc/databendwriter.md.
+
+### Tutorials
+
+- [Migrating from MySQL with DataX](/tutorials/migrate/migrating-from-mysql-with-datax)
````

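To illustrate the two DatabendWriter modes described above, the mode is selected in the writer's parameter block of the DataX job file. This is a sketch only: the `writeMode` and `onConflictColumn` names follow the DatabendWriter documentation linked above, and everything else is a placeholder.

```json
"writer": {
  "name": "databendwriter",
  "parameter": {
    "username": "databend_user",
    "password": "databend_password",
    "column": ["id", "name", "updated_at"],
    "writeMode": "replace",
    "onConflictColumn": ["id"],
    "connection": [
      {
        "table": ["orders"],
        "jdbcUrl": "jdbc:databend://127.0.0.1:8000/target_db"
      }
    ]
  }
}
```

Omitting `writeMode` (or setting it to `insert`) gives the default INSERT behavior.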
docs/en/guides/40-load-data/02-load-db/debezium.md

Lines changed: 90 additions & 1 deletion

````diff
@@ -2,4 +2,93 @@
 title: Debezium
 ---
 
-See [Debezium](/guides/migrate/mysql#debezium).
+[Debezium](https://debezium.io/) is a set of distributed services that capture changes in your databases so that your applications can see those changes and respond to them. Debezium records all row-level changes within each database table in a change event stream, and applications simply read these streams to see the change events in the same order in which they occurred.
+
+[debezium-server-databend](https://github.com/databendcloud/debezium-server-databend) is a lightweight CDC tool developed by Databend, based on the Debezium Engine. It captures real-time changes in relational databases and delivers them as event streams, ultimately writing the data into the target database, Databend. The tool provides a simple way to monitor and capture database changes, transforming them into consumable events without the need for large data infrastructures like Flink, Kafka, or Spark.
+
+### Installing debezium-server-databend
+
+debezium-server-databend can be installed independently, without installing Debezium first. You have two options: build it from source, or use Docker for a more straightforward setup.
+
+#### Installing debezium-server-databend from Source
+
+Before you start, make sure JDK 11 and Maven are installed on your system.
+
+1. Clone the project:
+
+```bash
+git clone https://github.com/databendcloud/debezium-server-databend.git
+```
+
+2. Change into the project's root directory:
+
+```bash
+cd debezium-server-databend
+```
+
+3. Build and package the Debezium server:
+
+```bash
+mvn -Passembly -Dmaven.test.skip package
+```
+
+4. Once the build completes, unzip the server distribution package:
+
+```bash
+unzip debezium-server-databend-dist/target/debezium-server-databend-dist*.zip -d databendDist
+```
+
+5. Enter the extracted folder:
+
+```bash
+cd databendDist
+```
+
+6. Create a file named _application.properties_ in the _conf_ folder with the content in the sample [here](https://github.com/databendcloud/debezium-server-databend/blob/main/debezium-server-databend-dist/src/main/resources/distro/conf/application.properties.example), and modify the configurations according to your specific requirements. For a description of the available parameters, see this [page](https://github.com/databendcloud/debezium-server-databend/blob/main/docs/docs.md).
+
+```bash
+nano conf/application.properties
+```
+
+7. Use the provided script to start the tool:
+
+```bash
+bash run.sh
+```
+
+#### Installing debezium-server-databend with Docker
+
+Before you start, make sure Docker and Docker Compose are installed on your system.
+
+1. Create a file named _application.properties_ in the _conf_ folder with the content in the sample [here](https://github.com/databendcloud/debezium-server-databend/blob/main/debezium-server-databend-dist/src/main/resources/distro/conf/application.properties.example), and modify the configurations according to your specific requirements. For a description of the available Databend parameters, see this [page](https://github.com/databendcloud/debezium-server-databend/blob/main/docs/docs.md).
+
+```bash
+nano conf/application.properties
+```
+
+2. Create a file named _docker-compose.yml_ with the following content:
+
+```yaml
+version: '2.1'
+services:
+  debezium:
+    image: ghcr.io/databendcloud/debezium-server-databend:pr-2
+    ports:
+      - "8080:8080"
+      - "8083:8083"
+    volumes:
+      - $PWD/conf:/app/conf
+      - $PWD/data:/app/data
+```
+
+3. Open a terminal or command-line interface and navigate to the directory containing the _docker-compose.yml_ file.
+
+4. Use the following command to start the tool:
+
+```bash
+docker-compose up -d
+```
+
+### Tutorials
+
+- [Migrating from MySQL with Debezium](/tutorials/migrate/migrating-from-mysql-with-debezium)
````

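For orientation, an _application.properties_ for the MySQL-to-Databend pipeline described above combines standard Debezium source settings with Databend sink settings. The sink keys below are modeled on the project's example file and may differ between versions, so treat this as a sketch and verify against the linked parameter reference.

```properties
# Sink: write change events into Databend (key names modeled on the example file).
debezium.sink.type=databend
debezium.sink.databend.upsert=true
debezium.sink.databend.database.databaseName=debezium
debezium.sink.databend.database.url=jdbc:databend://127.0.0.1:8000
debezium.sink.databend.database.username=databend
debezium.sink.databend.database.password=databend

# Source: standard Debezium MySQL connector settings.
debezium.source.connector.class=io.debezium.connector.mysql.MySqlConnector
debezium.source.database.hostname=127.0.0.1
debezium.source.database.port=3306
debezium.source.database.user=root
debezium.source.database.password=123456
debezium.source.database.include.list=mydb
debezium.source.table.include.list=mydb.products
debezium.source.offset.storage.file.filename=data/offsets.dat
debezium.source.offset.flush.interval.ms=60000
```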
docs/en/guides/41-migrate/mysql.md

Lines changed: 25 additions & 119 deletions

````diff
@@ -4,136 +4,42 @@ title: MySQL
 
 This guide introduces how to migrate data from MySQL to Databend. Databend supports two main migration approaches: batch loading and continuous data sync.
 
-## Batch Loading
-
-To migrate data from MySQL to Databend in batches, you can use tools such as Addax or DataX.
-
-### Addax
-
-[Addax](https://github.com/wgzhao/Addax), originally derived from Alibaba's [DataX](https://github.com/alibaba/DataX), is a versatile open-source ETL (Extract, Transform, Load) tool. It excels at seamlessly transferring data between diverse RDBMS (Relational Database Management Systems) and NoSQL databases, making it an optimal solution for efficient data migration.
-
-For information about the system requirements, download, and deployment steps for Addax, refer to Addax's [Getting Started Guide](https://github.com/wgzhao/Addax#getting-started). The guide provides detailed instructions and guidelines for setting up and using Addax.
-
-#### DatabendReader & DatabendWriter
-
-DatabendReader and DatabendWriter are integrated plugins of Addax, allowing seamless integration with Databend. The DatabendReader plugin enables reading data from Databend. Databend provides compatibility with the MySQL client protocol, so you can also use the [MySQLReader](https://wgzhao.github.io/Addax/develop/reader/mysqlreader/) plugin to retrieve data from Databend. For more information about DatabendReader, see https://wgzhao.github.io/Addax/develop/reader/databendreader/
-
-### DataX
-
-[DataX](https://github.com/alibaba/DataX) is an open-source data integration tool developed by Alibaba. It is designed to efficiently and reliably transfer data between various data storage systems and platforms, such as relational databases, big data platforms, and cloud storage services. DataX supports a wide range of data sources and data sinks, including but not limited to MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, HBase, MongoDB, and more.
-
-:::tip
-[Apache DolphinScheduler](https://dolphinscheduler.apache.org/) now has added support for Databend as a data source. This enhancement enables you to leverage DolphinScheduler for managing DataX tasks and effortlessly load data from MySQL to Databend.
-:::
-
-For information about the system requirements, download, and deployment steps for DataX, refer to DataX's [Quick Start Guide](https://github.com/alibaba/DataX/blob/master/userGuid.md). The guide provides detailed instructions and guidelines for setting up and using DataX.
-
-#### DatabendWriter
-
-DatabendWriter is an integrated plugin of DataX, which means it comes pre-installed and does not require any manual installation. It acts as a seamless connector that enables the effortless transfer of data from other databases to Databend. With DatabendWriter, you can leverage the capabilities of DataX to efficiently load data from various databases into Databend.
-
-DatabendWriter supports two operational modes: INSERT (default) and REPLACE. In INSERT Mode, new data is added while conflicts with existing records are prevented to maintain data integrity. On the other hand, the REPLACE Mode prioritizes data consistency by replacing existing records with newer data in case of conflicts.
-
-If you need more information about DatabendWriter and its functionalities, you can refer to the documentation available at https://github.com/alibaba/DataX/blob/master/databendwriter/doc/databendwriter.md
-
-## Continuous Sync with CDC
-
-To migrate data from MySQL to Databend in real-time, you can use Change Data Capture (CDC) tools such as Debezium or Flink CDC.
-
-### Debezium
-
-[Debezium](https://debezium.io/) is a set of distributed services to capture changes in your databases so that your applications can see those changes and respond to them. Debezium records all row-level changes within each database table in a change event stream, and applications simply read these streams to see the change events in the same order in which they occurred.
-
-[debezium-server-databend](https://github.com/databendcloud/debezium-server-databend) is a lightweight CDC tool developed by Databend, based on Debezium Engine. Its purpose is to capture real-time changes in relational databases and deliver them as event streams to ultimately write the data into the target database Databend. This tool provides a simple way to monitor and capture database changes, transforming them into consumable events without the need for large data infrastructures like Flink, Kafka, or Spark.
-
-debezium-server-databend can be installed independently without the need for installing Debezium beforehand. Once you have decided to install debezium-server-databend, you have two options available. The first one is to install it from source by downloading the source code and building it yourself. Alternatively, you can opt for a more straightforward installation process using Docker.
-
-#### Installing debezium-server-databend from Source
-
-Before you start, make sure JDK 11 and Maven are installed on your system.
-
-1. Clone the project:
+| Migration Approach       | Recommended Tool             | Supported MySQL Versions |
+|--------------------------|------------------------------|--------------------------|
+| Batch Loading            | db-archiver                  | All MySQL versions       |
+| Continuous Sync with CDC | Apache Flink CDC (16.1–17.1) | MySQL 8.0 or below       |
 
-```bash
-git clone https://github.com/databendcloud/debezium-server-databend.git
-```
-
-2. Change into the project's root directory:
-
-```bash
-cd debezium-server-databend
-```
-
-3. Build and package debezium server:
-
-```go
-mvn -Passembly -Dmaven.test.skip package
-```
-
-4. Once the build is completed, unzip the server distribution package:
-
-```bash
-unzip debezium-server-databend-dist/target/debezium-server-databend-dist*.zip -d databendDist
-```
-
-5. Enter the extracted folder:
-
-```bash
-cd databendDist
-```
-
-6. Create a file named _application.properties_ in the _conf_ folder with the content in the sample [here](https://github.com/databendcloud/debezium-server-databend/blob/main/debezium-server-databend-dist/src/main/resources/distro/conf/application.properties.example), and modify the configurations according to your specific requirements. For description of the available parameters, see this [page](https://github.com/databendcloud/debezium-server-databend/blob/main/docs/docs.md).
-
-```bash
-nano conf/application.properties
-```
-
-7. Use the provided script to start the tool:
-
-```bash
-bash run.sh
-```
+## Batch Loading
 
-#### Installing debezium-server-databend with Docker
+Databend recommends using db-archiver for batch migration from MySQL.
 
-Before you start, make sure Docker and Docker Compose are installed on your system.
+### db-archiver
 
-1. Create a file named _application.properties_ in the _conf_ folder with the content in the sample [here](https://github.com/databendcloud/debezium-server-databend/blob/main/debezium-server-databend-dist/src/main/resources/distro/conf/application.properties.example), and modify the configurations according to your specific requirements. For description of the available Databend parameters, see this [page](https://github.com/databendcloud/debezium-server-databend/blob/main/docs/docs.md).
+[db-archiver](https://github.com/databendcloud/db-archiver) is a migration tool developed by Databend that supports migrating data from various databases, including all versions of MySQL.
 
-```bash
-nano conf/application.properties
-```
+To install db-archiver:
 
-2. Create a file named _docker-compose.yml_ with the following content:
-
-```dockerfile
-version: '2.1'
-services:
-  debezium:
-    image: ghcr.io/databendcloud/debezium-server-databend:pr-2
-    ports:
-      - "8080:8080"
-      - "8083:8083"
-    volumes:
-      - $PWD/conf:/app/conf
-      - $PWD/data:/app/data
+```shell
+go install github.com/databendcloud/db-archiver/cmd@latest
 ```
 
-3. Open a terminal or command-line interface and navigate to the directory containing the _docker-compose.yml_ file.
+For more details about db-archiver, visit the [GitHub repository](https://github.com/databendcloud/db-archiver). To see how it works in practice, check out this tutorial: [Migrating from MySQL with db-archiver](/tutorials/migrate/migrating-from-mysql-with-db-archiver).
 
-4. Use the following command to start the tool:
+## Continuous Sync with CDC
 
-```bash
-docker-compose up -d
-```
+Databend recommends using Flink CDC for real-time CDC migration from MySQL.
 
 ### Flink CDC
 
-[Apache Flink](https://github.com/apache/flink) CDC (Change Data Capture) refers to the capability of Apache Flink to capture and process real-time data changes from various sources using SQL-based queries. CDC allows you to monitor and capture data modifications (inserts, updates, and deletes) happening in a database or streaming system and react to those changes in real time. You can utilize the [Flink SQL connector for Databend](https://github.com/databendcloud/flink-connector-databend) to load data from other databases in real-time into Databend. The Flink SQL connector for Databend offers a connector that integrates Flink's stream processing capabilities with Databend. By configuring this connector, you can capture data changes from various databases as streams and load them into Databend for processing and analysis in real-time.
+[Apache Flink](https://github.com/apache/flink) CDC (Change Data Capture) refers to the capability of Apache Flink to capture and process real-time data changes from various sources using SQL-based queries. CDC allows you to monitor and capture data modifications (inserts, updates, and deletes) happening in a database or streaming system and react to those changes in real time.
+
+You can utilize the [Flink SQL connector for Databend](https://github.com/databendcloud/flink-connector-databend) to load data from other databases into Databend in real time. The connector integrates Flink's stream processing capabilities with Databend: by configuring it, you can capture data changes from various databases as streams and load them into Databend for real-time processing and analysis.
 
-#### Downloading & Installing Connector
+- Only Apache Flink CDC versions 16.1 to 17.1 are supported for migrating from MySQL to Databend.
+- Only migration from MySQL version 8.0 or below is supported.
+- The Flink Databend Connector requires Java 8 or 11.
 
-To download and install the Flink SQL connector for Databend, follow these steps:
+To download and install the Flink SQL connector for Databend:
 
 1. Download and set up Flink: Before installing the Flink SQL connector for Databend, ensure that you have downloaded and set up Flink on your system. You can download Flink from the official website: https://flink.apache.org/downloads/
 
@@ -147,11 +53,11 @@ To download and install the Flink SQL connector for Databend, follow these steps
 mvn clean install -DskipTests
 ```
 
-3. Move the JAR file: Once you have downloaded the connector, move the JAR file to the lib folder in your Flink installation directory. For example, if you have Flink version 1.16.0 installed, move the JAR file to the flink-1.16.0/lib/ directory.
+3. Move the JAR file: Once you have downloaded the connector, move the JAR file to the lib folder in your Flink installation directory. For example, if you have Flink version 1.16.1 installed, move the JAR file to the `flink-1.16.1/lib/` directory.
+
+For more details about the Flink SQL connector for Databend, visit the [GitHub repository](https://github.com/databendcloud/flink-connector-databend). To see how it works in practice, check out this tutorial: [Migrating from MySQL with Flink CDC](/tutorials/migrate/migrating-from-mysql-with-flink-cdc).
 
 ## Tutorials
 
-- [Migrating from MySQL with Addax](/tutorials/migrate/migrating-from-mysql-with-addax)
-- [Migrating from MySQL with DataX](/tutorials/migrate/migrating-from-mysql-with-datax)
-- [Migrating from MySQL with Debezium](/tutorials/migrate/migrating-from-mysql-with-debezium)
+- [Migrating from MySQL with db-archiver](/tutorials/migrate/migrating-from-mysql-with-db-archiver)
 - [Migrating from MySQL with Flink CDC](/tutorials/migrate/migrating-from-mysql-with-flink-cdc)
````

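To show what a db-archiver run from the new Batch Loading section looks like in practice, the tool reads a JSON config describing the MySQL source and the Databend target. The field names below are modeled on the example in the project's README and may change between releases, so verify them against the repository before use.

```json
{
  "sourceHost": "127.0.0.1",
  "sourcePort": 3306,
  "sourceUser": "root",
  "sourcePass": "123456",
  "sourceDB": "mydb",
  "sourceTable": "orders",
  "sourceSplitKey": "id",
  "sourceWhereCondition": "id > 0",
  "databendDSN": "databend://user:password@127.0.0.1:8000",
  "databendTable": "default.orders",
  "batchSize": 10000
}
```

The migration is then started with something like `db-archiver -f config.json` (flag assumed from the README).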
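To make the Flink CDC flow from the Continuous Sync section concrete, a typical pipeline in the Flink SQL client is two table definitions plus a continuous INSERT. The `mysql-cdc` options are standard Flink CDC connector options; the `databend` sink options are assumptions based on the flink-connector-databend repository, so check them against the connector's README.

```sql
-- Source table: reads the MySQL binlog via the Flink CDC connector.
CREATE TABLE mysql_orders (
  id INT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '127.0.0.1',
  'port' = '3306',
  'username' = 'root',
  'password' = '123456',
  'database-name' = 'mydb',
  'table-name' = 'orders'
);

-- Sink table: writes into Databend (option names assumed from the connector repo).
CREATE TABLE databend_orders (
  id INT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'databend',
  'url' = 'databend://127.0.0.1:8000',
  'database-name' = 'default',
  'table-name' = 'orders',
  'username' = 'databend',
  'password' = 'databend'
);

-- Continuously replicate inserts, updates, and deletes.
INSERT INTO databend_orders SELECT * FROM mysql_orders;
```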