Commit 41cfb69

Add Integrations section (#92)
1 parent 77aaf8f commit 41cfb69

4 files changed

+303
-0
lines changed

docs/cluster/integrations.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
1+
(cluster-integrations)=
2+
# Integrations
3+
4+
CrateDB Cloud simplifies data ingestion with fully managed integrations from
5+
external data sources. Unlike traditional import jobs, integrations run
6+
continuously, automatically importing new data into your CrateDB Cloud cluster
7+
as it becomes available. This makes them ideal for real-time data
8+
synchronization and keeping your database up to date with external systems.
9+
Because CrateDB Cloud operates them for you, integrations eliminate the need to
10+
set up or maintain separate ETL pipelines.
11+
12+
13+
```{figure} ../_assets/img/integrations-example.png
14+
:width: 600px
15+
:align: center
16+
:alt: Integration in CrateDB Cloud
17+
```
18+
19+
:::{toctree}
20+
:maxdepth: 1
21+
:hidden:
22+
23+
MongoDB CDC (Preview) <integrations/mongo-cdc>
24+
:::
25+
26+
---
27+
28+
:::
29+
## Key Concepts
30+
:::
31+
32+
:::
33+
### Integration
34+
:::
35+
36+
An integration in CrateDB Cloud automatically imports data from an external data
37+
source into a table within your CrateDB Cloud cluster. It uses a secure
38+
connection to ensure data privacy and can run continuously to handle updates
39+
in real time. You can create multiple integrations for the same data source to
40+
support different tables or configurations.
41+
42+
Currently, CrateDB Cloud supports ingestion from the following data source:
43+
- {ref}`MongoDB CDC (Preview) <integrations-mongo-cdc>`
44+
45+
More integrations are planned for future releases to expand the range of
46+
supported data sources and use cases.
47+
48+
:::
49+
### Connection
50+
:::
51+
52+
A "Connection" in CrateDB Cloud associates authentication credentials with a
53+
specific data source. This allows secure access to the external system and is
54+
reusable across multiple integrations. By setting up a connection once, you can
55+
streamline the process of creating and managing integrations without having to
56+
re-enter credentials for each one.
Lines changed: 246 additions & 0 deletions
@@ -0,0 +1,246 @@
1+
(integrations-mongo-cdc)=
2+
# MongoDB CDC (Preview)
3+
4+
CrateDB Cloud enables continuous data ingestion from MongoDB using Change Data
5+
Capture (CDC), providing seamless, real-time synchronization of your data.
6+
7+
:::{caution}
8+
This integration is currently in preview and may have restricted availability.
9+
For more information, please [contact us](https://cratedb.com/contact).
10+
:::
11+
12+
## Key Concepts
13+
14+
The MongoDB CDC integration in CrateDB Cloud allows you to keep your data
15+
synchronized between your MongoDB Atlas cluster and your CrateDB Cloud cluster
16+
in real time.
17+
18+
### How It Works
19+
20+
The integration functions in two main stages:
21+
22+
1. **Initial Sync:**
23+
The integration performs a complete scan of your MongoDB collections,
24+
importing all existing data into your CrateDB Cloud cluster.
25+
26+
2. **Continuous Sync:**
27+
The integration uses MongoDB Change Streams to monitor changes in your
28+
MongoDB collections and syncs these updates to your CrateDB Cloud cluster
29+
in real time, ensuring that your data remains current.
30+
31+
### Data Consistency and Mode
32+
33+
For continuous sync, CrateDB Cloud uses MongoDB's **full document mode** to
34+
ensure data consistency. This mode guarantees that MongoDB returns the latest
35+
majority-committed version of the updated document.
36+
37+
Receiving only partial deltas would be more efficient, but full document mode is
38+
more robust: every update event carries the complete document rather than a list
39+
of changed fields. Support for partial deltas may be added in the future to
improve performance and flexibility.
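
To make the trade-off concrete, here is a minimal sketch in plain Python (no MongoDB driver involved) of how a consumer might apply change events to a local copy. The event fields `operationType`, `documentKey`, and `fullDocument` follow MongoDB's change event format; the `apply_event` helper and the in-memory `table` are hypothetical and do not represent CrateDB Cloud's actual implementation.

```python
from typing import Any, Dict

Document = Dict[str, Any]


def apply_event(table: Dict[Any, Document], event: Document) -> None:
    """Apply one change event to an in-memory copy keyed by `_id`."""
    key = event["documentKey"]["_id"]
    op = event["operationType"]
    if op in ("insert", "update", "replace"):
        full = event.get("fullDocument")
        # In full document mode the event carries the whole document, so
        # the stored copy is overwritten instead of patched field by field.
        # The lookup can come back empty if the document was deleted in
        # the meantime; such an event is skipped here.
        if full is not None:
            table[key] = dict(full)
    elif op == "delete":
        table.pop(key, None)


# Hypothetical sample events for illustration only.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "name": "sensor-a", "temp": 20.5}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "name": "sensor-a", "temp": 21.0}},
    {"operationType": "delete", "documentKey": {"_id": 1}},
]

table: Dict[Any, Document] = {}
for event in events[:2]:
    apply_event(table, event)
```

With partial deltas, the `update` branch would instead have to merge a set of changed and removed fields into the stored copy, which is more efficient on the wire but harder to get right.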
40+
41+
---
42+
43+
## Create a new Integration
44+
A MongoDB integration allows you to sync a single collection from a MongoDB
45+
Atlas cluster. You can reuse an existing connection across multiple integrations
46+
to continuously sync data from multiple MongoDB Atlas collections.
47+
48+
Supported authentication methods:
49+
- MongoDB SCRAM Authentication
50+
- MongoDB X.509 Authentication
51+
52+
53+
### Set Up MongoDB Atlas Authentication
54+
55+
The following steps should be performed in the MongoDB Atlas UI.
56+
57+
#### Step 1: Create a Custom Role
58+
1. **Navigate to Database Access**
59+
Go to **Database Access** in the MongoDB Atlas UI for the cluster you want to
60+
connect to CrateDB Cloud.
61+
62+
2. **Add a Custom Role**
63+
Under **Custom Roles**, click **Add New Custom Role**.
64+
65+
3. **Set Up Read-Only Access**
66+
Assign the following actions or roles to the custom role:
67+
- `find`
68+
- `changeStream`
69+
- `collStats`
70+
71+
Specify the databases and collections for these actions. You can update
72+
access permissions in the MongoDB Atlas UI later if needed.
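
For reference, the same permissions can be written down as a privilege document. The sketch below builds a dictionary in the shape used by MongoDB's `createRole` command; the role name `cratedb_cdc_reader` and the database and collection names are hypothetical, and in Atlas you would still create the role through the UI as described above.

```python
from typing import Any, Dict, List

# The three read-only actions listed in the step above.
CDC_ACTIONS = ["find", "changeStream", "collStats"]


def build_role(name: str, db: str, collections: List[str]) -> Dict[str, Any]:
    """Return a createRole-style document granting read-only CDC access."""
    return {
        "createRole": name,
        "privileges": [
            {
                "resource": {"db": db, "collection": coll},
                "actions": list(CDC_ACTIONS),
            }
            for coll in collections
        ],
        # No inherited roles: the custom role carries only these actions.
        "roles": [],
    }


# Hypothetical names, for illustration only.
role = build_role("cratedb_cdc_reader", "shop", ["orders", "customers"])
```

Scoping each privilege to a specific database and collection keeps the role limited to the data you intend to sync.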
73+
74+
75+
#### Step 2: Create a User
76+
77+
Depending on whether you plan to use SCRAM or X.509 authentication, create a
78+
database user with one of the following methods:
79+
80+
:::{tab} SCRAM Authentication
81+
82+
1. **Navigate to Database Access**
83+
In the MongoDB Atlas UI, go to **Database Access** and click **Add New
84+
Database User**.
85+
86+
2. **Set Authentication Method**
87+
Choose **Password** as the authentication method and enter a username and
88+
password for the database user.
89+
90+
3. **Assign the Role**
91+
Under **Database User Privileges**, select the custom role created in Step 1.
92+
93+
4. **Copy User Credentials**
94+
Click **Add User**, and make sure to record the username and password. These
95+
credentials will be used later in the CrateDB Cloud Console.
96+
:::
97+
98+
:::{tab} X.509 Authentication
99+
100+
1. **Navigate to Database Access**
101+
In the MongoDB Atlas UI, go to **Database Access** and click **Add New
102+
Database User**.
103+
104+
2. **Set Authentication Method**
105+
Choose **Certificate** as the authentication method.
106+
107+
3. **Assign the Role**
108+
Under **Database User Privileges**, select the custom role created in Step 1.
109+
110+
4. **Save the Certificate**
111+
Click **Add User**, and store the certificate securely. This will be required
112+
later in the CrateDB Cloud Console.
113+
:::
114+
115+
116+
#### Step 3: Configure IP Access
117+
118+
To allow CrateDB Cloud to access your MongoDB Atlas cluster, you must add the
119+
CrateDB Cloud IP addresses to the IP Access List in MongoDB Atlas.
120+
121+
1. **Navigate to Network Access**
122+
In the MongoDB Atlas UI, go to **Network Access** from the left navigation.
123+
124+
2. **Add IP Address**
125+
Click **Add IP Address** and choose an IP address or range to allow access.
126+
For testing purposes, you can select **Allow Access from Anywhere**, but for
127+
production, it is recommended to specify only the required IPs.
128+
129+
:::{note}
130+
The specific IP addresses depend on the region of your CrateDB Cloud cluster.
131+
These IP addresses can also be found in the **Connection Details** section of the
132+
CrateDB Cloud Console, just before you click **Test Connection** during the
133+
setup process.
134+
135+
**Outbound IP Addresses**:
136+
137+
| Cloud Provider | Region | IP Addresses |
138+
|----------------|---------------|---------------------------------|
139+
| Azure | East US 2 | `52.184.241.228/32`, `52.254.31.90/32` |
140+
| Azure | West Europe | `51.105.153.175/32`, `108.142.34.5/32` |
141+
| AWS | EU West 1 | `34.255.75.224` |
142+
| AWS | US East 1 | `54.197.229.58` |
143+
| AWS | US West 2 | `54.189.16.20` |
144+
| GCP | US Central 1 | `34.69.134.49` |
145+
146+
:::
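
Some addresses in the table are given in CIDR notation and others as bare addresses. As a small sketch, Python's standard `ipaddress` module can normalize both forms into `/32` entries before you add them to the Atlas IP Access List. The `OUTBOUND_IPS` dictionary copies two rows from the table above; its name and layout are illustrative only.

```python
import ipaddress
from typing import Dict, List, Tuple

# Two rows copied from the table above; extend as needed for your region.
OUTBOUND_IPS: Dict[Tuple[str, str], List[str]] = {
    ("Azure", "West Europe"): ["51.105.153.175/32", "108.142.34.5/32"],
    ("AWS", "EU West 1"): ["34.255.75.224"],
}


def access_list_entries(provider: str, region: str) -> List[str]:
    """Return the region's outbound addresses as normalized CIDR strings."""
    return [
        # ip_network() accepts both "a.b.c.d" and "a.b.c.d/32" and
        # renders a bare address as a /32 network.
        str(ipaddress.ip_network(ip))
        for ip in OUTBOUND_IPS[(provider, region)]
    ]


entries = access_list_entries("AWS", "EU West 1")
```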
147+
148+
:::{note}
149+
To set up a PrivateLink connection for the MongoDB CDC integration, please reach
150+
out to our support team.
151+
:::
152+
153+
154+
#### Step 4: Access Connection String
155+
156+
You’ll need to provide the connection string for your MongoDB Atlas cluster so
157+
that CrateDB Cloud can connect to it.
158+
159+
1. **Navigate to Your Cluster**
160+
In the MongoDB Atlas UI, navigate to the cluster you want to connect to CrateDB Cloud.
161+
162+
2. **Click "Connect"**
163+
From the cluster view, click on **Connect**.
164+
165+
3. **Select "Connect Your Application"**
166+
Choose **Connect your application** as the connection method.
167+
168+
4. **Copy the Connection String**
169+
Copy the connection string provided in the MongoDB Atlas UI. It will look like this:
170+
171+
```
172+
mongodb+srv://<username>:<password>@<cluster-host>/?retryWrites=true&w=majority
173+
```
174+
175+
---
176+
177+
:::{note}
178+
If you are using X.509 authentication, the connection string will look slightly
179+
different and will not include a username and password. Instead, it sets
180+
`authMechanism=MONGODB-X509`, and the certificate is supplied separately:
181+
182+
```
183+
mongodb+srv://<cluster-host>/?authMechanism=MONGODB-X509&retryWrites=true&w=majority
184+
```
185+
186+
Make sure to upload the X.509 certificate file when configuring the connection
187+
in CrateDB Cloud.
188+
:::
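
Before pasting a connection string into the CrateDB Cloud Console, it can help to check which of the two forms you copied. The following is a minimal sketch using only Python's standard library; the helper name `describe_connection_string` and the host `cluster0.example.mongodb.net` are hypothetical, and the check is a convenience, not part of CrateDB Cloud.

```python
from typing import Any, Dict
from urllib.parse import parse_qs, urlsplit


def describe_connection_string(uri: str) -> Dict[str, Any]:
    """Summarize an Atlas SRV connection string without connecting."""
    parts = urlsplit(uri)
    if parts.scheme != "mongodb+srv":
        raise ValueError("expected a mongodb+srv:// connection string")
    params = parse_qs(parts.query)
    mechanism = params.get("authMechanism", [""])[0]
    return {
        "host": parts.hostname,
        # X.509 strings carry authMechanism=MONGODB-X509 and no user info.
        "uses_x509": mechanism == "MONGODB-X509",
        "has_credentials": parts.username is not None,
    }


# Hypothetical example values.
scram = describe_connection_string(
    "mongodb+srv://app_user:secret@cluster0.example.mongodb.net/"
    "?retryWrites=true&w=majority"
)
x509 = describe_connection_string(
    "mongodb+srv://cluster0.example.mongodb.net/"
    "?authMechanism=MONGODB-X509&retryWrites=true&w=majority"
)
```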
189+
190+
191+
192+
### Set Up Integration in CrateDB Cloud
193+
194+
Follow these steps in the CrateDB Cloud Console to set up the MongoDB CDC integration:
195+
196+
#### Step 1: Create an Integration
197+
1. Navigate to the **Import** section in the CrateDB Cloud Console.
198+
2. Click **Create Integration** and select **MongoDB** as the source type.
199+
200+
#### Step 2: Configure Connection
201+
1. Choose **Create New Connection** or select an existing one.
202+
2. Fill in the following details:
203+
:::{tab} SCRAM Authentication
204+
- **Connection Name**: Provide a unique name for the connection.
205+
- **Connection String**: Paste the connection string from MongoDB Atlas.
206+
- **Username**: Enter the database username (required for SCRAM).
207+
- **Password**: Enter the database password (required for SCRAM).
208+
- **Default Database**: Specify the default database to use for this connection.
209+
:::
210+
:::{tab} X.509 Authentication
211+
- **Connection Name**: Provide a unique name for the connection.
212+
- **Connection String**: Paste the connection string from MongoDB Atlas.
213+
- **Certificate**: Upload the X.509 certificate file.
214+
- **Default Database**: Specify the default database to use for this connection.
215+
:::
216+
217+
#### Step 3: Test the Connection
218+
Click **Test Connection** to verify CrateDB Cloud can connect to your MongoDB
219+
Atlas cluster. Resolve any issues if the test fails.
220+
221+
#### Step 4: Select Collection
222+
Enter the database and collection name from your MongoDB Atlas cluster that you
223+
want to sync with CrateDB Cloud.
224+
225+
#### Step 5: Select Target Table
226+
1. Specify the target table in your CrateDB Cloud cluster where the data will be synced.
227+
2. MongoDB records will be inserted into an object column called `document`.
228+
3. Select the object type for the column:
229+
- **`dynamic`**: Allows indexing and columnar storage for faster querying.
230+
- **`ignored`**: Prevents type conflicts in CrateDB if your source data lacks a strict schema.
231+
232+
:::{note}
233+
If your source data doesn't follow a strict schema, select `ignored` to avoid type conflicts.
234+
However, selecting `dynamic` provides faster query performance by utilizing indexes and columnar storage.
235+
:::
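
The two options correspond to CrateDB's object column policies. As a rough sketch, the helper below builds a `CREATE TABLE` statement with a single `document` column of the chosen type. The function name `target_table_ddl` and the table name `mongo_orders` are hypothetical, and the table that the integration actually creates may contain additional columns.

```python
def target_table_ddl(table: str, object_type: str) -> str:
    """Build a CREATE TABLE statement with one `document` object column."""
    policy = object_type.upper()
    if policy not in ("DYNAMIC", "IGNORED"):
        raise ValueError("object_type must be 'dynamic' or 'ignored'")
    # OBJECT(DYNAMIC) adds new sub-columns to the schema as they appear;
    # OBJECT(IGNORED) stores them without adding them to the schema.
    return f'CREATE TABLE "{table}" ("document" OBJECT({policy}))'


# Hypothetical table name.
ddl = target_table_ddl("mongo_orders", "ignored")
```

In either case, nested fields are read with subscript syntax, for example `SELECT document['name'] FROM mongo_orders`.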
236+
237+
#### Step 6: Configure Integration Settings
238+
1. Enter a name for the integration.
239+
2. Select the integration mode:
240+
- **Full Load Only**: Imports the data once but doesn’t sync changes.
241+
- **Full Load and CDC**: Imports the data and syncs changes in real time.
242+
- **CDC Only**: Syncs only new changes in real time without importing existing data.
243+
244+
#### Step 7: Create the Integration
245+
Click **Create Integration** to finalize the setup. CrateDB Cloud will now sync
246+
your MongoDB data based on the selected settings.

docs/index.md

Lines changed: 1 addition & 0 deletions
@@ -149,6 +149,7 @@ Services <reference/services>
149 149
Import <cluster/import>
150 150
Console <cluster/console>
151 151
Automation <cluster/automation>
152+
Integrations <cluster/integrations>
152 153
Export <cluster/export>
153 154
Backups <cluster/backups>
154 155
Manage Cluster <cluster/manage>
