RC: RDI In the Cloud #1066
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
--- | ||
Title: Data Integration | ||
alwaysopen: false | ||
categories: | ||
- docs | ||
- operate | ||
- rc | ||
description: Use Redis Data Integration with Redis Cloud. | ||
hideListLinks: true | ||
weight: 99 | ||
--- | ||
|
||
Redis Cloud now supports [Redis Data Integration (RDI)]({{<relref "integrate/redis-data-integration">}}), a fast and simple way to bring your data into Redis from other types of primary databases. | ||
|
||
A relational database usually handles queries much more slowly than a Redis database. If your application uses a relational database and makes many more reads than writes (which is the typical case), then you can improve performance by using Redis as a cache to handle the read queries quickly. Redis Cloud uses [ingest]({{<relref "/integrate/redis-data-integration/">}}) to help you offload all read queries from the application database to Redis automatically. | ||
|
||
Using a data pipeline lets you have a cache that is always ready for queries. RDI data pipelines ensure that any changes made to your primary database are captured in your Redis cache within a few seconds, preventing cache misses and stale data within the cache. | ||
|
||
RDI helps Redis customers sync Redis Cloud with live data from their primary databases to: | ||
- Meet the required speed and scale of read queries and provide an excellent and predictable user experience. | ||
- Save resources and time when building pipelines and coding data transformations. | ||
- Reduce the total cost of ownership by saving money on expensive database read replicas. | ||
|
||
Using RDI with Redis Cloud simplifies managing your data integration pipeline. No need to worry about hardware or underlying infrastructure, as Redis Cloud manages that for you. Creating the data flow from source to target is much easier, and there are validations in place to reduce errors. | ||
|
||
## Data pipeline architecture | ||
|
||
An RDI data pipeline sits between your source database and your target Redis database. Initially, the pipeline reads all of the data and imports it into the target database during the *initial sync* phase. After this initial sync is complete, the data pipeline enters the *streaming* phase, where changes are captured as they happen. Changes in the source database are added to the target within a few seconds of capture. The data pipeline translates relational database rows to Redis hashes or JSON documents. | ||
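
To make the row translation described above concrete, here is a minimal Python sketch of how one relational row could map to either a Redis hash or a JSON document. It is an illustration under stated assumptions, not RDI's actual implementation: the function names, the sample row, and the choices to drop `NULL` columns from the hash and to stringify non-JSON types are all hypothetical.

```python
import json
from datetime import date
from decimal import Decimal


def row_to_hash(row):
    """Flatten a row into string fields, the shape a Redis hash can hold.

    Assumption of this sketch: NULL (None) columns are simply omitted.
    """
    return {col: str(val) for col, val in row.items() if val is not None}


def row_to_json(row):
    """Serialize a row as a JSON document, keeping numbers and nulls typed.

    Assumption of this sketch: types JSON cannot express (Decimal, date)
    are converted with str().
    """
    return json.dumps(row, default=str, sort_keys=True)


# Hypothetical source row, as a column-name -> value mapping.
row = {
    "id": 42,
    "name": "Ada",
    "balance": Decimal("10.50"),
    "joined": date(2024, 1, 5),
    "nickname": None,
}

hash_fields = row_to_hash(row)
json_doc = row_to_json(row)
```

The hash form stores every value as a string, while the JSON form keeps `id` as a number and `nickname` as `null`, which is one reason to pick JSON when the target database supports it.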
|
||
For more info on how RDI works, see [RDI Architecture]({{<relref "/integrate/redis-data-integration/architecture">}}). | ||
|
||
### Pipeline security | ||
|
||
Data pipelines are set up to ensure a high level of data security. Source database credentials and TLS secrets are stored in AWS Secrets Manager and shared using the Kubernetes CSI driver for secrets. See [Share source database credentials]({{<relref "/operate/rc/databases/rdi/setup#share-source-database-credentials">}}) to learn how to share your source database credentials and TLS certificates with Redis Cloud. | ||
|
||
Connections to the source database use Java Database Connectivity (JDBC) through [AWS PrivateLink](https://aws.amazon.com/privatelink/), ensuring that the data pipeline is only exposed to the specific database endpoint. See [Set up connectivity]({{<relref "/operate/rc/databases/rdi/setup#set-up-connectivity">}}) to learn how to connect your PrivateLink to the Redis Cloud VPC. | ||
|
||
RDI encrypts all network connections with TLS. The pipeline will process data from the source database in-memory and write it to the target database using a TLS connection. There are no external connections to your data pipeline except from Redis Cloud management services. | ||
|
||
## Prerequisites | ||
|
||
Before you can create a data pipeline, you must have: | ||
|
||
- A [Redis Cloud Pro database]({{< relref "/operate/rc/databases/create-database/create-pro-database-new" >}}) hosted on Amazon Web Services (AWS). This will be the target database. | ||
- One supported source database, hosted on an AWS EC2 instance: | ||
|
||
{{< embed-md "rdi-supported-source-versions.md" >}} | ||
|
||
|
||
{{< note >}} | ||
Please be aware of the following limitations: | ||
|
||
- The target database must be a Redis Cloud Pro database hosted on Amazon Web Services (AWS). Redis Cloud Essentials databases and databases hosted on Google Cloud do not support Data Integration. | ||
- The target database must use multi-zone [high availability]({{< relref "/operate/rc/databases/configuration/high-availability" >}}). | ||
- The target database can use TLS, but cannot use mutual TLS. | ||
- The target database cannot be in the same subscription as another database that has a data pipeline. | ||
- Source databases must also be hosted on AWS. | ||
- One source database can only be synced to one target database. | ||
|
||
- You must be able to set up AWS PrivateLink to connect your source database to your target database. RDI only works with AWS PrivateLink and not VPC Peering or other private connectivity options. | ||
{{< /note >}} | ||
|
||
## Get started | ||
|
||
To create a new data pipeline, you need to: | ||
|
||
1. [Prepare your source database]({{<relref "/operate/rc/databases/rdi/setup">}}) and any associated credentials. | ||
2. [Define the source connection and data pipeline]({{<relref "/operate/rc/databases/rdi/define">}}) by selecting which tables to sync. | ||
|
||
Once your data pipeline is defined, you can [view and edit]({{<relref "/operate/rc/databases/rdi/view-edit">}}) it. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
--- | ||
Title: Define data pipeline | ||
alwaysopen: false | ||
categories: | ||
- docs | ||
- operate | ||
- rc | ||
description: Define the source connection and data pipeline. | ||
|
||
hideListLinks: true | ||
weight: 2 | ||
--- | ||
|
||
After you have [prepared your source database]({{<relref "/operate/rc/databases/rdi/setup">}}) and connection information, you can set up your new pipeline. To do this: | ||
|
||
1. [Define the source connection](#define-source-connection) by entering all required source database information. | ||
2. [Define the data pipeline](#define-data-pipeline) by selecting the data that you want to sync from your source database to the target database. | ||
|
||
## Define source connection | ||
|
||
1. In the [Redis Cloud console](https://cloud.redis.io/), go to your target database and select the **Data Pipeline** tab. | ||
1. Select **Define source database**. | ||
{{<image filename="images/rc/rdi/rdi-define-source-database.png" alt="The define source database button." width=200px >}} | ||
1. Enter a **Pipeline name**. This pipeline name will be the prefix to all keys generated by this pipeline in the target database. | ||
{{<image filename="images/rc/rdi/rdi-define-pipeline-cidr.png" alt="The pipeline name and deployment CIDR fields." >}} | ||
1. Enter the **Deployment CIDR** for your pipeline, or use the one generated for you. This CIDR should not conflict with your apps or other databases. | ||
1. In the **Source database connectivity** section, enter the **PrivateLink service name** of the [PrivateLink connected to your source database]({{< relref "/operate/rc/databases/rdi/setup#set-up-connectivity" >}}). | ||
{{<image filename="images/rc/rdi/rdi-define-connectivity.png" alt="The Source database connectivity section, with database connection details and connectivity options." >}} | ||
1. Enter your database details. This depends on your database type, and includes: | ||
- **Port**: The database's port | ||
- **Database**: Your database's name, or the root database *(PostgreSQL, Oracle only)*, or a comma-separated list of one or more databases you want to connect to *(SQL Server only)* | ||
- **Database Server ID**: Unique ID for the replication client. Leave as default if you don't use replication *(MySQL and MariaDB only)* | ||
|
||
- **PDB**: Name of the Oracle pluggable database *(Oracle only)* | ||
1. Enter the ARN of your [database credentials secret]({{< relref "/operate/rc/databases/rdi/setup#share-source-database-credentials" >}}) in the **Source database secrets ARN** field. | ||
1. Select **Start pipeline setup**. | ||
{{<image filename="images/rc/rdi/rdi-start-pipeline-setup.png" alt="The start pipeline setup button." width=200px >}} | ||
1. Redis Cloud will attempt to connect to PrivateLink. If your PrivateLink does not allow automatic acceptance of incoming connections, accept the incoming connection on AWS PrivateLink to proceed. See [Accept or Reject PrivateLink connection requests](https://docs.aws.amazon.com/vpc/latest/privatelink/configure-endpoint-service.html#accept-reject-connection-requests). | ||
|
||
If Redis Cloud can't find your PrivateLink connection, make sure that the PrivateLink service name is correct and that Redis Cloud is listed as an Allowed Principal for your VPC. See [Set up connectivity]({{<relref "/operate/rc/databases/rdi/setup#set-up-connectivity">}}) for more info. | ||
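
Before entering a Deployment CIDR in the steps above, it can help to check it against the ranges your apps and other databases already use. The following is a minimal sketch using Python's standard `ipaddress` module. The function name and the example ranges are hypothetical, and Redis Cloud may apply its own validation that differs from this check.

```python
import ipaddress


def find_conflicts(deployment_cidr, existing_cidrs):
    """Return the existing CIDR blocks that overlap the proposed one."""
    candidate = ipaddress.ip_network(deployment_cidr)
    return [
        cidr
        for cidr in existing_cidrs
        if candidate.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical ranges already used by your VPCs and databases.
existing = ["10.0.0.0/16", "172.31.0.0/16"]

conflicting = find_conflicts("10.0.1.0/24", existing)
clear = find_conflicts("192.168.0.0/24", existing)
```

Here `conflicting` lists `10.0.0.0/16`, because `10.0.1.0/24` falls inside it, while `clear` is empty.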
|
||
At this point, Redis Cloud will provision the pipeline infrastructure that will allow you to define your data pipeline. | ||
|
||
{{<image filename="images/rc/rdi/rdi-pipeline-setup-in-progress.png" alt="The Pipeline setup in progress screen." width=75% >}} | ||
|
||
Pipelines are provisioned in the background. You aren't allowed to make changes to your data pipeline or to your database during provisioning. This process will take a long time, so you can close the window and come back later. | ||
|
||
When your pipeline is provisioned, select **Complete setup**. You will then [define your data pipeline](#define-data-pipeline). | ||
|
||
{{<image filename="images/rc/rdi/rdi-complete-setup.png" alt="The complete setup button." width=200px >}} | ||
|
||
## Define data pipeline | ||
|
||
After your pipeline is provisioned, you will be able to define your pipeline. You will select the database schemas, tables, and columns that you want to import and synchronize with your primary database. | ||
|
||
### Configure a new pipeline | ||
|
||
1. In the [Redis Cloud console](https://cloud.redis.io/), go to your target database and select the **Data Pipeline** tab. If your pipeline is already provisioned, select **Complete setup** to go to the **Pipeline definition** section. | ||
{{<image filename="images/rc/rdi/rdi-complete-setup.png" alt="The complete setup button." width=200px >}} | ||
1. For the **Configure a new pipeline** option, select the Redis data type to write keys to the target. You can choose **Hash**, or **JSON** if the target database supports JSON. | ||
{{<image filename="images/rc/rdi/rdi-configure-new-pipeline.png" alt="The Pipeline definition screen. Configure a new pipeline is selected." width=75% >}} | ||
Select **Continue**. | ||
{{<image filename="images/rc/rdi/rdi-continue-button.png" alt="The continue button." width=150px >}} | ||
1. Select the Schema and Tables you want to migrate to the target database from the **Source data selection** list. | ||
{{<image filename="images/rc/rdi/rdi-select-source-data.png" alt="The select source data section. " width=75% >}} | ||
|
||
You can select any number of columns from a table. | ||
|
||
{{<image filename="images/rc/rdi/rdi-select-columns.png" alt="The select source data section. A table is expanded with a few columns selected." width=75% >}} | ||
|
||
If any tables are missing a unique constraint, the **Missing unique constraint** list will appear. Select the columns that define a unique constraint for those tables from the list. | ||
|
||
{{<image filename="images/rc/rdi/rdi-missing-unique-constraint.png" alt="The missing unique constraint list." width=75% >}} | ||
|
||
{{<image filename="images/rc/rdi/rdi-select-constraints.png" alt="The missing unique constraint list with columns selected." width=75% >}} | ||
|
||
Select **Add schema** to add more database schemas. | ||
|
||
{{<image filename="images/rc/rdi/rdi-add-schema.png" alt="The add schema button." width=150px >}} | ||
|
||
Select **Delete** to delete a schema. You must have at least one schema to continue. | ||
|
||
{{<image filename="images/rc/rdi/rdi-delete-schema.png" alt="The delete schema button." width=50px >}} | ||
|
||
After you've selected the schemas and tables you want to sync, select **Continue**. | ||
|
||
{{<image filename="images/rc/rdi/rdi-continue-button.png" alt="The continue button." width=150px >}} | ||
|
||
1. Review the tables you selected in the **Summary**. If everything looks correct, select **Start ingest** to start ingesting data from your source database. | ||
|
||
{{<image filename="images/rc/rdi/rdi-start-ingest.png" alt="The start ingest button." width=175px >}} | ||
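
The unique constraint step above matters because each source row needs its own key in the target database. The Python sketch below illustrates the idea with a hypothetical key layout (pipeline name, then table name, then the unique column values). The exact key format that RDI generates is not specified here, so treat `build_key` and its layout purely as an assumption for illustration.

```python
def build_key(pipeline_name, table, unique_columns, row):
    """Join the pipeline name, table name, and unique column values into one key.

    Hypothetical layout: <pipeline>:<table>:<column>:<value>[:<column>:<value>...]
    """
    missing = [col for col in unique_columns if row.get(col) is None]
    if missing:
        raise ValueError(f"row has no value for unique column(s): {missing}")
    id_parts = [f"{col}:{row[col]}" for col in unique_columns]
    return ":".join([pipeline_name, table, *id_parts])


# Two hypothetical rows that share an order_id but differ in line_no.
rows = [
    {"order_id": 1, "line_no": 1, "sku": "A-100"},
    {"order_id": 1, "line_no": 2, "sku": "B-200"},
]

# order_id alone is not unique, so both rows collapse into one key.
weak_keys = {build_key("shop", "order_lines", ["order_id"], r) for r in rows}

# order_id plus line_no is unique, so each row keeps its own key.
strong_keys = {
    build_key("shop", "order_lines", ["order_id", "line_no"], r) for r in rows
}
```

With only `order_id` selected, the second row would overwrite the first in the target. Selecting both columns keeps the two rows distinct.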
|
||
At this point, the data pipeline will ingest data from the source database to your target Redis database. This process will take time, especially if you have a lot of records in your source database. | ||
|
||
After this initial sync is complete, the data pipeline enters the *change streaming* phase, where changes are captured as they happen. Changes in the source database are added to the target within a few seconds of capture. | ||
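
The two phases described above can be sketched in a few lines of Python, with an in-memory dictionary standing in for the target database. This is a conceptual model only: the event shape, the `op` names, and the function names are hypothetical and do not reflect the format RDI uses internally.

```python
def initial_sync(source_rows, target):
    """Initial sync phase: copy every existing source row into the target."""
    for row in source_rows:
        target[row["id"]] = dict(row)


def apply_change(event, target):
    """Change streaming phase: apply one captured change to the target."""
    op = event["op"]
    if op in ("insert", "update"):
        target[event["row"]["id"]] = dict(event["row"])
    elif op == "delete":
        target.pop(event["id"], None)
    else:
        raise ValueError(f"unknown operation: {op}")


# Hypothetical source table contents and a short stream of later changes.
source_rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
changes = [
    {"op": "update", "row": {"id": 1, "name": "Ada L."}},
    {"op": "insert", "row": {"id": 3, "name": "Edsger"}},
    {"op": "delete", "id": 2},
]

target = {}
initial_sync(source_rows, target)
for event in changes:
    apply_change(event, target)
```

After the loop, `target` holds the updated row 1 and the new row 3, and row 2 is gone, mirroring the state of the source.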
|
||
You can view the status of your data pipeline in the **Data pipeline** tab of your database. See [View and edit data pipeline]({{<relref "/operate/rc/databases/rdi/view-edit">}}) to learn more. |