Commit 0099b22

revamp readme and architecture diagram (#5)
1 parent 284e539 commit 0099b22

File tree

3 files changed

+115
-55
lines changed

README.md

Lines changed: 115 additions & 55 deletions
@@ -1,49 +1,86 @@
1-
# Sample Application showcasing how to use DMS to create CDC
1+
# Real-time Database Replication with DMS and Kinesis on LocalStack
2+
3+
| Key | Value |
4+
| ------------ | ---------------------------------------------------------------------------------------- |
5+
| Environment | LocalStack, AWS |
6+
| Services | DMS, RDS, Kinesis, VPC, Secrets Manager |
7+
| Integrations | AWS CDK, Docker Compose, AWS SDK for Python |
8+
| Categories | Database Migration, Change Data Capture, Streaming |
9+
| Level | Intermediate |
10+
| Use Case | Database Migration, Real-time Data Replication, CDC Implementation |
11+
| GitHub | [Repository link](https://github.com/localstack-samples/sample-dms-cdc-rds-to-kinesis) |
212

313
## Introduction
414

5-
This scenario demonstrates how to use Database Migration Service (DMS) to create change data capture (CDC) and full load tasks using the Cloud Development Kit in Python. It is a self-contained setup that will create a VPC to host 2 databases, a Kinesis stream, and 4 replication tasks.
15+
This sample demonstrates how to use AWS Database Migration Service (DMS) to create change data capture (CDC) and full load replication tasks using the AWS Cloud Development Kit (CDK) in Python. The application replicates data from MariaDB databases to a Kinesis stream in real time, so you can capture and stream database changes as they occur. It is a self-contained setup that creates a local VPC to host 2 databases, a Kinesis stream, and 4 replication tasks. To test the sample, we use LocalStack to deploy the complete DMS infrastructure on your developer machine and validate the data replication workflow locally. This provides a cost-effective way to develop and test database migration patterns before deploying to production AWS environments.
616

7-
![dms-mariadb-to-kinesis](./dms-mariadb-to-kinesis.jpg)
17+
## Architecture
818

9-
## Pre-requisites
19+
The following diagram shows the architecture that this sample application builds and deploys:
1020

11-
- [LocalStack Auth Token](https://docs.localstack.cloud/getting-started/auth-token/)
12-
- [Python 3.10](https://www.python.org/downloads/) & `pip`
13-
- [Docker Compose](https://docs.docker.com/compose/install/)
14-
- [CDK](https://docs.localstack.cloud/user-guide/integrations/aws-cdk/) with the [`cdklocal`](https://github.com/localstack/aws-cdk-local) wrapper.
21+
![dms-mariadb-to-kinesis](./images/architecture.png)
1522

16-
17-
Start LocalStack Pro with the `LOCALSTACK_AUTH_TOKEN` pre-configured:
23+
- [VPC](https://docs.localstack.cloud/aws/services/ec2/#vpc) with custom networking to host database resources
24+
- [RDS MariaDB instance](https://docs.localstack.cloud/aws/services/rds/) as the source database for CDC replication
25+
- External MariaDB container as the source database for full load replication
26+
- [DMS Replication Instance](https://docs.localstack.cloud/aws/services/dms/) to execute migration tasks
27+
- [DMS Source/Target Endpoints](https://docs.localstack.cloud/aws/services/dms/) connecting to both MariaDB instances
28+
- [DMS Replication Tasks](https://docs.localstack.cloud/aws/services/dms/) for full load and CDC operations
29+
- [Kinesis Data Stream](https://docs.localstack.cloud/aws/services/kinesis/) as the target for replicated data
30+
- [Secrets Manager](https://docs.localstack.cloud/aws/services/secretsmanager/) for secure database credential storage
1831

19-
```bash
20-
export LOCALSTACK_AUTH_TOKEN=<your-auth-token>
21-
docker-compose up
22-
```
32+
## Prerequisites
33+
34+
- [`LOCALSTACK_AUTH_TOKEN`](https://docs.localstack.cloud/getting-started/auth-token/)
35+
- [Python 3.10+](https://www.python.org/downloads/) & `pip`
36+
- [Docker Compose](https://docs.docker.com/compose/install/)
37+
- [CDK](https://docs.localstack.cloud/user-guide/integrations/aws-cdk/) with the [`cdklocal`](https://github.com/localstack/aws-cdk-local) wrapper
38+
- [`make`](https://www.gnu.org/software/make/) (**optional**, but recommended for running the sample application)
2339

24-
The Docker Compose file will start LocalStack Pro container and a MariaDB container. The MariaDB container will be used to showcase how to reach a database external to LocalStack.
40+
## Installation
2541

26-
## Instructions
42+
To run the sample application, you need to install the required dependencies.
43+
44+
First, clone the repository:
45+
46+
```shell
47+
git clone https://github.com/localstack-samples/sample-dms-cdc-rds-to-kinesis.git
48+
```
2749

28-
### Install the dependencies
50+
Then, navigate to the project directory:
51+
52+
```shell
53+
cd sample-dms-cdc-rds-to-kinesis
54+
```
2955

3056
Install all the dependencies by running the following command:
3157

32-
```bash
58+
```shell
3359
make install
3460
```
3561

36-
### Creating the infrastructure
62+
This will create a virtual environment and install the required Python packages including AWS CDK dependencies.
63+
64+
## Deployment
65+
66+
Start LocalStack Pro with the `LOCALSTACK_AUTH_TOKEN` pre-configured:
67+
68+
```shell
69+
export LOCALSTACK_AUTH_TOKEN=<your-auth-token>
70+
make start
71+
```
72+
73+
The Docker Compose file will start a LocalStack Pro container and a MariaDB container that will serve as the external source database.
3774

38-
To deploy the infrastructure, you can run the following command:
75+
To deploy the sample application infrastructure, run the following command:
3976

40-
```bash
77+
```shell
4178
make deploy
4279
```
4380

4481
After successful deployment, you will see the following output:
4582

46-
```bash
83+
```shell
4784
Outputs:
4885
DMsSampleSetupStack.cdcTask1 = arn:aws:dms:us-east-1:000000000000:task:A001NYMR4Z0NK45ZBJT6954RNMGEKL2PQ9XQYR4
4986
DMsSampleSetupStack.cdcTask2 = arn:aws:dms:us-east-1:000000000000:task:GO5RC4J6CKZWSJKF4CGB6ZV3ZEMGI38DFPJF2ZU
@@ -58,53 +95,76 @@ arn:aws:cloudformation:us-east-1:000000000000:stack/DMsSampleSetupStack/b8298866
5895
✨ Total time: 49.33s
5996
```
6097

61-
### Running the tasks
98+
## Testing
6299

63-
You can run the tasks by executing the following command:
100+
You can run the replication tasks and validate the data pipeline by executing the following command:
64101

65-
```bash
102+
```shell
66103
make run
67104
```
68105

69-
## Developer Notes
106+
This will execute the complete test scenario including:
70107

71-
Four tasks are deployed with the stack, split into two parts.
108+
- Creating test tables and inserting initial data
109+
- Starting full load replication tasks
110+
- Monitoring Kinesis stream for replicated events
111+
- Starting CDC replication tasks
112+
- Performing additional data changes to trigger CDC
113+
- Logging table statistics and replication progress
72114

73-
First, a full load replication task runs against the external DB:
115+
The test validates both full load and CDC replication patterns, demonstrating how DMS captures and streams database changes to Kinesis in real time.
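
To inspect the replicated events yourself, a reader along the following lines may help. It is a minimal sketch, not the code that `make run` executes: the function name `read_stream_records` is hypothetical, and the client is passed in so the logic can be tried without a running stack. It stops at the first empty batch, which is a simplification that suits a short local test rather than a long-running consumer.

```python
def read_stream_records(kinesis_client, stream_name, limit=100):
    """Return the raw Data payloads currently readable from every shard.

    `kinesis_client` is expected to behave like a boto3 Kinesis client.
    """
    payloads = []
    shards = kinesis_client.list_shards(StreamName=stream_name)["Shards"]
    for shard in shards:
        # TRIM_HORIZON starts at the oldest record still held by the shard.
        iterator = kinesis_client.get_shard_iterator(
            StreamName=stream_name,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",
        )["ShardIterator"]
        while iterator:
            response = kinesis_client.get_records(ShardIterator=iterator, Limit=limit)
            records = response["Records"]
            if not records:
                # Simplification: treat an empty batch as "caught up".
                break
            payloads.extend(record["Data"] for record in records)
            iterator = response.get("NextShardIterator")
    return payloads


# Assumed usage against LocalStack (default edge port 4566); the stream
# name is a placeholder, not taken from this repository:
#   import boto3
#   client = boto3.client("kinesis", endpoint_url="http://localhost:4566",
#                         region_name="us-east-1")
#   payloads = read_stream_records(client, "<your-stream-name>")
```

Passing the client as a parameter keeps the function independent of how credentials and endpoints are configured.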
74116

75-
- Creates three tables: `authors`, `accounts`, `novels`
76-
- Makes four inserts
77-
- Starts full load task 1 targeting tables starting with 'a' (`a%` table mapping)
78-
- Captures and logs six Kinesis events: 2 drop tables, 2 create tables, 2 inserts
79-
- Starts full load task 2 targeting the `novels` table (`novels` table mapping)
80-
- Captures and logs four Kinesis events: 1 drop table, 1 create table, 2 inserts
81-
- Logs `table_statistics` for both tasks
117+
## Use Cases
82118

83-
Next, a CDC replication task runs against the RDS database:
119+
### Full Load Replication
84120

85-
- Creates three tables: `authors`, `accounts`, `novels`
86-
- Starts CDC task 1 targeting tables starting with 'a' (`a%` table mapping)
87-
- Starts CDC task 2 targeting the `novels` table (`novels` table mapping)
88-
- Captures and logs five Kinesis events: 2 for `awsdms_apply_exceptions` table, 3 for our tables
89-
- Makes four inserts
90-
- Captures and logs four Kinesis events: 2 for tables in task 1, 2 for table in task 2
91-
- Makes three table alterations, one per table
92-
- Captures and logs three Kinesis events
93-
- Logs `table_statistics` for both tasks
121+
This sample demonstrates full load replication tasks against an external MariaDB database running in Docker. The full load scenario showcases initial data migration and bulk data transfer patterns.
94122

95-
Two tasks perform full load replication on Dockerized MariaDB. The other two perform CDC replication on a MariaDB RDS database.
123+
The full load replication workflow includes:
96124

97-
All tasks target the same Kinesis Stream.
125+
- Creating three tables: `authors`, `accounts`, `novels` with sample data
126+
- Starting full load task 1 targeting tables starting with 'a' (`a%` table mapping)
127+
- Starting full load task 2 targeting the `novels` table (specific table mapping)
128+
- Capturing Kinesis events for table operations: drop tables, create tables, and data inserts
129+
- Monitoring table statistics and replication progress for both tasks
130+
- Demonstrating selective table replication using different mapping rules
98131

99-
## Deploying on AWS
132+
This pattern is ideal for initial database migrations where you need to transfer existing data from on-premises or external databases to AWS-managed services.
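
The two mapping rules described above (`a%` and `novels`) can be expressed as DMS table-mapping JSON. The following is a minimal sketch of how such mappings might be built in Python; the helper names `selection_rule` and `table_mappings`, the rule names, and the `%` schema wildcard are assumptions for illustration and are not taken from this repository's CDK code.

```python
import json


def selection_rule(rule_id, table_pattern, schema_pattern="%"):
    """Build one DMS selection rule that includes tables matching a pattern."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"include-{rule_id}",  # hypothetical naming scheme
        "object-locator": {
            "schema-name": schema_pattern,  # assumption: match any schema
            "table-name": table_pattern,
        },
        "rule-action": "include",
    }


def table_mappings(*table_patterns):
    """Serialize one selection rule per pattern into a table-mappings JSON string."""
    rules = [
        selection_rule(rule_id, pattern)
        for rule_id, pattern in enumerate(table_patterns, start=1)
    ]
    return json.dumps({"rules": rules})


# Task 1 selects every table whose name starts with "a"; task 2 selects only "novels".
task1_mappings = table_mappings("a%")
task2_mappings = table_mappings("novels")
```

A string like `task1_mappings` is the kind of value a replication task accepts as its table mappings, so each task replicates only the tables its rule selects.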
100133

101-
You can deploy and run the stack on AWS by running the following commands:
134+
### Change Data Capture (CDC)
102135

103-
```bash
104-
make deploy-aws
105-
make run-aws
106-
```
136+
The CDC replication tasks demonstrate real-time change capture from a MariaDB RDS instance, streaming ongoing database changes to Kinesis as they occur.
137+
138+
The CDC replication workflow includes:
139+
140+
- Creating three tables: `authors`, `accounts`, `novels` in the RDS database
141+
- Starting CDC task 1 targeting tables starting with 'a' (`a%` table mapping)
142+
- Starting CDC task 2 targeting the `novels` table (specific table mapping)
143+
- Capturing real-time changes: INSERT, UPDATE, and DELETE operations
144+
- Performing table alterations and schema changes during active replication
145+
- Streaming all changes to the same Kinesis Data Stream for downstream processing
146+
- Monitoring replication lag and table statistics for ongoing operations
147+
148+
This pattern enables building event-driven architectures and real-time analytics pipelines that respond to database changes as they happen.
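
As a starting point for downstream processing, the events on the stream can be grouped by table and operation. The sketch below assumes each payload is a JSON document with `data` and `metadata` keys, which is the envelope AWS DMS documents for Kinesis targets; the function name `summarize_dms_events` and the sample payloads are invented for illustration and are not output captured from this sample.

```python
import json
from collections import Counter


def summarize_dms_events(payloads):
    """Count events per (table-name, operation) from DMS-style JSON payloads."""
    counts = Counter()
    for raw in payloads:
        event = json.loads(raw)  # accepts bytes or str
        metadata = event.get("metadata", {})
        counts[(metadata.get("table-name"), metadata.get("operation"))] += 1
    return counts


# Hypothetical payloads shaped like DMS messages for a Kinesis target.
sample_payloads = [
    b'{"data": {"id": 1}, "metadata": {"record-type": "data", "operation": "insert", "table-name": "authors"}}',
    b'{"data": {"id": 2}, "metadata": {"record-type": "data", "operation": "insert", "table-name": "authors"}}',
    b'{"data": {"id": 1}, "metadata": {"record-type": "data", "operation": "update", "table-name": "novels"}}',
]

summary = summarize_dms_events(sample_payloads)
for (table, operation), count in sorted(summary.items()):
    print(f"{table}: {operation} x{count}")
```

Because all four tasks write to the same stream, grouping on the metadata is one simple way to tell which table a given change came from.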
149+
150+
## Summary
151+
152+
This sample application demonstrates how to build, deploy, and test a complete database migration and replication pipeline using AWS DMS and related services. It showcases the following patterns:
153+
154+
- Deploying DMS infrastructure using AWS CDK with Python
155+
- Configuring full load and CDC replication tasks for different migration scenarios
156+
- Integrating multiple database sources (RDS and external MariaDB) with streaming targets
157+
- Using Secrets Manager for secure credential management in DMS workflows
158+
- Monitoring data replication through Kinesis stream events and DMS table statistics
159+
- Leveraging LocalStack Pro for cost-effective development and testing of DMS workflows
160+
161+
The application provides a foundation for understanding enterprise database migration patterns and real-time data replication architectures.
107162

108-
## License
163+
## Learn More
109164

110-
This project is licensed under the Apache 2.0 License.
165+
- [LocalStack DMS Documentation](https://docs.localstack.cloud/aws/services/dms/)
166+
- [AWS DMS Best Practices](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_BestPractices.html)
167+
- [Change Data Capture Patterns](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Task.CDC.html)
168+
- [Using AWS CDK with LocalStack](https://docs.localstack.cloud/user-guide/integrations/aws-cdk/)
169+
- [Kinesis Data Streams for Real-time Processing](https://docs.localstack.cloud/aws/services/kinesis/)
170+
- [Database Migration Strategies with DMS](https://aws.amazon.com/dms/resources/)

dms-mariadb-to-kinesis.jpg

-93.3 KB
Binary file not shown.

images/architecture.png

295 KB