Skip to content

Commit 4aa10a6

Browse files
committed
Add dynamoDB streams support for cdc
1 parent b18847b commit 4aa10a6

File tree

11 files changed

+2156
-6
lines changed

11 files changed

+2156
-6
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,4 +10,5 @@ release_notes.md
1010
.task
1111
.vscode
1212
.op
13+
.gomodcache
1314
__pycache__
Lines changed: 323 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,323 @@
1+
= aws_dynamodb_cdc
2+
:type: input
3+
:status: beta
4+
:categories: ["Services","AWS"]
5+
6+
7+
8+
////
9+
THIS FILE IS AUTOGENERATED!
10+
11+
To make changes, edit the corresponding source file under:
12+
13+
https://github.com/redpanda-data/connect/tree/main/internal/impl/<provider>.
14+
15+
And:
16+
17+
https://github.com/redpanda-data/connect/tree/main/cmd/tools/docs_gen/templates/plugin.adoc.tmpl
18+
////
19+
20+
// © 2024 Redpanda Data Inc.
21+
22+
23+
component_type_dropdown::[]
24+
25+
26+
Reads change data capture (CDC) events from DynamoDB Streams
27+
28+
Introduced in version 1.0.0.
29+
30+
31+
[tabs]
32+
======
33+
Common::
34+
+
35+
--
36+
37+
```yml
38+
# Common config fields, showing default values
39+
input:
40+
label: ""
41+
aws_dynamodb_cdc:
42+
table: "" # No default (required)
43+
checkpoint_table: redpanda_dynamodb_checkpoints
44+
start_from: trim_horizon
45+
```
46+
47+
--
48+
Advanced::
49+
+
50+
--
51+
52+
```yml
53+
# All config fields, showing default values
54+
input:
55+
label: ""
56+
aws_dynamodb_cdc:
57+
table: "" # No default (required)
58+
checkpoint_table: redpanda_dynamodb_checkpoints
59+
batch_size: 1000
60+
poll_interval: 1s
61+
start_from: trim_horizon
62+
checkpoint_limit: 1000
63+
max_tracked_shards: 10000
64+
region: "" # No default (optional)
65+
endpoint: "" # No default (optional)
66+
tcp:
67+
connect_timeout: 0s
68+
keep_alive:
69+
idle: 15s
70+
interval: 15s
71+
count: 9
72+
tcp_user_timeout: 0s
73+
credentials:
74+
profile: "" # No default (optional)
75+
id: "" # No default (optional)
76+
secret: "" # No default (optional)
77+
token: "" # No default (optional)
78+
from_ec2_role: false # No default (optional)
79+
role: "" # No default (optional)
80+
role_external_id: "" # No default (optional)
81+
```
82+
83+
--
84+
======
85+
86+
Consumes records from DynamoDB Streams with automatic checkpointing and shard management.
87+
88+
DynamoDB Streams capture item-level changes in DynamoDB tables. This input supports:
89+
- Automatic shard discovery and management
90+
- Checkpoint-based resumption after crashes
91+
- Multiple shard processing
92+
93+
For better performance and longer retention, consider using Kinesis Data Streams for DynamoDB
94+
with the `aws_kinesis` input instead.
95+
96+
97+
== Fields
98+
99+
=== `table`
100+
101+
The name of the DynamoDB table to read streams from.
102+
103+
104+
*Type*: `string`
105+
106+
107+
=== `checkpoint_table`
108+
109+
DynamoDB table name for storing checkpoints. Will be created if it doesn't exist.
110+
111+
112+
*Type*: `string`
113+
114+
*Default*: `"redpanda_dynamodb_checkpoints"`
115+
116+
=== `batch_size`
117+
118+
Maximum number of records to read in a single batch.
119+
120+
121+
*Type*: `int`
122+
123+
*Default*: `1000`
124+
125+
=== `poll_interval`
126+
127+
Time to wait between polling attempts when no records are available.
128+
129+
130+
*Type*: `string`
131+
132+
*Default*: `"1s"`
133+
134+
=== `start_from`
135+
136+
Where to start reading when no checkpoint exists. `trim_horizon` starts from the oldest available record, `latest` starts from new records.
137+
138+
139+
*Type*: `string`
140+
141+
*Default*: `"trim_horizon"`
142+
143+
Options:
144+
`trim_horizon`
145+
, `latest`
146+
.
147+
148+
=== `checkpoint_limit`
149+
150+
Maximum number of messages to process before updating checkpoint.
151+
152+
153+
*Type*: `int`
154+
155+
*Default*: `1000`
156+
157+
=== `max_tracked_shards`
158+
159+
Maximum number of shards to track simultaneously. Prevents memory issues with extremely large tables.
160+
161+
162+
*Type*: `int`
163+
164+
*Default*: `10000`
165+
166+
=== `region`
167+
168+
The AWS region to target.
169+
170+
171+
*Type*: `string`
172+
173+
174+
=== `endpoint`
175+
176+
Allows you to specify a custom endpoint for the AWS API.
177+
178+
179+
*Type*: `string`
180+
181+
182+
=== `tcp`
183+
184+
TCP socket configuration.
185+
186+
187+
*Type*: `object`
188+
189+
190+
=== `tcp.connect_timeout`
191+
192+
Maximum amount of time a dial will wait for a connect to complete. Zero disables.
193+
194+
195+
*Type*: `string`
196+
197+
*Default*: `"0s"`
198+
199+
=== `tcp.keep_alive`
200+
201+
TCP keep-alive probe configuration.
202+
203+
204+
*Type*: `object`
205+
206+
207+
=== `tcp.keep_alive.idle`
208+
209+
Duration the connection must be idle before sending the first keep-alive probe. Zero defaults to 15s. Negative values disable keep-alive probes.
210+
211+
212+
*Type*: `string`
213+
214+
*Default*: `"15s"`
215+
216+
=== `tcp.keep_alive.interval`
217+
218+
Duration between keep-alive probes. Zero defaults to 15s.
219+
220+
221+
*Type*: `string`
222+
223+
*Default*: `"15s"`
224+
225+
=== `tcp.keep_alive.count`
226+
227+
Maximum unanswered keep-alive probes before dropping the connection. Zero defaults to 9.
228+
229+
230+
*Type*: `int`
231+
232+
*Default*: `9`
233+
234+
=== `tcp.tcp_user_timeout`
235+
236+
Maximum time to wait for acknowledgment of transmitted data before killing the connection. Linux-only (kernel 2.6.37+), ignored on other platforms. When enabled, keep_alive.idle must be greater than this value per RFC 5482. Zero disables.
237+
238+
239+
*Type*: `string`
240+
241+
*Default*: `"0s"`
242+
243+
=== `credentials`
244+
245+
Optional manual configuration of AWS credentials to use. More information can be found in xref:guides:cloud/aws.adoc[].
246+
247+
248+
*Type*: `object`
249+
250+
251+
=== `credentials.profile`
252+
253+
A profile from `~/.aws/credentials` to use.
254+
255+
256+
*Type*: `string`
257+
258+
259+
=== `credentials.id`
260+
261+
The ID of credentials to use.
262+
263+
264+
*Type*: `string`
265+
266+
267+
=== `credentials.secret`
268+
269+
The secret for the credentials being used.
270+
[CAUTION]
271+
====
272+
This field contains sensitive information that usually shouldn't be added to a config directly, read our xref:configuration:secrets.adoc[secrets page for more info].
273+
====
274+
275+
276+
277+
*Type*: `string`
278+
279+
280+
=== `credentials.token`
281+
282+
The token for the credentials being used, required when using short term credentials.
283+
284+
285+
*Type*: `string`
286+
287+
288+
=== `credentials.from_ec2_role`
289+
290+
Use the credentials of a host EC2 machine configured to assume https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html[an IAM role associated with the instance^].
291+
292+
293+
*Type*: `bool`
294+
295+
Requires version 4.2.0 or newer
296+
297+
=== `credentials.role`
298+
299+
A role ARN to assume.
300+
301+
302+
*Type*: `string`
303+
304+
305+
=== `credentials.role_external_id`
306+
307+
An external ID to provide when assuming a role.
308+
309+
310+
*Type*: `string`
311+
312+
313+
```yml
314+
# Examples
315+
316+
input:
317+
aws_dynamodb_cdc:
318+
table: my_table
319+
region: us-east-1
320+
credentials:
321+
role: arn:aws:iam::123456789012:role/DynamoDBStreamReader
322+
role_external_id: "unique-external-id-12345"
323+
```

go.mod

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -158,7 +158,7 @@ require (
158158
github.com/tetratelabs/wazero v1.9.0
159159
github.com/tigerbeetle/tigerbeetle-go v0.16.61
160160
github.com/timeplus-io/proton-go-driver/v2 v2.1.2
161-
github.com/tmc/langchaingo v0.1.13
161+
github.com/tmc/langchaingo v0.1.14
162162
github.com/trinodb/trino-go-client v0.330.0
163163
github.com/twmb/franz-go v1.20.6
164164
github.com/twmb/franz-go/pkg/kadm v1.17.1
@@ -338,7 +338,7 @@ require (
338338
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.9 // indirect
339339
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
340340
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.9 // indirect
341-
github.com/aws/aws-sdk-go-v2/service/dynamodbstreams v1.31.0 // indirect
341+
github.com/aws/aws-sdk-go-v2/service/dynamodbstreams v1.31.0
342342
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1 // indirect
343343
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.0 // indirect
344344
github.com/aws/aws-sdk-go-v2/service/internal/endpoint-discovery v1.11.9 // indirect

go.sum

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2165,8 +2165,8 @@ github.com/tklauser/go-sysconf v0.3.15 h1:VE89k0criAymJ/Os65CSn1IXaol+1wrsFHEB8O
21652165
github.com/tklauser/go-sysconf v0.3.15/go.mod h1:Dmjwr6tYFIseJw7a3dRLJfsHAMXZ3nEnL/aZY+0IuI4=
21662166
github.com/tklauser/numcpus v0.10.0 h1:18njr6LDBk1zuna922MgdjQuJFjrdppsZG60sHGfjso=
21672167
github.com/tklauser/numcpus v0.10.0/go.mod h1:BiTKazU708GQTYF4mB+cmlpT2Is1gLk7XVuEeem8LsQ=
2168-
github.com/tmc/langchaingo v0.1.13 h1:rcpMWBIi2y3B90XxfE4Ao8dhCQPVDMaNPnN5cGB1CaA=
2169-
github.com/tmc/langchaingo v0.1.13/go.mod h1:vpQ5NOIhpzxDfTZK9B6tf2GM/MoaHewPWM5KXXGh7hg=
2168+
github.com/tmc/langchaingo v0.1.14 h1:o1qWBPigAIuFvrG6cjTFo0cZPFEZ47ZqpOYMjM15yZc=
2169+
github.com/tmc/langchaingo v0.1.14/go.mod h1:aKKYXYoqhIDEv7WKdpnnCLRaqXic69cX9MnDUk72378=
21702170
github.com/tmthrgd/go-hex v0.0.0-20190904060850-447a3041c3bc h1:9lRDQMhESg+zvGYmW5DyG0UqvY96Bu5QYsTLvCHdrgo=
21712171
github.com/tmthrgd/go-hex v0.0.0-20190904060850-447a3041c3bc/go.mod h1:bciPuU6GHm1iF1pBvUfxfsH0Wmnc2VbpgvbI9ZWuIRs=
21722172
github.com/trinodb/trino-go-client v0.330.0 h1:TBbHjFBuRjYbGtkNyRAJfzLOcwvz8ECihtMtxSzXqOc=
@@ -2273,8 +2273,8 @@ gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638/go.mod h1:EGRJaqe2eO9X
22732273
go.einride.tech/aip v0.73.0 h1:bPo4oqBo2ZQeBKo4ZzLb1kxYXTY1ysJhpvQyfuGzvps=
22742274
go.einride.tech/aip v0.73.0/go.mod h1:Mj7rFbmXEgw0dq1dqJ7JGMvYCZZVxmGOR3S4ZcV5LvQ=
22752275
go.etcd.io/bbolt v1.3.6/go.mod h1:qXsaaIqmgQH0T+OPdb99Bf+PKfBBQVAdyD6TY9G8XM4=
2276-
go.etcd.io/bbolt v1.3.10 h1:+BqfJTcCzTItrop8mq/lbzL8wSGtj94UO/3U31shqG0=
2277-
go.etcd.io/bbolt v1.3.10/go.mod h1:bK3UQLPJZly7IlNmV7uVHJDxfe5aK9Ll93e/74Y9oEQ=
2276+
go.etcd.io/bbolt v1.3.11 h1:yGEzV1wPz2yVCLsD8ZAiGHhHVlczyC9d1rP43/VCRJ0=
2277+
go.etcd.io/bbolt v1.3.11/go.mod h1:dksAq7YMXoljX0xu6VF5DMZGbhYYoLUalEiSySYAS4I=
22782278
go.mongodb.org/mongo-driver v1.11.4/go.mod h1:PTSz5yu21bkT/wXpkS7WR5f0ddqw5quethTUn9WM+2g=
22792279
go.mongodb.org/mongo-driver/v2 v2.3.1 h1:WrCgSzO7dh1/FrePud9dK5fKNZOE97q5EQimGkos7Wo=
22802280
go.mongodb.org/mongo-driver/v2 v2.3.1/go.mod h1:jHeEDJHJq7tm6ZF45Issun9dbogjfnPySb1vXA7EeAI=

0 commit comments

Comments
 (0)