Commit f5d7bb7

Update to KCL v3.0.0 (#254)
Update pom and properties for KCL v3.0.0 compatibility
1 parent 3868914 commit f5d7bb7

4 files changed

+142
-206
lines changed

README.md

Lines changed: 33 additions & 196 deletions
Original file line number | Diff line number | Diff line change
@@ -146,202 +146,39 @@ all languages.
146146

147147
## Release Notes
148148

149-
### Release 2.1.5 (May 29, 2024)
150-
* Fixed CI due to different macOS architecture [PR #246](https://github.com/awslabs/amazon-kinesis-client-python/pull/246)
151-
* Added necessary Java SDKs to run sample [PR #248](https://github.com/awslabs/amazon-kinesis-client-python/pull/248)
152-
* Upgraded boto dependency to boto3 [PR #245](https://github.com/awslabs/amazon-kinesis-client-python/pull/245)
153-
* Upgraded AWS SDK from 2.19.2 to 2.25.11 [PR #248](https://github.com/awslabs/amazon-kinesis-client-python/pull/248)
154-
* Upgraded aws-java-sdk from 1.12.370 to 1.12.668 [PR #248](https://github.com/awslabs/amazon-kinesis-client-python/pull/248)
155-
156-
### Release 2.1.4 (April 23, 2024)
157-
* Upgraded KCL and KCL-Multilang dependencies from 2.5.2 to 2.5.8 [PR #239](https://github.com/awslabs/amazon-kinesis-client-python/pull/239)
158-
* Upgraded ion-java from 1.5.1 to 1.11.4 [PR #243](https://github.com/awslabs/amazon-kinesis-client-python/pull/243)
159-
* Upgraded logback version from 1.3.0 to 1.3.12 [PR #242](https://github.com/awslabs/amazon-kinesis-client-python/pull/242)
160-
* Upgraded io.netty dependency from 4.1.86.Final to 4.1.94.Final [PR #234](https://github.com/awslabs/amazon-kinesis-client-python/pull/234)
161-
* Upgraded Google Guava dependency from 32.0.0-jre to 32.1.1-jre [PR #234](https://github.com/awslabs/amazon-kinesis-client-python/pull/234)
162-
* Upgraded jackson-databind from 2.13.4 to 2.13.5 [PR #234](https://github.com/awslabs/amazon-kinesis-client-python/pull/234)
163-
* Upgraded protobuf-java from 3.21.5 to 3.21.7 [PR #234](https://github.com/awslabs/amazon-kinesis-client-python/pull/234)
164-
165-
### Release 2.1.3 (August 8, 2023)
166-
* Added the ability to specify STS endpoint and region [PR #221](https://github.com/awslabs/amazon-kinesis-client-python/pull/230)
167-
* Upgraded KCL and KCL-Multilang Dependencies from 2.5.1 to 2.5.2 [PR #221](https://github.com/awslabs/amazon-kinesis-client-python/pull/230)
168-
169-
### Release 2.1.2 (June 29, 2023)
170-
* Added the ability to pass in streamArn to multilang Daemon [PR #221](https://github.com/awslabs/amazon-kinesis-client-python/pull/221)
171-
* Upgraded KCL and KCL-Multilang Dependencies from 2.4.4 to 2.5.1 [PR #221](https://github.com/awslabs/amazon-kinesis-client-python/pull/221)
172-
* Upgraded Google Guava dependency from 31.0.1-jre to 32.0.0-jre [PR #223](https://github.com/awslabs/amazon-kinesis-client-python/pull/223)
173-
* Added aws-java-sdk-sts dependency [PR #212](https://github.com/awslabs/amazon-kinesis-client-python/pull/212)
174-
175-
### Release 2.1.1 (January 17, 2023)
176-
* Include the pom file in MANIFEST
177-
178-
### Release 2.1.0 (January 12, 2023)
179-
* Upgraded to use version 2.4.4 of the [Amazon Kinesis Client library][kinesis-github]
180-
181-
### Release 2.0.6 (November 23, 2021)
182-
* Upgraded multiple dependencies [PR #152](https://github.com/awslabs/amazon-kinesis-client-python/pull/152)
183-
* Amazon Kinesis Client Library 2.3.9
184-
* ch.qos.logback 1.2.7
185-
186-
### Release 2.0.5 (November 11, 2021)
187-
* Upgraded multiple dependencies [PR #148](https://github.com/awslabs/amazon-kinesis-client-python/pull/148)
188-
* Amazon Kinesis Client Library 2.3.8
189-
* AWS SDK 2.17.52
190-
* Added dependencies
191-
* AWS SDK json-utils 2.17.52
192-
* third-party-jackson-core 2.17.52
193-
* third-party-jackson-dataformat-cbor 2.17.52
194-
* Updated samples/sample.properties reflecting support for InitialPositionInStreamExtended
195-
* Related: [#804](https://github.com/awslabs/amazon-kinesis-client/pull/804) Allowing user to specify an initial timestamp in which daemon will process records.
196-
* Feature released with previous [release 2.0.4](https://github.com/awslabs/amazon-kinesis-client-python/releases/tag/v2.0.4)
197-
198-
### Release 2.0.4 (October 26, 2021)
199-
* Revert/downgrade multiple dependencies as KCL 2.3.7 contains breaking change [PR #145](https://github.com/awslabs/amazon-kinesis-client-python/pull/145)
200-
* Amazon Kinesis Client Library 2.3.6
201-
* AWS SDK 2.16.98
202-
* Upgraded dependencies
203-
* jackson-dataformat-cbor 2.12.4
204-
* AWS SDK 1.12.3
205-
206-
### :warning: [BREAKING CHANGES] Release 2.0.3 (October 21, 2021)
207-
* Upgraded multiple dependencies in [PR #142](https://github.com/awslabs/amazon-kinesis-client-python/pull/142)
208-
* Amazon Kinesis Client Library 2.3.7
209-
* AWS SDK 2.17.52
210-
* AWS Java SDK 1.12.1
211-
* AWS Glue 1.1.5
212-
* Jackson 2.12.4
213-
* io.netty 4.1.68.Final
214-
* guava 31.0.1-jre
215-
216-
### Release 2.0.2 (June 4, 2021)
217-
* Upgraded multiple dependencies in [PR #137](https://github.com/awslabs/amazon-kinesis-client-python/pull/137)
218-
* Amazon Kinesis Client Library 2.3.4
219-
* AWS SDK 2.16.75
220-
* AWS Java SDK 1.11.1031
221-
* Amazon ion java 1.5.1
222-
* Jackson 2.12.3
223-
* io.netty 4.1.65.Final
224-
* typeface netty 2.0.5
225-
* reactivestreams 1.0.3
226-
* guava 30.1.1-jre
227-
* Error prone annotations 2.7.1
228-
* j2objc annotations 2.7.1
229-
* Animal sniffer annotations 1.20
230-
* slf4j 1.7.30
231-
* protobuf 3.17.1
232-
* Joda time 2.10.10
233-
* Apache httpclient 4.5.13
234-
* Apache httpcore 4.4.14
235-
* commons lang3 3.12.0
236-
* commons logging 1.2
237-
* commons beanutils 1.9.4
238-
* commons codec 1.15
239-
* commons collections4 4.4
240-
* commons io 2.9.0
241-
* jcommander 1.81
242-
* rxjava 2.2.21
243-
* Added Amazon Glue schema registry 1.0.2
244-
245-
### Release 2.0.1 (February 27, 2019)
246-
* Updated to version 2.1.2 of the Amazon Kinesis Client Library for Java.
247-
This update also includes version 2.4.0 of the AWS Java SDK.
248-
* [PR #92](https://github.com/awslabs/amazon-kinesis-client-python/pull/92)
249-
250-
### Release 2.0.0 (January 15, 2019)
251-
* Introducing support for Enhanced Fan-Out
252-
* Updated to version 2.1.0 of the Amazon Kinesis Client for Java
253-
* Version 2.1.0 now defaults to using [`RegisterStreamConsumer` Kinesis API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_RegisterStreamConsumer.html), which provides dedicated throughput compared to `GetRecords`.
254-
* Version 2.1.0 now defaults to using [`SubscribeToShard` Kinesis API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_SubscribeToShard.html), which provides lower latencies than `GetRecords`.
255-
* __WARNING: `RegisterStreamConsumer` and `SubscribeToShard` are new APIs, and may require updating any explicit IAM policies__
256-
* For more information about Enhanced Fan-Out and Polling with the KCL check out the [announcement](https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/) and [developer documentation](https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html).
257-
* Introducing version 3 of the `RecordProcessorBase` which supports the new `ShardRecordProcessor` interface
258-
* The `shutdown` method from version 2 has been removed and replaced by `leaseLost` and `shardEnded` methods.
259-
* Introducing `leaseLost` method, which takes `LeaseLostInput` object and is invoked when a lease is lost.
260-
* Introducing `shardEnded` method, which takes `ShardEndedInput` object and is invoked when all records from a split/merge have been processed.
261-
* Updated AWS SDK version to 2.2.0
262-
* MultiLangDaemon now uses logging using logback
263-
* MultiLangDaemon supports custom logback.xml file via the `--log-configuration` option.
264-
* `amazon_kclpy_helper` script supports `--log-configuration` option for command generation.
265-
266-
### Release 1.5.1 (January 2, 2019)
267-
* Updated to version 1.9.3 of the Amazon Kinesis Client Library for Java.
268-
* [PR #87](https://github.com/awslabs/amazon-kinesis-client-python/pull/87)
269-
* Changed to now download jars from Maven using https.
270-
* [PR #87](https://github.com/awslabs/amazon-kinesis-client-python/pull/87)
271-
* Changed to raise exception when downloading from Maven fails.
272-
* [PR #80](https://github.com/awslabs/amazon-kinesis-client-python/pull/80)
273-
274-
### Release 1.5.0 (February 7, 2018)
275-
* Updated to version 1.9.0 of the Amazon Kinesis Client Library for Java
276-
* Version 1.9.0 now uses the [`ListShards` Kinesis API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListShards.html), which provides a higher call rate than `DescribeStream`.
277-
* __WARNING: `ListShards` is a new API, and may require updating any explicit IAM policies__
278-
* [PR #71](https://github.com/awslabs/amazon-kinesis-client-python/pull/71)
279-
280-
### Release 1.4.5 (June 28, 2017)
281-
* Record processors can now be notified, and given a final opportunity to checkpoint, when the KCL is being shutdown.
282-
* [PR #53](https://github.com/awslabs/amazon-kinesis-client-python/pull/53)
283-
* [PR #56](https://github.com/awslabs/amazon-kinesis-client-python/pull/56)
284-
* [PR #57](https://github.com/awslabs/amazon-kinesis-client-python/pull/57)
285-
286-
To use this feature the record processor must implement the `shutdown_requested` operation from the respective processor module.
287-
See [v2/processor.py](https://github.com/awslabs/amazon-kinesis-client-python/blob/master/amazon_kclpy/v2/processor.py#L76) or [kcl.py](https://github.com/awslabs/amazon-kinesis-client-python/blob/master/amazon_kclpy/kcl.py#L223) for the required API.
288-
289-
### Release 1.4.4 (April 7, 2017)
290-
* [PR #47](https://github.com/awslabs/amazon-kinesis-client-python/pull/47): Update to release 1.7.5 of the Amazon Kinesis Client.
291-
* Additionally updated to version 1.11.115 of the AWS Java SDK.
292-
* Fixes [Issue #43](https://github.com/awslabs/amazon-kinesis-client-python/issues/43).
293-
* Fixes [Issue #27](https://github.com/awslabs/amazon-kinesis-client-python/issues/27).
294-
295-
### Release 1.4.3 (January 3, 2017)
296-
* [PR #39](https://github.com/awslabs/amazon-kinesis-client-python/pull/39): Make record objects subscriptable for backwards compatibility.
297-
298-
### Release 1.4.2 (November 21, 2016)
299-
* [PR #35](https://github.com/awslabs/amazon-kinesis-client-python/pull/35): Downloading JAR files now runs correctly.
300-
301-
### Release 1.4.1 (November 18, 2016)
302-
* Installation of the library into a virtual environment on macOS, and Windows now correctly downloads the jar files.
303-
* Fixes [Issue #33](https://github.com/awslabs/amazon-kinesis-client-python/issues/33)
304-
305-
### Release 1.4.0 (November 9, 2016)
306-
* Added a new v2 record processor class that allows access to updated features.
307-
* Record processor initialization
308-
* The initialize method receives an InitializeInput object that provides shard id, and the starting sequence and sub sequence numbers.
309-
* Process records calls
310-
* The process_records calls now receives a ProcessRecordsInput object that, in addition to the records, now includes the millisBehindLatest for the batch of records
311-
* Records are now represented as a Record object that adds new data, and includes some convenience methods
312-
* Adds a `binary_data` method that handles the base 64 decode of the data.
313-
* Includes the sub sequence number of the record.
314-
* Includes the approximate arrival time stamp of the record.
315-
* Record processor shutdown
316-
* The method `shutdown` now receives a `ShutdownInput` object.
317-
* Checkpoint methods now accept a sub sequence number in addition to the sequence number.
318-
319-
### Release 1.3.1
320-
* Version number increase to stay inline with PyPI.
321-
322-
### Release 1.3.0
323-
* Updated dependency to Amazon KCL version 1.6.4
324-
325-
### Release 1.2.0
326-
* Updated dependency to Amazon KCL version 1.6.1
327-
328-
### Release 1.1.0 (January 27, 2015)
329-
* **Python 3 support** All Python files are compatible with Python 3
330-
331-
### Release 1.0.0 (October 21, 2014)
332-
* **amazon_kclpy** module exposes an interface to allow implementation of record processor executables that are compatible with the MultiLangDaemon
333-
* **samples** module provides a sample putter application using [boto][boto] and a sample processing app using `amazon_kclpy`
334-
335-
[amazon-kinesis-shard]: http://docs.aws.amazon.com/kinesis/latest/dev/key-concepts.html
336-
[amazon-kinesis-docs]: http://aws.amazon.com/documentation/kinesis/
337-
[amazon-kcl]: http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-app.html
338-
[multi-lang-daemon]: https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java
339-
[kinesis]: http://aws.amazon.com/kinesis
340-
[amazon-kinesis-ruby-github]: https://github.com/awslabs/amazon-kinesis-client-ruby
341-
[kinesis-github]: https://github.com/awslabs/amazon-kinesis-client
342-
[boto]: http://boto.readthedocs.org/en/latest/
343-
[DefaultCredentialsProvider]: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/auth/credentials/DefaultCredentialsProvider.html
344-
[kinesis-forum]: http://developer.amazonwebservices.com/connect/forum.jspa?forumID=169
149+
### Release 3.0.0 (November 6, 2024)
150+
* New lease assignment / load balancing algorithm
151+
* KCL 3.x introduces a new lease assignment and load balancing algorithm. It assigns leases among workers based on worker utilization metrics and throughput on each lease, replacing the previous lease count-based lease assignment algorithm.
152+
* When KCL detects higher variance in CPU utilization among workers, it proactively reassigns leases from over-utilized workers to under-utilized workers for even load balancing. This ensures even CPU utilization across workers and removes the need to over-provision the stream processing compute hosts.
153+
* Optimized DynamoDB RCU usage
154+
* KCL 3.x optimizes DynamoDB read capacity unit (RCU) usage on the lease table by implementing a global secondary index with leaseOwner as the partition key. This index mirrors the leaseKey attribute from the base lease table, allowing workers to efficiently discover their assigned leases by querying the index instead of scanning the entire table.
155+
* This approach significantly reduces read operations compared to earlier KCL versions, where workers performed full table scans, resulting in higher RCU consumption.
156+
* Graceful lease handoff
157+
* KCL 3.x introduces a feature called "graceful lease handoff" to minimize data reprocessing during lease reassignments. Graceful lease handoff allows the current worker to complete checkpointing of processed records before transferring the lease to another worker. For graceful lease handoff, you should implement checkpointing logic within the existing `shutdownRequested()` method.
158+
* This feature is enabled by default in KCL 3.x, but you can turn off this feature by adjusting the configuration property `isGracefulLeaseHandoffEnabled`.
159+
* While this approach significantly reduces the probability of data reprocessing during lease transfers, it doesn't completely eliminate the possibility. To maintain data integrity and consistency, it's crucial to design your downstream consumer applications to be idempotent. This ensures that the application can handle potential duplicate record processing without adverse effects.
160+
* New DynamoDB metadata management artifacts
161+
* KCL 3.x introduces two new DynamoDB tables for improved lease management:
162+
* Worker metrics table: Records CPU utilization metrics from each worker. KCL uses these metrics for optimal lease assignments, balancing resource utilization across workers. If the CPU utilization metric is not available, KCL instead assigns leases to balance the total shard throughput per worker.
163+
* Coordinator state table: Stores internal state information for workers. Used to coordinate in-place migration from KCL 2.x to KCL 3.x and leader election among workers.
164+
* Follow this [documentation](https://docs.aws.amazon.com/streams/latest/dev/kcl-migration-from-2-3.html#kcl-migration-from-2-3-IAM-permissions) to add required IAM permissions for your KCL application.
165+
* Other improvements and changes
166+
* Dependency on the AWS SDK for Java 1.x has been fully removed.
167+
* The Glue Schema Registry integration functionality no longer depends on AWS SDK for Java 1.x. Previously, it required this as a transitive dependency.
168+
* MultiLangDaemon has been upgraded to use AWS SDK for Java 2.x. It no longer depends on AWS SDK for Java 1.x.
169+
* `idleTimeBetweenReadsInMillis` (PollingConfig) now has a minimum default value of 200.
170+
* This polling configuration property determines the [publisher's](https://github.com/awslabs/amazon-kinesis-client/blob/master/amazon-kinesis-client/src/main/java/software/amazon/kinesis/retrieval/polling/PrefetchRecordsPublisher.java) wait time between GetRecords calls in both success and failure cases. Previously, setting this value below 200 caused unnecessary throttling, because Amazon Kinesis Data Streams supports up to five read transactions per second per shard for shared-throughput consumers (that is, at most one read every 200 ms).
171+
* Shard lifecycle management is improved to deal with edge cases around shard splits and merges to ensure records continue being processed as expected.
172+
* Migration
173+
* The programming interfaces of KCL 3.x remain identical to those of KCL 2.x for an easier migration. For detailed migration instructions, please refer to the [Migrate consumers from KCL 2.x to KCL 3.x](https://docs.aws.amazon.com/streams/latest/dev/kcl-migration-from-2-3.html) page in the Amazon Kinesis Data Streams developer guide.
174+
* Configuration properties
175+
* New configuration properties introduced in KCL 3.x are listed in this [doc](https://github.com/awslabs/amazon-kinesis-client/blob/master/docs/kcl-configurations.md#new-configurations-in-kcl-3x).
176+
* Deprecated configuration properties in KCL 3.x are listed in this [doc](https://github.com/awslabs/amazon-kinesis-client/blob/master/docs/kcl-configurations.md#discontinued-configuration-properties-in-kcl-3x). You need to keep the deprecated configuration properties during the migration from any previous KCL version to KCL 3.x.
177+
* Metrics
178+
* New CloudWatch metrics introduced in KCL 3.x are explained in [Monitor the Kinesis Client Library with Amazon CloudWatch](https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kcl.html) in the Amazon Kinesis Data Streams developer guide. The following operations are newly added in KCL 3.x:
179+
* `LeaseAssignmentManager`
180+
* `WorkerMetricStatsReporter`
181+
* `LeaseDiscovery`
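
The graceful lease handoff notes above ask for two things from a record processor: checkpoint in the shutdown-requested callback, and keep downstream writes idempotent. The following is a minimal, self-contained sketch of that pattern, not the library's implementation. `FakeCheckpointer`, `HandoffAwareProcessor`, `batch`, and the dict-based `sink` are all hypothetical stand-ins; the method names `process_records` and `shutdown_requested` are assumed to mirror the Python record processor interface, and a real processor would subclass the `amazon_kclpy` base class rather than a plain class.

```python
from types import SimpleNamespace


class FakeCheckpointer:
    """Hypothetical stand-in for the KCL checkpointer; it only records calls."""

    def __init__(self):
        self.checkpoints = []

    def checkpoint(self, sequence_number=None):
        self.checkpoints.append(sequence_number)


class HandoffAwareProcessor:
    """Illustrative processor (assumption: not the real kclpy base class)."""

    def __init__(self, sink):
        # `sink` models a downstream store keyed by sequence number.
        self.sink = sink
        self.last_sequence_number = None

    def process_records(self, process_records_input):
        for record in process_records_input.records:
            # Upsert by sequence number: replaying a record overwrites the
            # same key, so duplicates after a lease transfer are harmless.
            self.sink[record.sequence_number] = record.data
            self.last_sequence_number = record.sequence_number

    def shutdown_requested(self, shutdown_requested_input):
        # Checkpoint what has been processed before the lease moves on.
        if self.last_sequence_number is not None:
            shutdown_requested_input.checkpointer.checkpoint(
                self.last_sequence_number
            )


def batch(*pairs):
    """Build a fake input object from (sequence_number, data) pairs."""
    records = [SimpleNamespace(sequence_number=s, data=d) for s, d in pairs]
    return SimpleNamespace(records=records)


sink = {}
old_worker = HandoffAwareProcessor(sink)
old_worker.process_records(batch(("1", b"a"), ("2", b"b")))

# The current lease owner is asked to shut down and checkpoints first.
checkpointer = FakeCheckpointer()
old_worker.shutdown_requested(SimpleNamespace(checkpointer=checkpointer))

# The new lease owner replays record "2" and then continues with "3".
new_worker = HandoffAwareProcessor(sink)
new_worker.process_records(batch(("2", b"b"), ("3", b"c")))
```

Keying the sink by sequence number is only one way to get idempotency; any deterministic record identifier that survives a replay would serve the same purpose.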
345182

346183
## License
347184

pom.xml

Lines changed: 14 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -2,9 +2,8 @@
22
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
33
<modelVersion>4.0.0</modelVersion>
44
<properties>
5-
<awssdk.version>2.25.11</awssdk.version>
6-
<aws-java-sdk.version>1.12.668</aws-java-sdk.version>
7-
<kcl.version>2.5.8</kcl.version>
5+
<awssdk.version>2.25.64</awssdk.version>
6+
<kcl.version>3.0.0</kcl.version>
87
<netty.version>4.1.108.Final</netty.version>
98
<netty-reactive.version>2.0.6</netty-reactive.version>
109
<fasterxml-jackson.version>2.13.5</fasterxml-jackson.version>
@@ -31,6 +30,18 @@
3130
<artifactId>dynamodb</artifactId>
3231
<version>${awssdk.version}</version>
3332
</dependency>
33+
<!-- https://mvnrepository.com/artifact/software.amazon.awssdk/dynamodb-enhanced -->
34+
<dependency>
35+
<groupId>software.amazon.awssdk</groupId>
36+
<artifactId>dynamodb-enhanced</artifactId>
37+
<version>${awssdk.version}</version>
38+
</dependency>
39+
<!-- https://mvnrepository.com/artifact/com.amazonaws/dynamodb-lock-client -->
40+
<dependency>
41+
<groupId>com.amazonaws</groupId>
42+
<artifactId>dynamodb-lock-client</artifactId>
43+
<version>1.3.0</version>
44+
</dependency>
3445
<dependency>
3546
<groupId>software.amazon.awssdk</groupId>
3647
<artifactId>cloudwatch</artifactId>

0 commit comments

Comments
 (0)