Commit bee1c0f
Kafka source recover from RebalanceInProgressException
When a consumer group rebalances, the Kafka consumer must commit offsets
before partitions are revoked.
Previously we were using "Eager" rebalance revoke mode from fs2-kafka,
with the following consequences:
1. Partitions were revoked immediately without waiting for the
application to commit offsets. This could lead to duplicates.
2. The commit occasionally failed with a RebalanceInProgressException,
if the application tried to commit offsets after the rebalance had
started.
This change switches to "Graceful" rebalance revoke mode. It works as
follows:
1. The source waits up to session.timeout.ms for the fs2 stream to
finalize. This includes committing outstanding offsets.
2. Rebalancing only proceeds after the fs2 stream has finalized, or
after session.timeout.ms. This reduces the possibility of duplicates
due to re-processing.
3. We catch and ignore the RebalanceInProgressException in case the
downstream application cannot finalize the fs2 stream within the
session timeout. This is needed e.g. for Lake Loader which is slow
to finalize a window.
1 parent 6ab723e commit bee1c0f
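The three steps of the "Graceful" mode described in the commit message can be sketched roughly as follows. This is an illustration, not the code from this commit: it assumes fs2-kafka 3.x, where `ConsumerSettings` is expected to expose `withRebalanceRevokeMode`, together with cats-effect 3. The object name, the helper `commitIgnoringRebalance`, the bootstrap server, the group id and the 45 second timeout are all hypothetical placeholders.

```scala
import scala.concurrent.duration._

import cats.effect.IO
import fs2.kafka.{ConsumerSettings, RebalanceRevokeMode}
import org.apache.kafka.common.errors.RebalanceInProgressException

// Hypothetical sketch; names and values are placeholders, not taken from KafkaSource.scala.
object GracefulRebalanceSketch {

  // Steps 1 and 2: ask fs2-kafka to delay partition revocation until the
  // stream has finalized, bounded by the session timeout.
  val settings: ConsumerSettings[IO, String, String] =
    ConsumerSettings[IO, String, String]
      .withBootstrapServers("localhost:9092") // placeholder
      .withGroupId("example-group")           // placeholder
      .withSessionTimeout(45.seconds)         // maps to session.timeout.ms
      .withRebalanceRevokeMode(RebalanceRevokeMode.Graceful)

  // Step 3: a commit that arrives after the rebalance has started is
  // treated as a no-op; every other error is re-raised unchanged.
  def commitIgnoringRebalance(commit: IO[Unit]): IO[Unit] =
    commit.handleErrorWith {
      case _: RebalanceInProgressException => IO.unit
      case other                           => IO.raiseError(other)
    }
}
```

In this sketch, an ignored commit means the affected offsets stay uncommitted, so the partition's next owner may re-process those records. That matches the commit message's wording that the change reduces, rather than eliminates, the possibility of duplicates.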
1 file changed: 5 additions & 1 deletion
modules/kafka/src/main/scala/com/snowplowanalytics/snowplow/streams/kafka/source/KafkaSource.scala
| Original file lines | Diff lines | Change (line contents not preserved) |
|---|---|---|
| 22-27 | 22-28 | 1 line added at diff line 25 |
| 68-74 | 69-77 | original line 71 removed; diff lines 72-74 added |
| 145-148 | 148-152 | 1 line added at diff line 151 |