Potential Bug: Missing Failed Records During Async Operation #3638

@chickenchickenlove

Description

In what version(s) of Spring for Apache Kafka are you seeing this issue?

3.3-SNAPSHOT

Describe the bug

Following this issue, spring-kafka supports async retry with retry topics.
However, I believe the implementation has a potential race condition, described below.

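protected void handleAsyncFailure() {
    // Snapshot the failed records collected so far.
    List<FailedRecordTuple<K, V>> copyFailedRecords = new ArrayList<>(this.failedRecords);
    // Not atomic with the copy above: a record added in between is cleared without being copied.
    this.failedRecords.clear();
    // ...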

Consider the following scenario, where Thread A is an executor thread running a Mono or CompletableFuture:

  1. Main Thread: copies the records from failedRecords. At this point failedRecords.size() is 100, so the main thread holds 100 failed records to retry.
  2. Thread A: encounters an exception during its operation and adds the record to failedRecords. Now failedRecords.size() is 101.
  3. Main Thread: clears failedRecords by calling failedRecords.clear().

In this scenario, the main thread holds 100 failed records to retry, but it has removed 101 records from failedRecords.
Therefore, one failed record is lost and will never be retried.
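
The interleaving above can be replayed deterministically in a single thread, which makes the lost record easy to see. This is only a sketch: the class name LostFailedRecordDemo, the String records, and the use of ConcurrentLinkedDeque as a stand-in for the container's failedRecords field are assumptions for illustration, not the actual spring-kafka code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

public class LostFailedRecordDemo {

    // Replays the three steps in a fixed order and returns how many records were lost.
    public static int replayInterleaving() {
        // Hypothetical stand-in for the container's failedRecords field.
        ConcurrentLinkedDeque<String> failedRecords = new ConcurrentLinkedDeque<>();
        for (int i = 0; i < 100; i++) {
            failedRecords.add("record-" + i);
        }

        // Step 1 (main thread): snapshot the 100 records.
        List<String> copyFailedRecords = new ArrayList<>(failedRecords);

        // Step 2 (Thread A): a late failure arrives after the snapshot.
        failedRecords.add("record-100");
        int sizeBeforeClear = failedRecords.size();

        // Step 3 (main thread): clear() drops all 101 records, including the late one.
        failedRecords.clear();

        // Records removed from the queue but never copied for retry.
        return sizeBeforeClear - copyFailedRecords.size();
    }

    public static void main(String[] args) {
        // prints "lost records: 1"
        System.out.println("lost records: " + replayInterleaving());
    }
}
```

In a real container the window between the copy and the clear is small, so the loss would be rare and hard to reproduce, which is why the steps are sequenced by hand here.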

To Reproduce

  • None (it's a potential bug)

Expected behavior

The KafkaMessageListenerContainer should not miss any failedRecords during handleAsyncFailure.
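
One possible way to get this behavior, sketched here under assumptions, is to drain the collection element by element instead of copying and then clearing. The helper name drain and the class DrainFailedRecords are hypothetical, and the sketch assumes failedRecords is a concurrent deque; it is not the fix adopted by the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

public class DrainFailedRecords {

    // Removes records one at a time. Each pollFirst() is atomic, so a record
    // added concurrently is either returned by this call or left in the deque
    // for the next call; it is never removed without being returned.
    public static <T> List<T> drain(ConcurrentLinkedDeque<T> failedRecords) {
        List<T> drained = new ArrayList<>();
        T next;
        while ((next = failedRecords.pollFirst()) != null) {
            drained.add(next);
        }
        return drained;
    }

    public static void main(String[] args) {
        ConcurrentLinkedDeque<String> failedRecords = new ConcurrentLinkedDeque<>();
        for (int i = 0; i < 100; i++) {
            failedRecords.add("record-" + i);
        }

        List<String> firstBatch = drain(failedRecords);

        // A failure that arrives after the drain stays queued for the next pass.
        failedRecords.add("record-100");
        List<String> secondBatch = drain(failedRecords);

        System.out.println("first batch: " + firstBatch.size());
        System.out.println("second batch: " + secondBatch.size());
    }
}
```

A trade-off of this approach is that the loop keeps running while other threads keep adding records, so a bounded variant may be preferable if failures can arrive continuously.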

Sample

  • None (it's a potential bug)
