Skip to content

Rebalance coordinator should not wait on a failed stream #1572

@Awethon

Description

@Awethon

When a consumer receives an unhandled exception during processing, it keeps working until timeout:

timestamp=2025-09-10T19:26:22.789653Z level=ERROR thread=#zio-fiber-1028656011 message="Error in withStream fiber in runWithGracefulShutdown" cause="java.lang.RuntimeException: Simulated failure

timestamp=2025-09-10T19:13:18.599604Z level=WARN thread=#zio-fiber-1784753688 message="Exceeded deadline waiting for streams (of revoked partitions) to commit the offsets of the records they consumed; the rebalance will continue. This might cause another consumer to process some records again. Revoked partitions: kafka-support-test-topic-0: stream ended, last pulled offset=1, last committed offset=none, not committed" location=zio.kafka.consumer.internal.RebalanceCoordinator.doAwaitStreamCommits.logFinalStreamCompletionStatuses file=RebalanceCoordinator.scala line=125

But since the exception was already returned for a record, there's no option to commit it, and no other records are provided to the consumer.
This prevents a quick recovery (redeployment or restart of a consumer) and also tests the fail-fast behavior of a consumer idle for a long time.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions