@lucasbru
Member

@lucasbru lucasbru commented Dec 18, 2025

Add new developer guide page documenting the broker-driven Streams
Rebalance Protocol, including features, configuration, administration,
and architecture. Update navigation links across developer guide pages
to integrate the new section.

Some of the pagination links didn't look right. Not sure if it matters,
since we are replacing this representation anyway.

Reviewers: Matthias J. Sax [email protected]

Contributor

Copilot AI left a comment

Pull request overview

This PR adds comprehensive documentation for the Streams Rebalance Protocol (KIP-1071), a broker-driven rebalancing system for Kafka Streams applications introduced in Apache Kafka 4.2. The documentation covers features, configuration, administration, and architecture details.

Key changes:

  • New developer guide page documenting the Streams Rebalance Protocol with detailed sections on configuration, administration, architecture, and migration
  • Updated pagination links across multiple developer guide pages to properly integrate the new documentation page into the navigation flow

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docs/streams/developer-guide/streams-rebalance-protocol.html | New comprehensive documentation page covering the Streams Rebalance Protocol, including overview, configuration options, administration tools, architecture details, and migration guidance |
| docs/streams/developer-guide/processor-api.html | Updated pagination to point "Next" link to dsl-topology-naming instead of datatypes, maintaining proper navigation flow |
| docs/streams/developer-guide/dsl-topology-naming.html | Added pagination block to connect to processor-api (previous) and datatypes (next) |
| docs/streams/developer-guide/datatypes.html | Updated pagination to point "Previous" link to dsl-topology-naming instead of processor-api |
| docs/streams/developer-guide/app-reset-tool.html | Updated pagination to point "Next" link to streams-rebalance-protocol instead of upgrade-guide |
| docs/streams/developer-guide/kafka-streams-group-sh.html | Added pagination block to connect to streams-rebalance-protocol (previous) and upgrade-guide (next) |
| docs/streams/developer-guide/index.html | Added link to the new streams-rebalance-protocol page in the table of contents |
| docs/documentation/streams/developer-guide/streams-rebalance-protocol.html | Redirect file that includes the main streams-rebalance-protocol.html to support version-agnostic documentation URLs |



<ul class="simple">
<li>Check that all source topics exist and resolve source topic regular expressions (checking that each resolves to at least one topic).</li>
<li>Check that "copartition groups" are satisfied - that is, all source topics that are supposed to be copartitioned are indeed copartitioned.</li>

Copilot AI Dec 18, 2025

The term should be "co-partitioned" (with hyphen) to be consistent with the rest of the Kafka Streams documentation. The codebase consistently uses "co-partitioned" throughout other documentation files.

<h3><a class="toc-backref" href="#id13">Topology Configuration and Validation</a><a class="headerlink" href="#topology-configuration" title="Permalink to this headline"></a></h3>
<p>To assign tasks among streams clients, the group coordinator uses topology metadata that is initialized when a member joins the group and persisted in the consumer offsets topic.</p>

<p>Whenever a member joins the streams group, the first heartbeat request contains metadata of the topology. The metadata describes the topology as a set of subtopologies, each identified by a unique string identifier and containing metadata relevant for creation of internal topics and assignment.</p>

Copilot AI Dec 18, 2025

The term should be "sub-topologies" (with hyphen) to be consistent with the rest of the Kafka Streams documentation. While "subtopology" (without hyphen) is used in code contexts like method names, in documentation text describing the topology structure, the hyphenated form "sub-topology/sub-topologies" is consistently used (see docs/streams/tutorial.html:504-505, docs/streams/developer-guide/dsl-topology-naming.html).

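For readers following the two review threads above, the validation steps quoted from the new page can be illustrated with a small sketch. This is not the group coordinator's implementation: the class name, the method signature, and the simplification that "co-partitioned" means "same partition count" are assumptions made for illustration, and regular-expression resolution is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TopologyValidationSketch {

    // Hypothetical helper: returns a list of human-readable validation problems.
    // partitionCounts maps each existing topic to its partition count.
    static List<String> validate(Set<String> sourceTopics,
                                 List<Set<String>> copartitionGroups,
                                 Map<String, Integer> partitionCounts) {
        List<String> problems = new ArrayList<>();

        // Check 1: every source topic must exist.
        for (String topic : new TreeSet<>(sourceTopics)) {
            if (!partitionCounts.containsKey(topic)) {
                problems.add("missing source topic: " + topic);
            }
        }

        // Check 2 (assumed definition): all existing topics in a group
        // must have the same number of partitions.
        for (Set<String> group : copartitionGroups) {
            Set<Integer> counts = new TreeSet<>();
            for (String topic : group) {
                Integer count = partitionCounts.get(topic);
                if (count != null) {
                    counts.add(count);
                }
            }
            if (counts.size() > 1) {
                problems.add("not co-partitioned: " + new TreeSet<>(group));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical topics; "payments" does not exist, and the
        // two grouped topics have different partition counts.
        List<String> problems = validate(
            Set.of("orders", "customers", "payments"),
            List.of(Set.of("orders", "customers")),
            Map.of("orders", 4, "customers", 2));
        problems.forEach(System.out::println);
    }
}
```

An empty result would mean both checks passed for the given topology metadata.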
@lucasbru
Member Author

[Screenshot: localhost_8080_dev_documentation_streams_developer-guide_streams-rebalance-protocol html]

@mumrah
Member

mumrah commented Dec 22, 2025

Hey @lucasbru, we just merged the HTML to Markdown changes. You'll need to re-do these changes with the new system. Sorry for the inconvenience!

@lucasbru
Member Author

lucasbru commented Jan 5, 2026

Updated to Markdown, @mjsax.

[Screenshot: localhost_1313_43_streams_developer-guide_streams-rebalance-protocol_]

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.



@mjsax mjsax added the kip Requires or implements a KIP label Jan 9, 2026

* **Interactive Query Support**: IQ operations are compatible with the new streams protocol.

* **New Admin RPC**: The StreamsGroupDescribe RPC provides streams-specific metadata separate from consumer group information, with corresponding access via the [`Admin`](/43/javadoc/org/apache/kafka/clients/admin/Admin.html).

Member

Suggested change
* **New Admin RPC**: The StreamsGroupDescribe RPC provides streams-specific metadata separate from consumer group information, with corresponding access via the [`Admin`](/43/javadoc/org/apache/kafka/clients/admin/Admin.html).
* **New Admin RPC**: The StreamsGroupDescribe RPC provides streams-specific metadata separate from consumer group information, with corresponding access via the [`Admin`](/{version}/javadoc/org/apache/kafka/clients/admin/Admin.html).

Member

Cf #21276


* **New Admin RPC**: The StreamsGroupDescribe RPC provides streams-specific metadata separate from consumer group information, with corresponding access via the [`Admin`](/43/javadoc/org/apache/kafka/clients/admin/Admin.html).

* **CLI Integration**: You can list, describe, and delete streams groups via the [kafka-streams-groups.sh](kafka-streams-group-sh.md) script.

Member

Suggested change
* **CLI Integration**: You can list, describe, and delete streams groups via the [kafka-streams-groups.sh](kafka-streams-group-sh.md) script.
* **CLI Integration**: You can list, describe, and delete streams groups via the [bin/kafka-streams-groups.sh](kafka-streams-group-sh.md) script.


* **CLI Integration**: You can list, describe, and delete streams groups via the [kafka-streams-groups.sh](kafka-streams-group-sh.md) script.

* **Offline Migration**: After shutting down all members and waiting for their `session.timeout.ms` to expire, a classic group can be converted to a streams group and a streams group can be converted to a classic group. The only thing that will be preserved broker-side are the committed offsets of the application.

Member

Why do we need to wait for session.timeout.ms? I guess it holds if we don't send a leave group request, which is the default, but it is also possible to send a leave group request explicitly now (see https://cwiki.apache.org/confluence/display/KAFKA/KIP-1153%3A+Refactor+Kafka+Streams+CloseOptions+to+Fluent+API+Style).

If you think we want to keep the instructions simple, I am also ok to leave it as is. Just wanted to ask.

Member

> The only thing that will be preserved broker-side are the committed offsets of the application.

Could this be misleading with regard to internal topics?


* **Faster, More Stable Rebalances**: Reduces rebalance duration and impact by removing the global synchronization point. This minimizes application downtime during membership changes or failures.

* **Better Observability**: Provides dedicated metrics and admin interfaces that separate streams from consumer groups, leading to clearer troubleshooting with broker-side observability. See the [streams groups metrics]({{< relref "/43/operations/monitoring#group-coordinator-monitoring" >}}) documentation for details.

Member

Suggested change
* **Better Observability**: Provides dedicated metrics and admin interfaces that separate streams from consumer groups, leading to clearer troubleshooting with broker-side observability. See the [streams groups metrics]({{< relref "/43/operations/monitoring#group-coordinator-monitoring" >}}) documentation for details.
* **Better Observability**: Provides dedicated metrics and admin interfaces that separate streams from consumer groups, leading to clearer troubleshooting with broker-side observability. See the [streams groups metrics]({{< relref "/{version}/operations/monitoring#group-coordinator-monitoring" >}}) documentation for details.


### Broker Configuration

The protocol is enabled by default on new Apache Kafka 4.2 clusters. To enable the feature on existing clusters or to explicitly control it:

Member

Suggested change
The protocol is enabled by default on new Apache Kafka 4.2 clusters. To enable the feature on existing clusters or to explicitly control it:
The protocol is enabled by default on new Apache Kafka 4.2 clusters. To enable the feature on existing clusters (after upgrading to 4.2) or to explicitly control it:


### Admin API

Use the "streams groups" methods of the [`Admin`](/43/javadoc/org/apache/kafka/clients/admin/Admin.html) interface to manage streams groups programmatically. These APIs are mostly backed by the same implementations as the consumer group API.

Member

Suggested change
Use the "streams groups" methods of the [`Admin`](/43/javadoc/org/apache/kafka/clients/admin/Admin.html) interface to manage streams groups programmatically. These APIs are mostly backed by the same implementations as the consumer group API.
Use the "streams groups" methods of the [`Admin`](/{version}/javadoc/org/apache/kafka/clients/admin/Admin.html) interface to manage streams groups programmatically. These APIs are mostly backed by the same implementations as the consumer group API.


### kafka-streams-groups.sh

A new tool called [kafka-streams-groups.sh](kafka-streams-group-sh.md) is added for working with streams groups. It replaces `kafka-streams-application-reset` for streams groups and can be used to list, describe, and delete streams groups. See the [kafka-streams-groups.sh documentation](kafka-streams-group-sh.md) for detailed usage information.

Member

Suggested change
A new tool called [kafka-streams-groups.sh](kafka-streams-group-sh.md) is added for working with streams groups. It replaces `kafka-streams-application-reset` for streams groups and can be used to list, describe, and delete streams groups. See the [kafka-streams-groups.sh documentation](kafka-streams-group-sh.md) for detailed usage information.
A new tool called [bin/kafka-streams-groups.sh](kafka-streams-group-sh.md) is added for working with streams groups. It replaces `bin/kafka-streams-application-reset.sh` for streams groups and can be used to list, describe, and delete streams groups. See the [bin/kafka-streams-groups.sh documentation](kafka-streams-group-sh.md) for detailed usage information.


## Monitoring and Metrics

The existing group metrics are extended to differentiate between streams groups and consumer groups and account for streams group states. For complete details, see the [streams groups metrics]({{< relref "/43/operations/monitoring#group-coordinator-monitoring" >}}) documentation.

Member

Suggested change
The existing group metrics are extended to differentiate between streams groups and consumer groups and account for streams group states. For complete details, see the [streams groups metrics]({{< relref "/43/operations/monitoring#group-coordinator-monitoring" >}}) documentation.
The existing group metrics are extended to differentiate between streams groups and consumer groups and account for streams group states. For complete details, see the [streams groups metrics]({{< relref "/{version}/operations/monitoring#group-coordinator-monitoring" >}}) documentation.

Currently, only offline migration is supported. To migrate a Kafka Streams application from the classic protocol to the streams rebalance protocol:

1. Shut down all application instances.
2. Wait for the `session.timeout.ms` to expire so the group becomes empty.

Member

Similar to above.

3. Update the application configuration to set `group.protocol=streams`.
4. Restart the application instances.

The only broker-side data that will be preserved are the committed offsets of the application. All other group metadata will be recreated when the application starts with the new protocol.

Member

As above. Misleading with regard to internal topics?
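
The four migration steps quoted above come down to one client-side configuration change in step 3. A minimal sketch, assuming plain `java.util.Properties` and hypothetical values for `application.id` and `bootstrap.servers`; only `group.protocol=streams` is taken from the page under review:

```java
import java.util.Properties;

public class StreamsProtocolMigrationConfig {

    // Builds the client properties used when restarting the application (step 4).
    // The application id and bootstrap address passed in are placeholders.
    static Properties streamsProtocolProps(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        // Step 3: switch from the classic protocol to the streams rebalance protocol.
        props.put("group.protocol", "streams");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        Properties props = streamsProtocolProps("my-streams-app", "localhost:9092");
        System.out.println(props.getProperty("group.protocol"));
    }
}
```

Kafka Streams uses `application.id` as the group ID, so the sketch presumes it stays unchanged across the migration so that the committed offsets mentioned above still belong to the same group.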

lucasbru and others added 3 commits January 13, 2026 13:37
Add new developer guide page documenting the broker-driven Streams Rebalance
Protocol, including features, configuration, administration, and architecture.
Update navigation links across developer guide pages to integrate the new section.
@lucasbru
Member Author

@mjsax all addressed

@lucasbru
Member Author

[Screenshot: localhost_1313_43_streams_developer-guide_streams-rebalance-protocol_ (1)]

Member

@mjsax mjsax left a comment

Overall LGTM. A few minor follow-ups.

Also, we should add something to the top-level "notable changes" section in docs/getting-started/upgrade.md.

type: docs
description:
weight: 14
weight: 15

Member

For my own education: what is this and why does it get changed?

Member Author

It's used to order the pages in TOCs and sidebars. I wanted the streams group tool to appear below the streams rebalance protocol page.
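
As a rough illustration of the ordering described in this reply, the sketch below sorts pages by ascending weight, so a page with weight 15 is listed after one with weight 14. The page names and the assignment of weight 14 to the rebalance protocol page are assumptions; the diff only shows one file changing from 14 to 15.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PageWeightOrder {

    // Minimal stand-in for a docs page: a name plus its front-matter weight.
    static final class Page {
        final String name;
        final int weight;

        Page(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    // Returns page names sorted by ascending weight (lower weight is listed first).
    static List<String> orderByWeight(List<Page> pages) {
        List<Page> sorted = new ArrayList<>(pages);
        sorted.sort(Comparator.comparingInt((Page p) -> p.weight));
        List<String> names = new ArrayList<>();
        for (Page p : sorted) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Page> pages = List.of(
            new Page("kafka-streams-group-sh", 15),       // weight after the change
            new Page("streams-rebalance-protocol", 14));  // assumed weight, not shown in the diff
        System.out.println(orderByWeight(pages));
    }
}
```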

Member

@mjsax mjsax left a comment

Thanks for the PR. LGTM.

@mjsax mjsax added the streams label Jan 16, 2026
@lucasbru lucasbru merged commit c62942d into apache:trunk Jan 19, 2026
24 checks passed
@lucasbru
Member Author

cherry-picked to 4.2

lucasbru added a commit that referenced this pull request Jan 19, 2026
…21170)


Labels

docs kip Requires or implements a KIP streams


3 participants