The rebalance monitoring feature in kfcli allows you to track consumer group rebalancing events in your Kafka cluster. Rebalancing occurs when consumers join or leave a group, or when partition assignments change, and monitoring these events is crucial for understanding consumer group behavior and debugging issues.
Rebalancing is the process by which Kafka redistributes partitions among consumers in a consumer group. This happens when:
- A new consumer joins the group
- A consumer leaves the group (gracefully or due to failure)
- The number of partitions in subscribed topics changes
- A consumer is considered dead (hasn't sent heartbeat within session timeout)
During rebalancing, consumers temporarily stop consuming messages, which can cause processing delays.
View the current state of consumer groups and detect if rebalancing is in progress:
# Show status for all consumer groups
kfcli rebalance status
# Show status for a specific consumer group
kfcli rebalance status --group my-consumer-group
# Show detailed partition assignment information
kfcli rebalance status --detailed
kfcli rebalance status --group my-consumer-group --detailed
Output Includes:
- Consumer group name
- Current state (Stable, PreparingRebalance, CompletingRebalance, etc.)
- Rebalancing indicator (✓ Stable or ⚠️ REBALANCING)
- Total number of partitions assigned
- Number of active members
- Partition distribution across members (detailed mode)
- Per-topic partition assignments (detailed mode)
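Scripts that consume this output often need only the state field. The following is a minimal sketch, assuming the `Status:` line has the form shown in this document's example output (`Status: <indicator> - <state>`). The function name `group_state` is hypothetical and not part of kfcli; adjust the parsing if the real layout differs.

```shell
# Hypothetical helper (not part of kfcli): print the state, i.e. the text
# after the last " - ", from the first "Status:" line read on stdin.
group_state() {
  awk '/^[ \t]*Status:/ { sub(/^.* - /, ""); print; exit }'
}

# Usage (requires kfcli and a reachable cluster):
#   kfcli rebalance status --group my-consumer-group | group_state
```

Reading from stdin keeps the helper usable both on live output and on saved status logs.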
Monitor consumer groups in real-time and get notified when rebalancing occurs:
# Watch all consumer groups
kfcli rebalance watch
# Watch a specific consumer group
kfcli rebalance watch --group my-consumer-group
# Watch with custom polling interval (default: 5 seconds)
kfcli rebalance watch --interval 10
# Watch specific group with custom interval
kfcli rebalance watch --group my-consumer-group --interval 3
Watch Mode Features:
- Real-time state change notifications
- Partition redistribution alerts
- Timestamp for each event
- Shows which consumers gained or lost partitions
- Visual indicators (🔄 for state changes, 📊 for distribution changes)
- Runs continuously until stopped (Ctrl+C)
═══════════════════════════════════════════════════════════
Consumer Group: my-consumer-group
Status: ✓ Stable - Stable
Total Partitions: 8
Members: 2
Partition Distribution:
consumer-1: 4 partitions
consumer-2: 4 partitions
═══════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════
Consumer Group: my-consumer-group
Status: ⚠️ REBALANCING - PreparingRebalance
Total Partitions: 8
Members: 3
═══════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════
Consumer Group: my-consumer-group
Status: ✓ Stable - Stable
Total Partitions: 8
Members: 2
Member Details:
+----------------------+------------+------------------+---------------------+
| Member ID | Client ID | Host | Assigned Partitions |
+----------------------+------------+------------------+---------------------+
| consumer-1-12345... | consumer-1 | /192.168.1.100 | 4 |
| consumer-2-67890... | consumer-2 | /192.168.1.101 | 4 |
+----------------------+------------+------------------+---------------------+
Partition Distribution:
Topic: orders
+------------+------------+
| Client ID | Partitions |
+------------+------------+
| consumer-1 | 0, 1, 2, 3 |
| consumer-2 | 4, 5, 6, 7 |
+------------+------------+
═══════════════════════════════════════════════════════════
Watching for rebalancing events... (Press Ctrl+C to stop)
[2025-10-10 14:23:15] 🔄 Group 'my-consumer-group': State changed Stable -> PreparingRebalance
⚠️ Rebalancing in progress!
[2025-10-10 14:23:22] 🔄 Group 'my-consumer-group': State changed PreparingRebalance -> Stable
✓ Rebalancing completed
[2025-10-10 14:25:30] 📊 Group 'my-consumer-group': Partition distribution changed
↑ consumer-3: 3 partitions (was 0)
↓ consumer-1: 2 partitions (was 4)
↓ consumer-2: 3 partitions (was 4)
- Stable: Normal operation, consumers are actively consuming messages
- Active: Consumers are connected and functioning normally
- PreparingRebalance: Group coordinator is preparing for rebalance
- CompletingRebalance: Rebalance is finalizing, new assignments being distributed
- Empty: Group has no active members
- Members without partition assignments (total_partitions = 0 but members > 0)
- This indicates consumers are connected but haven't received assignments yet
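The "members without assignments" condition can also be checked from a script. This sketch assumes the `Total Partitions:` and `Members:` lines appear as in the example output in this document; `has_unassigned_members` is a hypothetical name, not a kfcli command.

```shell
# Hypothetical helper (not part of kfcli): succeed (exit 0) when the status
# text on stdin reports members > 0 but total partitions = 0.
has_unassigned_members() {
  awk -F': *' '
    /^[ \t]*Total Partitions:/ { parts = $2 + 0; seen_parts = 1 }
    /^[ \t]*Members:/          { members = $2 + 0 }
    END { exit !(seen_parts && members > 0 && parts == 0) }
  '
}

# Usage (requires kfcli and a reachable cluster):
#   kfcli rebalance status --group my-group | has_unassigned_members \
#     && echo "members connected but nothing assigned yet"
```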
Check if a consumer group is stuck in rebalancing:
kfcli rebalance status --group problematic-group --detailed
Watch for partition redistribution when scaling consumers:
# In one terminal, watch the group
kfcli rebalance watch --group my-group
# In another terminal, start new consumer instances
# Watch mode will show partition redistribution
Monitor for unexpected rebalancing that might indicate consumer crashes:
kfcli rebalance watch --interval 3
If you see frequent rebalancing, it might indicate:
- Consumer instances crashing
- Network issues
- Session timeout too short
- Max poll interval exceeded
Check if partitions are evenly distributed:
kfcli rebalance status --group my-group --detailed
Look for:
- Uneven partition distribution (some consumers with many more partitions)
- Consumers without assignments
- Expected number of active members
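Uneven distribution can be quantified rather than eyeballed. The sketch below assumes the non-detailed `Partition Distribution:` lines look like `consumer-1: 4 partitions`, as in the example output in this document; `partition_spread` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of kfcli): read status text on stdin and
# print "<min> <max> <spread>" over lines such as "consumer-1: 4 partitions".
partition_spread() {
  awk '
    /: [0-9]+ partitions?[ \t]*$/ {
      n = $(NF - 1) + 0            # the count is the second-to-last field
      if (!seen || n < min) min = n
      if (!seen || n > max) max = n
      seen = 1
    }
    END { if (seen) print min, max, max - min }
  '
}

# Usage (requires kfcli and a reachable cluster):
#   kfcli rebalance status --group my-group | partition_spread
```

With a single topic and a balanced assignment the spread is normally 0 or 1; a larger value is a hint to look at the detailed output, not proof of a problem.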
Before making changes (scaling up/down, configuration updates):
# 1. Check current status
kfcli rebalance status --group my-group --detailed
# 2. Start watching
kfcli rebalance watch --group my-group
# 3. Make your changes
# 4. Observe rebalancing behavior and verify stable state
The monitoring system detects rebalancing through multiple indicators:
- State-based Detection: Checks if group state is "PreparingRebalance", "CompletingRebalance", or "Empty"
- Assignment-based Detection: Detects when members exist but have no partition assignments
- Distribution Tracking: Monitors changes in partition distribution across members
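The first two indicators can be expressed as a small predicate. This is an illustrative sketch of the logic described above, not kfcli's actual implementation; the function name and argument order are assumptions. Distribution tracking is omitted because it requires comparing two consecutive polls.

```shell
# Illustrative only (bash): mirrors the state-based and assignment-based checks.
# Args: <state> <member_count> <total_partitions>
is_rebalancing() {
  local state=$1 members=$2 partitions=$3
  case "$state" in
    # State-based detection
    PreparingRebalance|CompletingRebalance|Empty) return 0 ;;
  esac
  # Assignment-based detection: members exist but nothing is assigned
  [ "$members" -gt 0 ] && [ "$partitions" -eq 0 ]
}

# Usage:
#   is_rebalancing Stable 2 8 || echo "group looks stable"
```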
- Watch mode polls at configurable intervals (default: 5 seconds)
- Minimum recommended interval: 2 seconds (to avoid overwhelming the broker)
- Maximum recommended interval: 30 seconds (for timely detection)
- Status checks are lightweight metadata operations
{
"group_id": "my-group",
"state": "Stable",
"members": [...],
"total_partitions": 8,
"is_rebalancing": false,
"partition_distribution": {
"consumer-1": 4,
"consumer-2": 4
}
}
{
"member_id": "consumer-1-12345...",
"client_id": "consumer-1",
"host": "/192.168.1.100",
"assignments": {
"topic-1": [0, 1, 2],
"topic-2": [0]
}
}
Problem: kfcli rebalance status shows "No consumer groups found"
Solutions:
- Verify Kafka cluster is running and accessible
- Check that consumer groups exist: kfcli consumer --list
- Ensure you're connected to the correct Kafka cluster
Problem: Group constantly shows as rebalancing
Possible Causes:
- Session timeout too short: Consumers can't send heartbeats fast enough
- Max poll interval exceeded: Consumer processing takes too long
- Consumer crashes: Check consumer logs for errors
- Network issues: Intermittent connectivity problems
Solutions:
- Increase session.timeout.ms (default: 10s in clients before Kafka 3.0, 45s since; try 30s if you are on the 10s default)
- Increase max.poll.interval.ms (default: 5 minutes)
- Review consumer logs for exceptions
- Check network stability
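As a starting point, the timeout changes above could be captured as consumer property overrides. The values below are illustrative assumptions, not settings recommended by kfcli, and `consumer_overrides` is a hypothetical helper that only prints them.

```shell
# Hypothetical helper: print example consumer overrides (illustrative values).
consumer_overrides() {
  cat <<'EOF'
# Time without heartbeats before the coordinator evicts the member
# (newer Kafka clients already default to 45000)
session.timeout.ms=30000
# Kafka's documentation suggests no more than one third of session.timeout.ms
heartbeat.interval.ms=10000
# Maximum time between poll() calls; the default is 300000 (5 minutes)
max.poll.interval.ms=600000
EOF
}

# Usage: append to the properties file your consumer loads, for example
#   consumer_overrides >> consumer.properties
```

Raising these timeouts makes the group slower to notice a consumer that has really died, so increase them only as far as the observed processing time requires.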
Problem: Some consumers have significantly more partitions than others
Causes:
- Consumers joined at different times
- Custom partition assignment strategy
- Number of partitions not divisible by number of consumers
Solutions:
- Wait for next rebalance (will usually even out)
- Use Range or RoundRobin assignment strategy
- Adjust number of partitions or consumers
Problem: Rebalancing happened but wasn't detected in watch mode
Causes:
- Polling interval too long
- Rebalance completed between polls
- Network delay
Solutions:
- Decrease polling interval: --interval 2
- Check status immediately after suspected rebalance
- Review Kafka broker logs
Set up periodic status checks to catch issues early:
# Add to monitoring scripts
# (append with >> so earlier checks are not overwritten)
kfcli rebalance status >> /var/log/kafka-rebalance-status.log
If a group is rebalancing for longer than expected:
# Check every minute, alert if rebalancing > 5 minutes
# (Implement in monitoring system)
When rolling out consumer changes:
kfcli rebalance watch --group production-consumers --interval 3
Get complete picture:
# Check consumer group details
kfcli consumer --consumer my-group --pending
# Check rebalance status
kfcli rebalance status --group my-group --detailed
# Check topic details
kfcli topics details --topic my-topic
Record normal rebalancing patterns:
- How long rebalancing typically takes
- Expected partition distribution
- Number of members
- Use as baseline for detecting anomalies
- No Historical Storage: Events are only tracked during watch mode execution
- Polling-Based: Not real-time event streaming (depends on polling interval)
- Memory-Based Tracking: State comparison is in-memory only
- No Rebalance Metrics: Duration and frequency not calculated
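Although kfcli does not calculate duration, it can be approximated from a saved watch log. The sketch below assumes log lines in the format shown in this document's watch-mode example (`[date time] 🔄 Group '<name>': State changed <from> -> <to>`), group names without whitespace, and one-second timestamp resolution. `rebalance_durations` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of kfcli): read a watch log on stdin and
# print "<group> <seconds>" for each Stable -> ... -> Stable cycle.
rebalance_durations() {
  awk '
    /State changed/ {
      t = $2; sub(/\]$/, "", t)              # "14:23:15]" -> "14:23:15"
      split(t, hms, ":")
      secs = hms[1] * 3600 + hms[2] * 60 + hms[3]
      g = substr($5, 2, length($5) - 3)      # strip the quotes and colon
      from = $(NF - 2); to = $NF
      if (from == "Stable") {
        start[g] = secs                      # the group left the Stable state
      } else if (to == "Stable" && (g in start)) {
        d = secs - start[g]
        if (d < 0) d += 86400                # cycle crossed midnight
        printf "%s %d\n", g, d
        delete start[g]
      }
    }
  '
}

# Usage:
#   rebalance_durations < /var/log/rebalance-watch.log
```

Because watch mode polls, each measured duration is accurate only to within roughly one polling interval.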
Planned improvements (not yet implemented):
- Persistent event storage
- Rebalance duration tracking
- Frequency analysis
- Alert thresholds
- JSON output format
- Integration with monitoring systems
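Until alert thresholds are built in, a "rebalancing for too long" check can be approximated with a polling loop. This is a minimal sketch, assuming the status output contains the word REBALANCING while a rebalance is in progress (as in the example output in this document). The function name, its arguments, and the alert message are hypothetical.

```shell
# Hypothetical watchdog (bash, not part of kfcli): poll the group status and
# print an alert once it has shown REBALANCING for max_polls checks in a row.
# Args: <group> [max_polls=5] [pause_seconds=60]
alert_if_stuck() {
  local group=$1 max_polls=${2:-5} pause=${3:-60} streak=0
  while :; do
    if kfcli rebalance status --group "$group" 2>&1 | grep -q "REBALANCING"; then
      streak=$((streak + 1))
      if [ "$streak" -ge "$max_polls" ]; then
        echo "ALERT: $group has been rebalancing for $streak consecutive checks"
        return 1
      fi
    else
      streak=0   # the group recovered; start counting again
    fi
    sleep "$pause"
  done
}

# Usage: five checks one minute apart, roughly the 5-minute threshold
#   alert_if_stuck my-group 5 60
```

The loop runs until the threshold is hit, so it is best started from a supervisor or scheduler that can forward the alert line to your monitoring system.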
For Prometheus monitoring, use the metrics command alongside rebalance monitoring:
kfcli metrics --format prometheus
Example monitoring script:
#!/bin/bash
# Check for rebalancing and alert if detected
STATUS=$(kfcli rebalance status --group my-group 2>&1)
if echo "$STATUS" | grep -q "REBALANCING"; then
echo "ALERT: Consumer group my-group is rebalancing"
# Send alert to your monitoring system
fi
Run watch mode as a background service:
nohup kfcli rebalance watch --group my-group --interval 5 > /var/log/rebalance-watch.log 2>&1 &
- Consumer Group Management: kfcli consumer --help
- Cluster Metrics: kfcli metrics --help
- Topic Details: kfcli topics details --help
For more information and updates, see the main README and TASKS.md files.