[ISSUE #6594]🚀Add lite pull consumer with configuration validation and graceful shutdown#6595
[ISSUE #6594]🚀Add lite pull consumer with configuration validation and graceful shutdown#6595rocketmq-rust-bot merged 1 commit intomainfrom
Conversation
…d graceful shutdown
|
🔊@mxsm 🚀Thanks for your contribution🎉! 💡CodeRabbit(AI) will review your code first🔥! Note 🚨The code review suggestions from CodeRabbit are to be used as a reference only, and the PR submitter can decide whether to make changes based on their own judgment. Ultimately, the project management personnel will conduct the final code review💥. |
WalkthroughAdded lifecycle and configuration validation methods ( Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Consumer as DefaultLitePullConsumerImpl
participant Config as Configuration
participant MQClient as MQ Client
participant Rebalance as Rebalance Manager
participant OffsetStore as Offset Store
User->>Consumer: start()
Consumer->>Config: check_config()
Config-->>Consumer: validation passed
Consumer->>MQClient: initialize
MQClient-->>Consumer: initialized
alt MessageModel == Broadcasting
Consumer->>OffsetStore: create LocalFileOffsetStore
else MessageModel == Clustering
Consumer->>OffsetStore: create RemoteBrokerOffsetStore
end
Consumer->>OffsetStore: load offsets
Consumer->>Rebalance: do_rebalance
Rebalance-->>Consumer: rebalanced
Consumer->>Consumer: state = Running
Consumer-->>User: started successfully
User->>Consumer: shutdown()
Consumer->>Consumer: signal shutdown
Consumer->>Consumer: await pull tasks
Consumer->>OffsetStore: persist offsets
OffsetStore-->>Consumer: persisted
Consumer->>MQClient: unregister and shutdown
MQClient-->>Consumer: shutdown complete
Consumer->>Consumer: state = ShutdownAlready
Consumer-->>User: shutdown complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs (1)
515-535:⚠️ Potential issue | 🟠 MajorMultiple
unimplemented!()macros pose runtime panic risks.These trait methods contain
unimplemented!()which will panic if invoked:
Method Risk do_rebalance(line 518)Panics when SubscriptionType::Subscribeis usedpersist_consumer_offset(line 529)Called during shutdown - will panic update_topic_subscribe_info(line 534)May be invoked by MQClientInstanceduring topic updatesConsider either:
- Implementing these methods before merging
- Replacing
unimplemented!()withtodo!()macros for clearer intent- Adding early runtime checks to prevent users from triggering these code paths
Would you like me to help implement these methods or create tracking issues for them?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs` around lines 515 - 535, The methods do_rebalance, persist_consumer_offset, and update_topic_subscribe_info currently use unimplemented!() which will panic at runtime; replace those panics with safe, non-panicking implementations or guarded early-returns: in do_rebalance() (and thus try_rebalance()) check the subscription type (SubscriptionType::Subscribe) and delegate to the actual rebalance implementation or perform a no-op with a debug/warn log instead of panicking; in persist_consumer_offset() make it a safe no-op or persist offsets via the existing commit logic while guarding with any shutdown flag if present so it won’t panic during shutdown; in update_topic_subscribe_info(topic, info) apply the topic updates via the rebalance/topic management path or log and return if MQClientInstance is not ready—do not use unimplemented!(), use todo!() only if you intend an immediate visible placeholder, but prefer safe no-op/delegation to avoid runtime panics.
🧹 Nitpick comments (1)
rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs (1)
448-455: Cumulative task timeout could cause extended shutdown delays.Each pull task gets an individual 5-second timeout. With many message queues assigned, shutdown time could become excessive (e.g., 100 queues × 5s = 500s worst case).
Consider using
tokio::time::timeoutaround the entire drain loop with a global shutdown deadline, or aborting tasks that exceed the timeout:♻️ Suggested improvement with global timeout and task abort
// Wait for all pull tasks to complete (5s timeout) + let shutdown_deadline = tokio::time::Instant::now() + Duration::from_secs(30); let mut handles = self.task_handles.write().await; for (mq, handle) in handles.drain() { - if let Err(e) = tokio::time::timeout(Duration::from_secs(5), handle).await { - warn!("Pull task for {:?} did not finish in time: {}", mq, e); + let remaining = shutdown_deadline.saturating_duration_since(tokio::time::Instant::now()); + if remaining.is_zero() { + handle.abort(); + warn!("Aborting pull task for {:?} - shutdown deadline exceeded", mq); + } else if let Err(e) = tokio::time::timeout(remaining, &handle).await { + handle.abort(); + warn!("Pull task for {:?} did not finish in time: {}", mq, e); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs` around lines 448 - 455, The shutdown loop currently applies tokio::time::timeout per task via self.task_handles.write().await and handles.drain(), which can cumulatively delay shutdown; wrap the entire drain/await sequence in a single global timeout (e.g., call tokio::time::timeout(Duration::from_secs(N), async { for (mq, handle) in handles.drain() { let _ = handle.await; } }) around the drain) or, alternatively, after a short global wait iterate remaining handles and call handle.abort() to force-stop tasks (use the task_handles write lock and the same handle identifiers) so that task_handles, handle.await/join, and mq are all handled under one global deadline rather than N separate 5s timeouts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs`:
- Line 457: The shutdown path calls self.persist_consumer_offset().await but
MQConsumerInner's persist_consumer_offset is currently unimplemented
(unimplemented!("persist_consumer_offset")), which will panic at runtime; either
implement persist_consumer_offset to flush offsets from the consumer's
offset_store (use the existing offset_store field/API to collect and persist
offsets, update any remote broker/store and handle errors) or change the
shutdown call site to conditionally skip or safely await a no-op when
persistence is not available (e.g., replace the direct call with a safe wrapper
like try_persist_consumer_offset or check a flag/Option before awaiting) so
shutdown won't call the unimplemented method; focus changes on the
persist_consumer_offset implementation in the MQConsumerInner impl and the
shutdown caller where self.persist_consumer_offset().await is invoked.
- Around line 410-414: The consumer isn't registered before starting the client
instance and uses redundant unwraps; update the startup sequence in
default_lite_pull_consumer_impl to call client_instance.register_consumer(group,
MQConsumerInnerImpl { ... }).await (mirroring default_mq_push_consumer_impl)
before calling client_instance.start(...). Also simplify the unwrap pattern by
taking a single mutable reference or cloning the client_instance once (reuse the
existing cloned variable) and pass that to start rather than calling
as_mut().cloned().unwrap() and as_mut().unwrap() twice.
---
Outside diff comments:
In
`@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs`:
- Around line 515-535: The methods do_rebalance, persist_consumer_offset, and
update_topic_subscribe_info currently use unimplemented!() which will panic at
runtime; replace those panics with safe, non-panicking implementations or
guarded early-returns: in do_rebalance() (and thus try_rebalance()) check the
subscription type (SubscriptionType::Subscribe) and delegate to the actual
rebalance implementation or perform a no-op with a debug/warn log instead of
panicking; in persist_consumer_offset() make it a safe no-op or persist offsets
via the existing commit logic while guarding with any shutdown flag if present
so it won’t panic during shutdown; in update_topic_subscribe_info(topic, info)
apply the topic updates via the rebalance/topic management path or log and
return if MQClientInstance is not ready—do not use unimplemented!(), use todo!()
only if you intend an immediate visible placeholder, but prefer safe
no-op/delegation to avoid runtime panics.
---
Nitpick comments:
In
`@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs`:
- Around line 448-455: The shutdown loop currently applies tokio::time::timeout
per task via self.task_handles.write().await and handles.drain(), which can
cumulatively delay shutdown; wrap the entire drain/await sequence in a single
global timeout (e.g., call tokio::time::timeout(Duration::from_secs(N), async {
for (mq, handle) in handles.drain() { let _ = handle.await; } }) around the
drain) or, alternatively, after a short global wait iterate remaining handles
and call handle.abort() to force-stop tasks (use the task_handles write lock and
the same handle identifiers) so that task_handles, handle.await/join, and mq are
all handled under one global deadline rather than N separate 5s timeouts.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rsrocketmq-client/src/consumer/consumer_impl/re_balance/rebalance_lite_pull_impl.rs
|
|
||
| // Consumer registration integrated through MQClientInstance rebalance mechanism | ||
|
|
||
| let cloned = self.client_instance.as_mut().cloned().unwrap(); | ||
| self.client_instance.as_mut().unwrap().start(cloned).await?; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for register_consumer patterns
echo "=== Searching for register_consumer patterns ==="
rg -i "register.*consumer" --type rust -n -B2 -A2
echo ""
echo "=== Searching for how client_instance is used ==="
rg "client_instance.*start" --type rust -n -B2 -A2
echo ""
echo "=== Checking MQClientInstance structure ==="
fd ".*client_instance.*" --type rust | head -20Repository: mxsm/rocketmq-rust
Length of output: 20664
🏁 Script executed:
#!/bin/bash
# Look at the full context of the start method
echo "=== Examining default_lite_pull_consumer_impl.rs around lines 410-414 ==="
wc -l rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rsRepository: mxsm/rocketmq-rust
Length of output: 217
🏁 Script executed:
#!/bin/bash
# Get more context from the start method
echo "=== Reading the start method context ==="
sed -n '350,435p' rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs | cat -nRepository: mxsm/rocketmq-rust
Length of output: 4664
🏁 Script executed:
#!/bin/bash
# Search for rebalance_impl registration or consumer registration
echo "=== Searching for rebalance setup or consumer group registration ==="
rg "rebalance_impl.*register\|register.*consumer\|consumer_group" rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs -n -B1 -A1Repository: mxsm/rocketmq-rust
Length of output: 130
🏁 Script executed:
#!/bin/bash
# Check MQClientInstance to see what start method expects
echo "=== Checking MQClientInstance definition ==="
fd "client_instance.rs\|client_impl.rs" --type rust | grep -v test
echo ""
echo "=== Looking for MQClientInstance start method signature ==="
rg "fn start\(" --type rust -B2 -A8 | grep -A8 "MQClientInstance"Repository: mxsm/rocketmq-rust
Length of output: 2321
🏁 Script executed:
#!/bin/bash
# Check if rebalance automatically registers consumer
echo "=== Checking RebalanceImpl for consumer registration ==="
rg "register_consumer" rocketmq-client/src/consumer/consumer_impl/re_balance/ -n -B2 -A2
echo ""
echo "=== Checking MQClientInstance start method ==="
sed -n '1,150p' rocketmq-client/src/factory/mq_client_instance.rs | grep -n "pub async fn start" -A20Repository: mxsm/rocketmq-rust
Length of output: 166
🏁 Script executed:
#!/bin/bash
# Compare what happens in default_mq_push_consumer_impl around registration
echo "=== Push consumer registration pattern ==="
sed -n '300,330p' rocketmq-client/src/consumer/consumer_impl/default_mq_push_consumer_impl.rs | cat -nRepository: mxsm/rocketmq-rust
Length of output: 1762
Consumer registration is missing before starting the client instance.
The lite pull consumer should call register_consumer() before start(), consistent with the push consumer implementation. Without explicit registration, the consumer won't be properly tracked by the broker for rebalancing and coordination.
The pattern should match default_mq_push_consumer_impl which calls:
client_instance.register_consumer(group, MQConsumerInnerImpl { ... }).await;
client_instance.start(cloned).await?;
Additionally, the double unwrap pattern on lines 413-414 can be simplified to reduce redundant calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs`
around lines 410 - 414, The consumer isn't registered before starting the client
instance and uses redundant unwraps; update the startup sequence in
default_lite_pull_consumer_impl to call client_instance.register_consumer(group,
MQConsumerInnerImpl { ... }).await (mirroring default_mq_push_consumer_impl)
before calling client_instance.start(...). Also simplify the unwrap pattern by
taking a single mutable reference or cloning the client_instance once (reuse the
existing cloned variable) and pass that to start rather than calling
as_mut().cloned().unwrap() and as_mut().unwrap() twice.
| } | ||
| drop(handles); | ||
|
|
||
| self.persist_consumer_offset().await; |
There was a problem hiding this comment.
Shutdown will panic due to calling unimplemented persist_consumer_offset.
Line 457 calls self.persist_consumer_offset().await, but the MQConsumerInner trait implementation at line 529 contains unimplemented!("persist_consumer_offset"). This will cause a runtime panic during shutdown, preventing graceful termination and potentially losing uncommitted offsets.
🐛 Proposed fix: implement offset persistence or guard the call
Option 1: Implement the method using the stored offset_store:
async fn persist_consumer_offset(&self) {
- // Offset persistence handled by commit operations
- unimplemented!("persist_consumer_offset")
+ if let Some(offset_store) = &self.offset_store {
+ let mqs: Vec<MessageQueue> = self.assigned_message_queue.get_assigned_message_queues();
+ offset_store.persist_all(&mqs).await;
+ }
}Option 2: Guard the shutdown call until implementation is ready:
- self.persist_consumer_offset().await;
+ if let Some(offset_store) = &self.offset_store {
+ // TODO: Implement proper offset persistence
+ let _ = offset_store;
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| self.persist_consumer_offset().await; | |
| if let Some(offset_store) = &self.offset_store { | |
| // TODO: Implement proper offset persistence | |
| let _ = offset_store; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@rocketmq-client/src/consumer/consumer_impl/default_lite_pull_consumer_impl.rs`
at line 457, The shutdown path calls self.persist_consumer_offset().await but
MQConsumerInner's persist_consumer_offset is currently unimplemented
(unimplemented!("persist_consumer_offset")), which will panic at runtime; either
implement persist_consumer_offset to flush offsets from the consumer's
offset_store (use the existing offset_store field/API to collect and persist
offsets, update any remote broker/store and handle errors) or change the
shutdown call site to conditionally skip or safely await a no-op when
persistence is not available (e.g., replace the direct call with a safe wrapper
like try_persist_consumer_offset or check a flag/Option before awaiting) so
shutdown won't call the unimplemented method; focus changes on the
persist_consumer_offset implementation in the MQConsumerInner impl and the
shutdown caller where self.persist_consumer_offset().await is invoked.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6595 +/- ##
==========================================
- Coverage 41.77% 41.74% -0.04%
==========================================
Files 959 959
Lines 133824 133927 +103
==========================================
- Hits 55911 55908 -3
- Misses 77913 78019 +106 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Which Issue(s) This PR Fixes(Closes)
Brief Description
How Did You Test This Change?
Summary by CodeRabbit
New Features
start()andshutdown()lifecycle methods to properly manage DefaultLitePullConsumer initialization and cleanup.Improvements