Introduce QUIC connection lifecycle tracing and dead connection pruning#50
Conversation
Log close_reason when detecting dead connections on client side and when server-side stream accept loop terminates. Log when server replaces a validator connection during re-handshake.
Investigate systematic QUIC synapse query failures by adding diagnostic tracing to send_synapse_packet and server dispatch paths. Resolve stale connections persisting in the registry by pruning dead connections during update_miner_registry, and report actual connection health status in get_connection_stats.
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughThe changes add comprehensive debug and info-level logging across client and server components to improve observability of connection lifecycle events and query operations. Additionally, dead connection pruning logic is introduced in the miner registry update path, and new integration tests verify query execution under various registry update and connectivity scenarios. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/btlightning/tests/integration.rs (1)
1993-2023: Replace fixed delay with condition-based waiting in this test.Line 2013 uses an unconditional 5s sleep, which adds avoidable test latency and can still be timing-sensitive. Prefer a bounded poll loop that exits as soon as the condition is met.
♻️ Proposed test refactor
- tokio::time::sleep(Duration::from_secs(5)).await; + tokio::time::timeout(Duration::from_secs(5), async { + loop { + let stats = client.get_connection_stats().await.unwrap(); + if stats.get("total_connections").is_some_and(|v| v == "1") { + break; + } + tokio::time::sleep(Duration::from_millis(50)).await; + } + }) + .await + .expect("connection should become active within 5s");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/btlightning/tests/integration.rs` around lines 1993 - 2023, In the test query_works_via_update_miner_registry_after_delay replace the unconditional tokio::time::sleep(Duration::from_secs(5)).await with a bounded poll loop that repeatedly attempts the condition (e.g., call client.query_axon_with_timeout(axon.clone(), build_request("echo"), short_timeout) or another lightweight probe) until it succeeds or a total timeout elapses; use Instant::now() + Duration::from_secs(N) as the deadline, sleep a short interval (e.g., 50–200ms) between attempts, and fail the test if the deadline is reached—this keeps the call to update_miner_registry and the axon/query flow (update_miner_registry, query_axon_with_timeout, build_request, Duration) but removes the fixed 5s delay.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/btlightning/tests/integration.rs`:
- Around line 1993-2023: In the test
query_works_via_update_miner_registry_after_delay replace the unconditional
tokio::time::sleep(Duration::from_secs(5)).await with a bounded poll loop that
repeatedly attempts the condition (e.g., call
client.query_axon_with_timeout(axon.clone(), build_request("echo"),
short_timeout) or another lightweight probe) until it succeeds or a total
timeout elapses; use Instant::now() + Duration::from_secs(N) as the deadline,
sleep a short interval (e.g., 50–200ms) between attempts, and fail the test if
the deadline is reached—this keeps the call to update_miner_registry and the
axon/query flow (update_miner_registry, query_axon_with_timeout, build_request,
Duration) but removes the fixed 5s delay.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3c50849b-c81f-4994-b44e-afb27fcfe107
📒 Files selected for processing (4)
crates/btlightning/src/client.rscrates/btlightning/src/server/dispatch.rscrates/btlightning/src/server/handshake.rscrates/btlightning/tests/integration.rs
Eliminate field_reassign_with_default in client config tests using struct update syntax. Replace unnecessary_get_then_check in registry with contains_key. Reorder signing.rs to place feature-gated items before test module. Replace unconditional 5s sleep in integration test with bounded poll loop that retries at 250ms intervals with a 10s deadline.
760d9cf to
09fabda
Compare
Summary
update_miner_registry_innerby checkingclose_reason()before retainingget_connection_stats()by checkingclose_reason()instead of always reporting "active"Test plan
cargo test— 4 new integration tests passHttpOnlyfor unreachable minersSummary by CodeRabbit
Bug Fixes
Tests
Chores