Skip to content

feat: add velo-transport grpc impl#7396

Open
ryanolson wants to merge 3 commits intomainfrom
ryan/velo-grpc
Open

feat: add velo-transport grpc impl#7396
ryanolson wants to merge 3 commits intomainfrom
ryan/velo-grpc

Conversation

@ryanolson
Copy link
Contributor

@ryanolson ryanolson commented Mar 15, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Added gRPC transport support enabling bidirectional streaming communication with built-in connection management, health checks, and graceful shutdown handling.

Signed-off-by: Ryan Olson <rolson@nvidia.com>
@ryanolson ryanolson requested a review from a team as a code owner March 15, 2026 07:37
@copy-pr-bot
Copy link

copy-pr-bot bot commented Mar 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ryanolson
Copy link
Contributor Author

/ok to test 3cda883

@ryanolson ryanolson requested a review from cheese-head March 15, 2026 07:38
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 15, 2026

Walkthrough

A new gRPC transport implementation was introduced alongside supporting infrastructure, including server and client components, a utility address resolver, and comprehensive test refactoring with parameterized test macros for cross-transport coverage.

Changes

Cohort / File(s) Summary
Build Configuration
lib/velo-transports/Cargo.toml, lib/velo-transports/build.rs
Added paste dev-dependency and updated gRPC protobuf compilation to use tonic_build::configure() with explicit bytes mapping for FramedData.
Core gRPC Implementation
lib/velo-transports/src/grpc/mod.rs, lib/velo-transports/src/grpc/server.rs, lib/velo-transports/src/grpc/transport.rs
Introduced new gRPC transport module with bidirectional streaming server (VeloStreamingService), full client-side transport (GrpcTransport), and builder pattern with connection management, health checks, and lifecycle control.
Library Module Setup
lib/velo-transports/src/lib.rs
Enabled gRPC module behind feature flag and added new public utils module.
Utility Functions
lib/velo-transports/src/utils/mod.rs
Added resolve_advertise_address() function to map wildcard bind addresses to routable addresses for advertisement, with IPv4/IPv6 support.
Test Infrastructure & Macros
lib/velo-transports/tests/common/mod.rs, lib/velo-transports/tests/common/shutdown_scenarios.rs
Introduced ShutdownTestClient trait and implementations (TcpShutdownClient, UdsShutdownClient, GrpcShutdownClient), factory pattern for transports, and parameterized test helper functions covering drain behavior, lifecycle, and teardown scenarios.
Integration Test Refactoring
lib/velo-transports/tests/tcp_integration.rs, lib/velo-transports/tests/uds_integration.rs, lib/velo-transports/tests/grpc_integration.rs, lib/velo-transports/tests/tcp_shutdown.rs, lib/velo-transports/tests/uds_shutdown.rs
Consolidated individual test functions into macro-driven test generation for TCP, UDS, and gRPC transports, replacing ~560 lines of explicit test code with parameterized macro invocations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A gRPC tunnel we've woven with care,
Streaming frames bidirectional through the air!
Shutdown states tested, connections refined,
Transports now speak in a language aligned—
From TCP to gRPC, the tests now align! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request description is completely empty with no content provided, missing all required sections from the template (Overview, Details, Where should the reviewer start, Related Issues). Provide a complete pull request description following the template: include an overview, detailed explanation of changes, specific files for review, and any related GitHub issues.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add velo-transport grpc impl' clearly and specifically describes the main change: adding a gRPC implementation to the velo-transport library.
Docstring Coverage ✅ Passed Docstring coverage is 81.82% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can customize the tone of the review comments and chat replies.

Configure the tone_instructions setting to customize the tone of the review comments and chat replies. For example, you can set the tone to Act like a strict teacher, Act like a pirate and more.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
lib/velo-transports/src/grpc/server.rs (2)

74-84: Consider avoiding preamble build failure silently continuing.

When TcpFrameCodec::build_preamble fails (Lines 74-84), the code logs a warning and continues to the next frame. However, the client that sent the Message frame never receives a rejection response—it will timeout waiting for a response.

This is an edge case (preamble build should not fail for valid header lengths), but for completeness, consider whether the connection should be terminated on this internal error.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/velo-transports/src/grpc/server.rs` around lines 74 - 84, The code
currently logs and continues when
TcpFrameCodec::build_preamble(MessageType::ShuttingDown,
framed_data.header.len() as u32, 0) fails, which leaves the client waiting;
change the control flow in the loop that handles framed_data in server.rs so
that on Err(e) you either (a) attempt to send an explicit error response to the
client (construct an appropriate failure frame or use an existing error path) or
(b) close the connection/terminate the handling loop—do this by referencing
TcpFrameCodec::build_preamble, MessageType::ShuttingDown, and framed_data to
locate the failure branch and replace the warn! + continue with logic that sends
a rejection or initiates connection shutdown (and ensure any send errors are
logged/handled).

59-117: Spawned task lacks explicit lifecycle tracking.

The background task spawned at Line 59 runs independently with its JoinHandle dropped. While this is consistent with TCP/UDS listener patterns in the codebase and the task will naturally exit when the inbound stream ends, consider whether cancellation via teardown_token should be observed here for consistent shutdown semantics.

Currently, if teardown_token is cancelled, the tonic server stops accepting new connections, but existing stream handlers continue until the client disconnects or a send fails.

Optional: Add cancellation token observation
 tokio::spawn(async move {
-    while let Ok(Some(framed_data)) = inbound.message().await {
+    loop {
+        let framed_data = tokio::select! {
+            _ = shutdown_state.teardown_token().cancelled() => break,
+            res = inbound.message() => match res {
+                Ok(Some(data)) => data,
+                Ok(None) | Err(_) => break,
+            },
+        };
         // ... rest of processing
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/velo-transports/src/grpc/server.rs` around lines 59 - 117, The spawned
background task that loops on inbound.message().await should observe the
teardown_token cancellation so it can exit promptly during shutdown; modify the
tokio::spawn closure (the loop that currently awaits inbound.message().await) to
use tokio::select! between inbound.message().await and
teardown_token.cancelled() (or the project’s equivalent cancellation future),
and on cancellation either break the loop or send a ShuttingDown framed response
(reusing TcpFrameCodec::build_preamble and response_tx) before exiting; update
references inside the loop (the handling around MessageType checks, response_tx
send, and routing to adapter.message_stream/response_stream/event_stream) to run
only when inbound yields a frame.
lib/velo-transports/src/grpc/transport.rs (1)

442-477: Orphaned reader task on connection close.

The reader task spawned at Line 443 has its JoinHandle dropped, so it runs independently. When the main send loop exits (Lines 480-519), the reader task continues until the response stream errors or ends naturally.

This is acceptable behavior since the reader will exit when response_stream.message() returns Ok(None) or an error, but the connection cleanup at Lines 389-401 won't wait for the reader to finish. Consider whether this could cause issues with resource cleanup timing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/velo-transports/src/grpc/transport.rs` around lines 442 - 477, The
spawned reader task created around tokio::spawn for response_stream is orphaned
(its JoinHandle is dropped) and can outlive connection cleanup; capture the
JoinHandle and ensure it is aborted or awaited during shutdown to avoid
resource/timing issues. Modify the spawn site to save the JoinHandle (e.g., let
handle = tokio::spawn(...);) into a field on the adapter or connection struct
(e.g., adapter.reader_handle or connection.reader_handle), and update the
connection cleanup logic to call handle.abort() (and optionally await
handle.await) or join it to ensure the task is stopped before completing
cleanup; refer to symbols response_stream, adapter, tokio::spawn, and the
connection cleanup routine to implement this.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/velo-transports/src/utils/mod.rs`:
- Around line 13-35: The current resolve_advertise_address function converts
wildcard binds (0.0.0.0 / ::) to loopback which causes incorrect advertised
endpoints; change resolve_advertise_address to return the original wildcard
bind_addr when ip().is_unspecified() instead of mapping to LOCALHOST, and update
callers (notably GrpcTransportBuilder) to require and pass an
explicit_advertise_addr for cases where a non-wildcard advertised address is
needed; specifically, remove the 0.0.0.0/:: -> localhost mapping in
resolve_advertise_address and add logic in GrpcTransportBuilder to accept an
explicit_advertise_addr parameter and use it when bind_addr is unspecified.

In `@lib/velo-transports/tests/common/mod.rs`:
- Around line 667-715: Remove the broken and unused gRPC shutdown test by
deleting the GrpcShutdownClient type and its ShutdownTestClient impl (including
connect_and_send_frame and read_one_frame) because it attempts to send raw
TcpFrameCodec frames to a tonic HTTP/2 server; alternatively, if you want gRPC
shutdown coverage implement a proper gRPC client-based test using the real bidi
client (e.g., VeloStreamingClient) and TestTransportHandle::new_grpc to open an
HTTP/2 stream rather than using TcpFrameCodec.

---

Nitpick comments:
In `@lib/velo-transports/src/grpc/server.rs`:
- Around line 74-84: The code currently logs and continues when
TcpFrameCodec::build_preamble(MessageType::ShuttingDown,
framed_data.header.len() as u32, 0) fails, which leaves the client waiting;
change the control flow in the loop that handles framed_data in server.rs so
that on Err(e) you either (a) attempt to send an explicit error response to the
client (construct an appropriate failure frame or use an existing error path) or
(b) close the connection/terminate the handling loop—do this by referencing
TcpFrameCodec::build_preamble, MessageType::ShuttingDown, and framed_data to
locate the failure branch and replace the warn! + continue with logic that sends
a rejection or initiates connection shutdown (and ensure any send errors are
logged/handled).
- Around line 59-117: The spawned background task that loops on
inbound.message().await should observe the teardown_token cancellation so it can
exit promptly during shutdown; modify the tokio::spawn closure (the loop that
currently awaits inbound.message().await) to use tokio::select! between
inbound.message().await and teardown_token.cancelled() (or the project’s
equivalent cancellation future), and on cancellation either break the loop or
send a ShuttingDown framed response (reusing TcpFrameCodec::build_preamble and
response_tx) before exiting; update references inside the loop (the handling
around MessageType checks, response_tx send, and routing to
adapter.message_stream/response_stream/event_stream) to run only when inbound
yields a frame.

In `@lib/velo-transports/src/grpc/transport.rs`:
- Around line 442-477: The spawned reader task created around tokio::spawn for
response_stream is orphaned (its JoinHandle is dropped) and can outlive
connection cleanup; capture the JoinHandle and ensure it is aborted or awaited
during shutdown to avoid resource/timing issues. Modify the spawn site to save
the JoinHandle (e.g., let handle = tokio::spawn(...);) into a field on the
adapter or connection struct (e.g., adapter.reader_handle or
connection.reader_handle), and update the connection cleanup logic to call
handle.abort() (and optionally await handle.await) or join it to ensure the task
is stopped before completing cleanup; refer to symbols response_stream, adapter,
tokio::spawn, and the connection cleanup routine to implement this.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f4f74ccd-6842-457a-ba4c-d3eb28db3f77

📥 Commits

Reviewing files that changed from the base of the PR and between 9e9ca3e and 3cda883.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • lib/velo-transports/Cargo.toml
  • lib/velo-transports/build.rs
  • lib/velo-transports/src/grpc/mod.rs
  • lib/velo-transports/src/grpc/server.rs
  • lib/velo-transports/src/grpc/transport.rs
  • lib/velo-transports/src/lib.rs
  • lib/velo-transports/src/utils/mod.rs
  • lib/velo-transports/tests/common/mod.rs
  • lib/velo-transports/tests/common/shutdown_scenarios.rs
  • lib/velo-transports/tests/grpc_integration.rs
  • lib/velo-transports/tests/tcp_integration.rs
  • lib/velo-transports/tests/tcp_shutdown.rs
  • lib/velo-transports/tests/uds_integration.rs
  • lib/velo-transports/tests/uds_shutdown.rs

Signed-off-by: Ryan Olson <rolson@nvidia.com>
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Mar 16, 2026
Signed-off-by: Ryan Olson <rolson@nvidia.com>
@ryanolson
Copy link
Contributor Author

/ok to test 845447d

@github-actions
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation feat size/XXL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant