Skip to content

Conversation

@milesial
Copy link
Contributor

@milesial milesial commented Oct 30, 2025

Overview:

See #3630 for context, this MR is a subset of it, focusing on NIXL metadata passing, building on top of #3971.

image 507323054-8f7b5508-af4e-48c0-899e-609ef80b879c

Summary by CodeRabbit

  • New Features

    • Integrated RDMA support for improved media data handling and memory management.
    • Added NIXL metadata integration for enhanced multimedia content processing.
  • Improvements

    • Enhanced serialization support for image data with additional format options.
    • Refined media processing pipeline for better performance and reliability.

@milesial milesial requested a review from a team as a code owner October 30, 2025 16:23
@github-actions github-actions bot added the feat label Oct 30, 2025
@milesial milesial force-pushed the alexandrem/frontend-image-decoding branch from cbe3353 to 5be5da0 Compare October 30, 2025 19:10
@milesial milesial force-pushed the alexandrem/frontend-image-decoding branch 6 times, most recently from 6895b84 to eb75d6f Compare November 3, 2025 19:44
@rmccorm4
Copy link
Contributor

rmccorm4 commented Nov 4, 2025

@milesial please fix the merge conflicts

@milesial milesial force-pushed the alexandrem/frontend-image-decoding-nixl branch from 6a0920b to f19f157 Compare November 4, 2025 00:36
@milesial
Copy link
Contributor Author

milesial commented Nov 4, 2025

@milesial please fix the merge conflicts

Yes just rebased, diff looks nicer now

description = "Dynamo LLM Library"

[features]
default = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change making block manager a default build feature intended for this PR?

CC @nv-anants @jthomson04 @ziqifan617

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes because the frontend now uses their abstractions. I've heard there is some modularization coming up for that part but until then I hope it's ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think that'll break things for most people not building inside the container.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I wait for #4012 which is enabled by default if I understand correctly?
Who is building outside the container? At some point we'll need to have things like NIXL available during build anyway if we want to have NIXL features in the frontend

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with that MR, that'll remove the kvbm dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use that v2 memory crate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like CI fails because it can't find NIXL :/ But locally it works

nixl.h: No such file or directory

@milesial milesial force-pushed the alexandrem/frontend-image-decoding branch from 4dbec11 to b888111 Compare November 4, 2025 04:36
@milesial milesial requested a review from a team as a code owner November 4, 2025 04:36
@milesial milesial force-pushed the alexandrem/frontend-image-decoding-nixl branch from f19f157 to ef274b4 Compare November 4, 2025 08:37
@milesial milesial force-pushed the alexandrem/frontend-image-decoding branch from b888111 to bed5bd0 Compare November 4, 2025 18:03
@milesial milesial force-pushed the alexandrem/frontend-image-decoding-nixl branch from ef274b4 to f09816e Compare November 4, 2025 18:04
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Nov 4, 2025
@milesial milesial force-pushed the alexandrem/frontend-image-decoding-nixl branch from a622756 to bb40dae Compare November 4, 2025 22:42
@pull-request-size pull-request-size bot added size/L and removed size/XL labels Nov 4, 2025
@milesial milesial force-pushed the alexandrem/frontend-image-decoding-nixl branch from bb40dae to 1b1882c Compare November 4, 2025 22:44
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 4, 2025

Walkthrough

This pull request introduces RDMA and NIXL integration for media data handling. A new rdma module is added to manage decoded media with RDMA descriptors and NIXL registration. Type structures are refactored to support serialization, and the media loading pipeline is updated to return RDMA descriptors instead of raw decoded data.

Changes

Cohort / File(s) Summary
Dependency Management
Cargo.toml, lib/llm/Cargo.toml
Added dynamo-memory workspace dependency (v0.6.1). Updated image dependency to include serde features.
RDMA/NIXL Module Foundation
lib/llm/src/preprocessor/media.rs, lib/llm/src/preprocessor/media/rdma.rs
New rdma module added with public exports for RDMA descriptor types and NIXL helpers. New rdma.rs implements DecodedMediaData, RdmaMediaDataDescriptor, NIXL agent creation, and RDMA descriptor conversion logic.
Decoder Type Refactoring
lib/llm/src/preprocessor/media/decoders.rs, lib/llm/src/preprocessor/media/decoders/image.rs
Moved DecodedMediaData struct to rdma module. Added Serialize/Deserialize derives to DecodedMediaMetadata, ImageDecoder, ImageLayout, and ImageMetadata. Updated imports to use RDMA-based paths and changed metadata storage location.
Media Loading Pipeline
lib/llm/src/preprocessor/media/loader.rs, lib/llm/src/preprocessor.rs
fetch_and_decode_media_part return type changed from DecodedMediaData to RdmaMediaDataDescriptor. MediaLoader now integrates NixlAgent and converts decoded data to RDMA descriptors. preprocessor.rs implements actual decoding flow with descriptor population.
Data Model Updates
lib/llm/src/protocols/common/preprocessor.rs
MultimodalData::Decoded variant updated to carry RdmaMediaDataDescriptor instead of placeholder.
Serialization Support
lib/memory/src/nixl.rs
Added public serde exports. NixlDescriptor now derives Serialize and Deserialize.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • RDMA descriptor logic: New rdma.rs module contains non-trivial NIXL integration with storage registration, descriptor lifecycle management via Arc, and metadata encoding that requires careful validation.
  • Type refactoring cascade: DecodedMediaData relocation and serde derive additions propagate across multiple decoder layers; each file's changes should be verified for consistency.
  • Return type changes: fetch_and_decode_media_part signature change alters the media processing contract; verify error handling and descriptor storage lifecycle.
  • Metadata storage restructuring: Movement of metadata from DecodedMediaData.metadata to DecodedMediaData.tensor_info.metadata in image decoding requires tracing correctness through the pipeline.

Poem

🐰 A rabbit hops through RDMA's gleaming gate,
Where NIXL agents dance in serialized state,
Descriptors and tensors now safely reside,
With lifetimes preserved in Arc's loving stride!
Media decoding takes wings, oh so bright,

Pre-merge checks

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The pull request description provides an overview with context links to related PRs (#3630, #3971) and visual diagrams, but lacks structured details and required sections like 'Where should the reviewer start' and 'Related Issues' from the template. Expand the description to include missing template sections: provide explicit 'Where should the reviewer start' guidance (e.g., lib/llm/src/preprocessor/media/rdma.rs), and clearly list related issues using action keywords like 'Relates to #3630' and 'Relates to #3971'.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: decoded media via NIXL' clearly and concisely describes the main change—implementing decoded media handling through NIXL integration—which aligns with the core modifications across the codebase.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
lib/llm/src/preprocessor/media/loader.rs (1)

187-211: Cleanup: Remove duplicate test assertions.

Lines 187-211 duplicate the exact same assertions from lines 169-183 (shape verification). This appears to be unintentional code duplication.

Apply this diff to remove the duplicate assertions:

         mock.assert_async().await;
-
-        assert!(
-            !descriptor.tensor_info.shape.is_empty(),
-            "Shape should not be empty"
-        );
-        assert_eq!(
-            descriptor.tensor_info.shape[0], 1125,
-            "Height should be 1125"
-        );
-        assert_eq!(
-            descriptor.tensor_info.shape[1], 1999,
-            "Width should be 1999"
-        );
-        assert_eq!(
-            descriptor.tensor_info.shape[2], 4,
-            "RGBA channels should be 4"
-        );
-
+        
         assert!(
             descriptor.source_storage.is_some(),
             "Source storage should be present"
lib/llm/src/preprocessor/media/rdma.rs (2)

54-72: Consider preserving error context in NIXL registration.

The map_err on line 58-59 replaces the original error with a generic message. Consider including more context about the storage size or type to aid debugging.

 let registered = nixl::register_with_nixl(source_storage, nixl_agent, None)
-    .map_err(|_| anyhow::anyhow!("Failed to register storage with NIXL"))?;
+    .map_err(|s| anyhow::anyhow!("Failed to register {} bytes of storage with NIXL", s.len()))?;

104-113: Track the NIXL metadata workaround.

The function currently uses get_local_md() as a workaround until the referenced PR #970 is merged. The _storage parameter is unused, and the commented-out code shows the intended partial metadata implementation.

Consider:

  1. Tracking this workaround in an issue linked to NIXL PR #970
  2. Removing the commented-out code if it will be replaced once the PR merges
  3. Adding a comment about the performance impact of sending full metadata vs. partial

Do you want me to create a tracking issue for this workaround?

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae4b08a and 1b1882c.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • Cargo.toml (1 hunks)
  • lib/llm/Cargo.toml (3 hunks)
  • lib/llm/src/preprocessor.rs (1 hunks)
  • lib/llm/src/preprocessor/media.rs (1 hunks)
  • lib/llm/src/preprocessor/media/decoders.rs (2 hunks)
  • lib/llm/src/preprocessor/media/decoders/image.rs (7 hunks)
  • lib/llm/src/preprocessor/media/loader.rs (6 hunks)
  • lib/llm/src/preprocessor/media/rdma.rs (1 hunks)
  • lib/llm/src/protocols/common/preprocessor.rs (2 hunks)
  • lib/memory/src/nixl.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-09-29T19:11:14.161Z
Learnt from: nv-anants
Repo: ai-dynamo/dynamo PR: 3290
File: docs/_includes/quick_start_local.rst:13-14
Timestamp: 2025-09-29T19:11:14.161Z
Learning: ai-dynamo package requires --prerelease=allow flag for installation due to sglang dependency requiring RC versions (e.g., sglang[all]==0.5.0rc2).

Applied to files:

  • Cargo.toml
  • lib/llm/Cargo.toml
📚 Learning: 2025-10-17T22:57:12.526Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 3700
File: lib/llm/src/block_manager.rs:19-19
Timestamp: 2025-10-17T22:57:12.526Z
Learning: The dynamo library (ai-dynamo/dynamo repository) is currently developed to run on Linux only, so Linux-specific code and syscalls do not require platform guards.

Applied to files:

  • Cargo.toml
  • lib/llm/Cargo.toml
📚 Learning: 2025-10-17T22:57:25.021Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 3700
File: lib/llm/src/block_manager/numa_allocator.rs:46-64
Timestamp: 2025-10-17T22:57:25.021Z
Learning: The dynamo library (specifically the llm/block_manager module) is developed to run on Linux only, so Linux-specific system calls and APIs can be used without platform guards.

Applied to files:

  • Cargo.toml
📚 Learning: 2025-10-17T22:57:18.773Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 3700
File: lib/llm/src/block_manager/numa_allocator.rs:4-6
Timestamp: 2025-10-17T22:57:18.773Z
Learning: The Dynamo library (lib/llm) is developed to run on Linux only, so platform-specific cfg guards for Linux-only features are not required.

Applied to files:

  • Cargo.toml
📚 Learning: 2025-09-16T19:47:30.312Z
Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.

Applied to files:

  • lib/llm/src/preprocessor.rs
  • lib/llm/src/preprocessor/media/loader.rs
📚 Learning: 2025-08-18T20:51:51.324Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.

Applied to files:

  • lib/llm/Cargo.toml
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.

Applied to files:

  • lib/llm/Cargo.toml
🧬 Code graph analysis (3)
lib/llm/src/preprocessor/media.rs (1)
lib/llm/src/preprocessor/media/rdma.rs (2)
  • get_nixl_agent (115-119)
  • get_nixl_metadata (104-113)
lib/llm/src/preprocessor/media/rdma.rs (3)
lib/memory/src/nixl.rs (3)
  • register_with_nixl (176-203)
  • nixl_descriptor (111-113)
  • storage (134-136)
lib/llm/src/preprocessor/media/loader.rs (1)
  • new (48-66)
lib/memory/src/nixl/agent.rs (1)
  • with_backends (82-112)
lib/llm/src/preprocessor/media/loader.rs (3)
lib/llm/src/preprocessor/media/rdma.rs (1)
  • get_nixl_agent (115-119)
lib/memory/src/nixl.rs (3)
  • storage (134-136)
  • descriptor (62-62)
  • descriptor (121-129)
lib/llm/src/block_manager/storage/nixl.rs (1)
  • is_nixl_registered (181-183)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: trtllm (arm64)
🔇 Additional comments (29)
Cargo.toml (1)

50-50: LGTM! Clean workspace dependency addition.

The dynamo-memory crate is correctly added to workspace dependencies with matching version and proper path.

lib/memory/src/nixl.rs (2)

17-17: LGTM! Enables serialization support for NIXL descriptors.

The public export of serde traits is appropriate for enabling serialization of NixlDescriptor across module boundaries.


28-28: LGTM! Serialization derivations are safe and appropriate.

Adding Serialize and Deserialize to NixlDescriptor is safe since it contains only plain data types (u64, usize, MemType, u64), and this enables passing RDMA descriptors across process or network boundaries.

lib/llm/Cargo.toml (2)

45-45: LGTM! Adds required dependency for NIXL integration.

The dynamo-memory workspace dependency is correctly added to support the new RDMA/NIXL media handling flow.


147-147: LGTM! Enables image metadata serialization.

Adding the serde feature to the image crate is appropriate for serializing image metadata in the RDMA descriptor flow.

lib/llm/src/protocols/common/preprocessor.rs (2)

9-9: LGTM! Imports the RDMA descriptor type.

The import correctly brings in RdmaMediaDataDescriptor for use in the MultimodalData enum.


15-15: LGTM! Replaces placeholder with concrete RDMA descriptor.

The Decoded variant now properly carries RdmaMediaDataDescriptor, completing the RDMA-based media handling integration. This allows multimodal data to be passed with NIXL registration metadata.

lib/llm/src/preprocessor/media.rs (2)

7-7: LGTM! Adds RDMA module.

The new rdma module cleanly separates RDMA-related functionality from the existing media handling code.


12-12: LGTM! Exposes RDMA-related public API.

The public re-exports properly expose the RDMA descriptor types and NIXL helper functions for use by the preprocessor and other modules.

lib/llm/src/preprocessor/media/decoders.rs (3)

5-5: LGTM! Adds serialization support.

The serde import enables serialization for the updated DecodedMediaMetadata type.


8-8: LGTM! Relocates DecodedMediaData to RDMA module.

The import correctly references DecodedMediaData from the new rdma module, maintaining clean separation between decoding and RDMA registration concerns.


33-36: LGTM! Adds serialization to metadata.

Making DecodedMediaMetadata public with Serialize and Deserialize derives enables passing metadata alongside RDMA descriptors. The Copy trait is appropriate given the small size of ImageMetadata.

lib/llm/src/preprocessor/media/loader.rs (4)

12-14: LGTM! Updated imports for RDMA integration.

The imports correctly bring in RDMA-related types and the NIXL agent for media descriptor creation.


58-64: LGTM! Initializes NIXL agent for media registration.

The NixlAgent is properly initialized during MediaLoader construction and will be used to register decoded media with NIXL. Error propagation is appropriate if agent creation fails.


98-120: LGTM! Converts decoded media to RDMA descriptor.

The updated flow correctly:

  1. Decodes the media (line 105)
  2. Converts to RDMA descriptor with NIXL registration (line 119)
  3. Returns the descriptor for downstream use

The NIXL registration ensures the decoded data can be accessed via RDMA.


44-44: Verify NixlAgent thread-safety for concurrent operations - cannot confirm from codebase alone.

NixlAgent is stored directly (not Arc-wrapped) in MediaLoader, but relies on nixl_sys::Agent from an external crate. The struct derives Clone and Debug but has no explicit Send/Sync bounds. While HashSet<String> itself is thread-safe, thread-safety depends entirely on whether nixl_sys::Agent implements Send + Sync. This cannot be verified from the repository. Please confirm:

  1. That nixl_sys::Agent is documented as Send + Sync
  2. How MediaLoader instances are shared across concurrent requests in the preprocessor service
  3. Whether the concurrent usage pattern matches the thread-safety guarantees
lib/llm/src/preprocessor.rs (1)

330-344: NIXL registration cleanup is properly handled via Arc-based resource management.

Verified: RdmaMediaDataDescriptor correctly keeps the registration alive through Arc<nixl::NixlRegistered<SystemStorage>> stored in the source_storage field. On error, unprocessed fetch tasks and any partial descriptors drop naturally, triggering the Arc drop chain and cleanup of the NIXL registration. The design follows proper Rust resource management patterns—no additional changes needed.

lib/llm/src/preprocessor/media/decoders/image.rs (6)

9-13: LGTM! Imports correctly updated for RDMA integration.

The imports now reference DecodedMediaData from the RDMA module and add Serde support for serialization.


17-27: LGTM! Serde derives enable configuration serialization.

Adding Serialize and Deserialize derives allows the decoder configuration to be serialized for transport or storage.


40-53: LGTM! Metadata types correctly derive serialization traits.

The added Serialize and Deserialize derives enable metadata to be transported with RDMA descriptors.


82-88: LGTM! Conversion and metadata placement correctly updated.

The change from into() to try_into() matches the new TryFrom implementation in the RDMA module, and metadata placement correctly follows the new nested structure.


94-164: LGTM! Tests correctly updated for new structure.

Test assertions properly reference the nested tensor_info structure to verify shape and data type.


199-257: LGTM! Remaining tests consistently updated.

All test assertions correctly reference the new nested structure for shape and data type verification.

lib/llm/src/preprocessor/media/rdma.rs (6)

17-20: Limited data type support is acceptable for image-focused PR.

Currently only UINT8 is supported. The TODO at line 75 indicates audio support (f32) is planned for future work.


23-28: LGTM! Clean tensor metadata structure.

The structure appropriately groups shape, data type, and optional metadata together.


31-35: LGTM! Appropriate local representation.

DecodedMediaData correctly holds the decoded storage locally before conversion to the serializable RdmaMediaDataDescriptor.


38-52: LGTM! Well-designed RDMA descriptor with proper lifetime management.

The Arc wrapper on source_storage ensures the registered memory remains valid while the descriptor is alive, which is critical for RDMA safety.


76-99: The unsafe copy operation is memory-safe and all verification concerns are addressed.

SystemStorage uses posix_memalign with 4KB alignment, ensuring proper alignment for the copy. The pointer is valid because the allocation is checked for success, and the storage lifetime is guaranteed since it's owned by the returned DecodedMediaData. The copy_nonoverlapping call is safe: source and destination are from separate allocations (Vec and malloc), the lengths match exactly, and no overlap exists. No changes required.


115-119: Review comment is based on incorrect assumptions about agent creation frequency.

The function creates one agent per MediaLoader instance, not per call. get_nixl_agent() is invoked once during MediaLoader::new(), and the agent is stored and reused throughout the loader's lifetime. The design follows a sound pattern: single agent instance per loader, stored as a struct field, with automatic cleanup when the loader is dropped. The underlying Agent type (from FFI bindings) handles resource cleanup automatically. No evidence of resource leaks or inefficiency was found.

Signed-off-by: Alexandre Milesi <[email protected]>
Signed-off-by: Alexandre Milesi <[email protected]>
@milesial milesial force-pushed the alexandrem/frontend-image-decoding-nixl branch from b544039 to 7e071ea Compare November 8, 2025 00:58
@rmccorm4
Copy link
Contributor

rmccorm4 commented Nov 8, 2025

Pulled in #4198

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feat frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` multimodal size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants