feat: ec connector handler #5162

ayushag-nv · 2026-01-05T06:16:33Z

Overview:

vLLM supports an Embedding Cache (EC) Connector that enables encoder disaggregation. The vLLM encoder encodes the image, identified by its mm_hash, and uses the EC Connector to store the resulting embeddings in the Embedding Cache. In this way, the encoder acts as a producer.

PD workers can act as consumers. Given the same EC Connector config as the encoder, they can read the embeddings from the cache and use them for multimodal inference.

This PR adds a vLLM-specific encoder handler that uses the EC Connector to encode image payloads.
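To make the producer/consumer split concrete, here is a minimal sketch of the idea, not the vLLM ECConnector API. Everything in it is a stand-in chosen for illustration: `content_hash` plays the role of mm_hash, `FileEmbeddingCache` plays the role of a connector backed by a shared storage path, and `fake_encode` is a placeholder for the real encoder.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import List, Optional


def content_hash(payload: bytes) -> str:
    """Hypothetical stand-in for mm_hash: a stable digest of the raw media bytes."""
    return hashlib.sha256(payload).hexdigest()


class FileEmbeddingCache:
    """Toy shared cache. Producer and consumer must agree on storage_path."""

    def __init__(self, storage_path: str) -> None:
        self.root = Path(storage_path)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, embedding: List[float]) -> None:
        (self.root / f"{key}.json").write_text(json.dumps(embedding))

    def get(self, key: str) -> Optional[List[float]]:
        path = self.root / f"{key}.json"
        return json.loads(path.read_text()) if path.exists() else None


def fake_encode(payload: bytes) -> List[float]:
    """Placeholder encoder, NOT a real vision model."""
    return [b / 255.0 for b in payload[:4]]


def produce(cache: FileEmbeddingCache, image: bytes) -> str:
    """Encoder side: encode once and store under the content hash."""
    key = content_hash(image)
    cache.put(key, fake_encode(image))
    return key


def consume(cache: FileEmbeddingCache, image: bytes) -> Optional[List[float]]:
    """PD side: recompute the same hash and read the embedding back."""
    return cache.get(content_hash(image))


storage = tempfile.mkdtemp()
image = b"\x00\x80\xff\x10 raw image bytes"

# Two separate cache objects model two separate workers sharing one config.
key = produce(FileEmbeddingCache(storage), image)
hit = consume(FileEmbeddingCache(storage), image)
miss = consume(FileEmbeddingCache(storage), b"a different image")
```

The point of the sketch is that the consumer never talks to the producer directly. Both sides only need the same storage location and the same hashing scheme, which is why the PD workers must be given the same EC Connector config as the encoder.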

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

Summary by CodeRabbit

New Features
- Added support for vLLM-native encoder worker with ECConnector mode integration.
- Introduced new configuration options for encoder worker setup including backend selection, storage path, and consumer mode settings.
- Added encoder endpoint capability for multimodal processing with validation and configuration assembly.


Signed-off-by: ayushag <[email protected]>

copy-pr-bot · 2026-01-05T06:16:37Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-01-05T06:21:07Z

Walkthrough

Introduces a new vLLM native encoder worker pathway with ECConnector mode. Adds configuration fields for encoder worker setup, utilities for ECTransferConfig creation and engine ID generation, a new initialization path in main orchestration, a handler class for encoder requests, and data models for encoder request/response serialization.

Changes

Cohort / File(s)	Summary
Configuration & Utilities `components/src/dynamo/vllm/args.py`, `components/src/dynamo/vllm/ec_transfer_utils.py`	Added five new Config fields (vllm_native_encoder_worker, ec_connector_backend, ec_storage_path, ec_extra_config, ec_consumer_mode) to args.py. Created ec_transfer_utils.py with functions to generate engine IDs (encoder, pd) and construct ECTransferConfig with JSON parsing and validation logic.
Core Orchestration `components/src/dynamo/vllm/main.py`	Added async init_vllm_native_encoder method to set up encoder endpoint in ECConnector producer mode, routed startup based on vllm_native_encoder_worker flag, and integrated ECConnector consumer configuration into init_multimodal_worker when ec_consumer_mode is enabled. Imported new utilities and handler.
Handler & Protocol `components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py`, `components/src/dynamo/vllm/multimodal_utils/protocol.py`, `components/src/dynamo/vllm/multimodal_handlers/__init__.py`	Introduced VLLMNativeEncoderWorkerHandler class with image loading, mm_hash computation, and encoder execution; added VLLMNativeEncoderRequest/VLLMNativeEncoderResponse data models; exported handler in module init.py.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A new encoder hops in with connector flair,
Images loaded, hashes computed with care,
Config fields sprouting, utilities align,
ECConnector producer-consumer design shines! 🌟

Pre-merge checks

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is partially complete with an Overview section but Details and Where should the reviewer start sections are empty placeholders.	Complete the Details section with specific changes made, and fill in the Where should the reviewer start section with file recommendations. Update the Related Issues section with an actual issue number or remove the placeholder.
Title check	❓ Inconclusive	The title is vague and does not clearly communicate the main purpose of the changes. While it references 'ec connector handler', it lacks specificity about what functionality is being added.	Expand the title to clearly describe the main feature, such as 'Add vLLM native encoder worker with ECConnector support' to better convey the primary changes.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 1

Fix all issues with AI Agents 🤖

In @components/src/dynamo/vllm/ec_transfer_utils.py:
- Around lines 1-115: This file fails Black formatting; run the formatter (e.g.,
black components/src/dynamo/vllm/ec_transfer_utils.py or pre-commit run
--all-files) to reformat the file, then stage and commit the changes; check the
functions create_ec_transfer_config, get_encoder_engine_id, and get_pd_engine_id
to confirm formatting issues are resolved before pushing.

🧹 Nitpick comments (9)

components/src/dynamo/vllm/multimodal_utils/protocol.py (2)
153-169: Type inconsistency for modality field between request and response.

VLLMNativeEncoderRequest.modality uses Literal["image", "video", "audio"] while VLLMNativeEncoderResponse.modality uses str. Consider using the same Literal type in the response for consistency and stronger type checking:
 class VLLMNativeEncoderResponse(BaseModel):
     """Response from vLLM-native encoder worker (ECConnector mode)"""

     request_id: str
     mm_hash: str  # vLLM's multimodal hash identifier
-    modality: str  # "image", "video", "audio"
+    modality: Literal["image", "video", "audio"]  # "image", "video", "audio"
     embeddings_shape: Tuple[int, ...]  # Shape of encoded embeddings
     connector_metadata: dict[str, Any]  # ECConnector config info for PD workers
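A small sketch of the suggestion above, assuming nothing about the actual models beyond the field names quoted in the diff. It uses stdlib dataclasses as a stand-in for the pydantic BaseModel classes so that it runs without extra dependencies; the class names and the `_check_modality` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# One alias shared by request and response, so the two cannot drift apart.
Modality = Literal["image", "video", "audio"]


def _check_modality(value: str) -> None:
    """Runtime check that mirrors what a validating model would enforce."""
    if value not in get_args(Modality):
        raise ValueError(f"unsupported modality: {value!r}")


@dataclass
class EncoderRequest:
    request_id: str
    modality: Modality

    def __post_init__(self) -> None:
        _check_modality(self.modality)


@dataclass
class EncoderResponse:
    request_id: str
    mm_hash: str
    modality: Modality

    def __post_init__(self) -> None:
        _check_modality(self.modality)


resp = EncoderResponse(request_id="r1", mm_hash="abc123", modality="image")

try:
    EncoderResponse(request_id="r2", mm_hash="def456", modality="text")
    rejected = False
except ValueError:
    rejected = True
```

With a plain `str` on the response side, the second construction would succeed silently. Sharing the alias lets both the type checker and the runtime check reject it.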
138-141: Consider adding audio_url to MultiModalInput for consistency with the modality Literal.

VLLMNativeEncoderRequest.modality includes "audio" but MultiModalInput only supports image_url and video_url. While the handler documents this as a TODO, adding the field now would make the model forward-compatible:
 class MultiModalInput(BaseModel):
     image_url: Optional[str] = None
     video_url: Optional[str] = None
+    audio_url: Optional[str] = None
components/src/dynamo/vllm/args.py (1)

340-342: Consider ordering vllm_native_encoder_worker before multimodal_encode_worker for clarity.

Both vllm_native_encoder_worker and multimodal_encode_worker use the same component/endpoint ("encoder"/"generate"), but they represent different encoder pathways. The current ordering is fine logically since the mutual exclusivity check prevents both from being active, but grouping similar configurations together could improve readability.
components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py (3)
125-127: Use logger.exception instead of logger.error when re-raising exceptions.

logger.exception automatically includes the stack trace, which is valuable for debugging. This addresses the static analysis hint TRY400.
🔎 Proposed fix
         except Exception as e:
-            logger.error(f"Failed to compute mm_hash: {e}")
+            logger.exception(f"Failed to compute mm_hash: {e}")
             raise

         # ... later ...

         except Exception as e:
-            logger.error(f"Encoder execution failed: {e}")
+            logger.exception(f"Encoder execution failed: {e}")
             raise
Also applies to: 150-152
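The difference is easy to see in isolation. The sketch below uses only the standard logging module; `compute_hash` and the logger name are hypothetical and exist only to trigger an error inside an except block.

```python
import io
import logging

# Capture log output in memory so the two calls can be compared.
stream = io.StringIO()
logger = logging.getLogger("encoder_demo")
logger.setLevel(logging.ERROR)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))


def compute_hash(data: bytes) -> str:
    """Hypothetical failing step, used only to raise inside a try block."""
    raise RuntimeError("hash backend unavailable")


def run(log_call) -> str:
    """Log the same failure with the given logging method and return the output."""
    stream.seek(0)
    stream.truncate()
    try:
        compute_hash(b"")
    except Exception as e:
        log_call(f"Failed to compute mm_hash: {e}")
    return stream.getvalue()


error_output = run(logger.error)
exception_output = run(logger.exception)
```

`error_output` holds only the one-line message, while `exception_output` also carries the traceback, because `logger.exception` logs at ERROR level with the active exception information attached.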

164-167: Consider aligning connector_metadata keys with Config field names.

The keys "ec_connector" and "storage_path" differ from the Config field names ec_connector_backend and ec_storage_path. For consistency and to avoid confusion for PD workers consuming this metadata:
🔎 Suggested change
         connector_metadata={
-            "ec_connector": self.config.ec_connector_backend,
-            "storage_path": self.config.ec_storage_path
+            "ec_connector_backend": self.config.ec_connector_backend,
+            "ec_storage_path": self.config.ec_storage_path
         }
73-75: Keep context parameter for interface compatibility.

The context parameter is unused (static analysis hint ARG002), but this is intentional as it's part of the handler interface required by the endpoint serving framework. Adding an underscore prefix would suppress the warning while keeping the parameter:
     async def generate(
-        self, request, context
+        self, request, _context
     ) -> AsyncGenerator[str, None]:
components/src/dynamo/vllm/main.py (2)
764-769: Prefix unused variables with underscore.

vllm_config and default_sampling_params are unpacked but never used. Prefix with underscore to indicate intentional non-use and suppress static analysis warnings:
🔎 Proposed fix
     (
         engine_client,
-        vllm_config,
-        default_sampling_params,
+        _vllm_config,
+        _default_sampling_params,
         prometheus_temp_dir,
     ) = setup_vllm_engine(config)
729-729: Use logger.exception for error logging before re-raising.

Replace logger.error with logger.exception to include stack traces:
🔎 Proposed fix
     except Exception as e:
-        logger.error(f"Failed to serve encode worker endpoint: {e}")
+        logger.exception(f"Failed to serve encode worker endpoint: {e}")
         raise

     # ... in init_vllm_native_encoder ...

     except Exception as e:
-        logger.error(f"Failed to serve vLLM-native encoder endpoint: {e}")
+        logger.exception(f"Failed to serve vLLM-native encoder endpoint: {e}")
         raise
Also applies to: 792-793
components/src/dynamo/vllm/ec_transfer_utils.py (1)
52-53: Use exception chaining with raise ... from e.

This preserves the original exception context and improves debugging:
🔎 Proposed fix
         except json.JSONDecodeError as e:
-            raise ValueError(f"Invalid JSON in --ec-extra-config: {e}")
+            raise ValueError(f"Invalid JSON in --ec-extra-config: {e}") from e
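A runnable sketch of the same pattern, with a hypothetical `parse_extra_config` helper standing in for the JSON-parsing step of create_ec_transfer_config:

```python
import json


def parse_extra_config(raw: str) -> dict:
    """Parse a JSON string, re-raising parse errors as ValueError with a cause."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        # "from e" records the JSON error as the explicit cause of the ValueError.
        raise ValueError(f"Invalid JSON in --ec-extra-config: {e}") from e


parsed = parse_extra_config('{"shared_storage_path": "/tmp/ec_cache"}')

try:
    parse_extra_config("{not json}")
except ValueError as err:
    cause = err.__cause__
```

Without `from e`, Python still records the original error implicitly in `__context__`, but the traceback then reads as if a second, unrelated error happened while handling the first. With `from e`, `__cause__` is set and the traceback marks the JSON error as the direct cause.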

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 007c5b6 and 488136d.

📒 Files selected for processing (6)

components/src/dynamo/vllm/args.py
components/src/dynamo/vllm/ec_transfer_utils.py
components/src/dynamo/vllm/main.py
components/src/dynamo/vllm/multimodal_handlers/__init__.py
components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py
components/src/dynamo/vllm/multimodal_utils/protocol.py

🧰 Additional context used

🧠 Learnings (3)

📓 Common learnings

Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 2989
File: lib/llm/src/block_manager/distributed/transfer.rs:6-6
Timestamp: 2025-09-18T21:47:44.143Z
Learning: For PR ai-dynamo/dynamo#2989, the ConnectorTransferBatcher architectural issues will be addressed in a follow-up PR by removing the duplicate batching logic and integrating distributed transfers with the existing TransferBatcher + LocalTransferManager pipeline, rather than adding bounded concurrency primitives like Semaphore.

📚 Learning: 2025-10-28T04:09:48.264Z

Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_handlers/processor_handler.py:66-72
Timestamp: 2025-10-28T04:09:48.264Z
Learning: In components/src/dynamo/vllm/multimodal_handlers/processor_handler.py, the AutoTokenizer.from_pretrained call with trust_remote_code=True is intentional and expected for the vLLM multimodal handler implementation.

Applied to files:

components/src/dynamo/vllm/multimodal_handlers/__init__.py
components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py
components/src/dynamo/vllm/main.py

📚 Learning: 2025-10-28T05:48:37.621Z

Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_utils/model.py:39-42
Timestamp: 2025-10-28T05:48:37.621Z
Learning: In components/src/dynamo/vllm/multimodal_utils/model.py, the AutoModel.from_pretrained call with trust_remote_code=True in the load_vision_model function is intentional and expected for the vLLM multimodal implementation.

Applied to files:

components/src/dynamo/vllm/main.py

🧬 Code graph analysis (3)

components/src/dynamo/vllm/multimodal_handlers/__init__.py (1)

components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py (1)

VLLMNativeEncoderWorkerHandler (37-183)

components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py (1)

components/src/dynamo/vllm/multimodal_utils/protocol.py (2)

VLLMNativeEncoderRequest (153-159)

VLLMNativeEncoderResponse (162-169)

components/src/dynamo/vllm/multimodal_utils/protocol.py (1)

examples/multimodal/utils/protocol.py (1)

MultiModalInput (147-150)

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/5162/merge) by ayushag-nv.

components/src/dynamo/vllm/ec_transfer_utils.py

[error] 1-1: Black formatting changed this file. Run 'pre-commit run --all-files' or 'black --write' to fix code style issues in this file.

components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py

[error] 1-1: Black formatting changed this file. Run 'pre-commit run --all-files' or 'black --write' to fix code style issues in this file.

🪛 Ruff (0.14.10)

components/src/dynamo/vllm/args.py

317-320: Avoid specifying long messages outside the exception class

(TRY003)

323-323: Avoid specifying long messages outside the exception class

(TRY003)

331-334: Avoid specifying long messages outside the exception class

(TRY003)

components/src/dynamo/vllm/ec_transfer_utils.py

53-53: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

53-53: Avoid specifying long messages outside the exception class

(TRY003)

61-64: Avoid specifying long messages outside the exception class

(TRY003)

components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py

74-74: Unused method argument: context

(ARG002)

113-115: Avoid specifying long messages outside the exception class

(TRY003)

126-126: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

151-151: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

182-182: Do not catch blind exception: Exception

(BLE001)

components/src/dynamo/vllm/main.py

729-729: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

766-766: Unpacked variable vllm_config is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

767-767: Unpacked variable default_sampling_params is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

793-793: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build and Test - dynamo

🔇 Additional comments (13)

components/src/dynamo/vllm/multimodal_handlers/__init__.py (1)

6-8: LGTM!

The import and export of VLLMNativeEncoderWorkerHandler follows the existing pattern and correctly exposes the new handler through the package's public API.

Also applies to: 19-19

components/src/dynamo/vllm/args.py (3)

70-76: LGTM!

The new Config attributes for ECConnector mode are well-structured with sensible defaults.

202-229: LGTM!

CLI arguments are well-documented with clear help text explaining the purpose of each option. The ECConnector configuration flags provide good flexibility for different storage backends.

325-334: LGTM!

Validation logic correctly enforces that --ec-storage-path is required when using the default ECExampleConnector backend, preventing runtime failures due to missing configuration.
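For readers without the diff at hand, the check described above has roughly the following shape. This is a sketch and not the code in args.py: the `--ec-connector-backend` flag name is inferred from the Config field name, and the parser, program name, and error text are assumptions.

```python
import argparse

DEFAULT_BACKEND = "ECExampleConnector"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with only the two flags relevant to the check."""
    parser = argparse.ArgumentParser(prog="encoder-worker-sketch")
    parser.add_argument("--ec-connector-backend", default=DEFAULT_BACKEND)
    parser.add_argument("--ec-storage-path", default=None)
    return parser


def parse_and_validate(argv):
    """Fail at startup rather than at the first cache write."""
    args = build_parser().parse_args(argv)
    if args.ec_connector_backend == DEFAULT_BACKEND and not args.ec_storage_path:
        raise ValueError(
            "--ec-storage-path is required when using ECExampleConnector"
        )
    return args


ok = parse_and_validate(["--ec-storage-path", "/tmp/ec_cache"])

try:
    parse_and_validate([])
    missing_rejected = False
except ValueError:
    missing_rejected = True
```

Validating at argument-parsing time means a misconfigured worker exits immediately with a clear message instead of starting up and failing on the first request.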
components/src/dynamo/vllm/multimodal_handlers/vllm_native_encoder_handler.py (2)
154-156: Hardcoded embeddings shape may cause issues with non-Llama 3.2 Vision models.

The placeholder shape (1, 576, 4096) is specific to Llama 3.2 Vision. PD workers receiving this metadata may rely on it for memory allocation or validation. Consider adding a warning log or making this more prominent:
         # TODO: Get actual embeddings shape from vLLM instead of hardcoded value
         # For now, using typical Llama 3.2 Vision shape as placeholder
+        logger.warning(
+            "Using hardcoded embeddings_shape placeholder. "
+            "This may be incorrect for non-Llama 3.2 Vision models."
+        )
         embeddings_shape = (1, 576, 4096)
1-184: Fix Black formatting issue.

The pipeline indicates this file needs Black formatting. Run pre-commit run --all-files or run black on the file to fix formatting.
⛔ Skipped due to learnings
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_handlers/processor_handler.py:66-72
Timestamp: 2025-10-28T04:09:48.264Z
Learning: In components/src/dynamo/vllm/multimodal_handlers/processor_handler.py, the AutoTokenizer.from_pretrained call with trust_remote_code=True is intentional and expected for the vLLM multimodal handler implementation.
components/src/dynamo/vllm/main.py (5)

32-36: LGTM!

Imports for ECTransferConfig utilities and the new handler are correctly added.

Also applies to: 42-42

94-96: LGTM!

The routing logic correctly prioritizes vllm_native_encoder_worker in the initialization flow.

735-796: LGTM on init_vllm_native_encoder structure.

The function follows established patterns from other init functions. The ECConnector producer configuration, vLLM engine setup, and endpoint serving are well-organized with appropriate logging.

817-835: LGTM on ECConnector consumer mode integration.

The consumer mode configuration in init_multimodal_worker is well-integrated and follows the same pattern as the producer configuration in init_vllm_native_encoder.

759-761: No action needed. ec_transfer_config is a valid and supported attribute on vLLM's AsyncEngineArgs. The code correctly instantiates ECTransferConfig (imported from vllm.config) and assigns it to engine_args following the same integration pattern as other vLLM configuration attributes like kv_transfer_config and kv_events_config.

Likely an incorrect or invalid review comment.

components/src/dynamo/vllm/ec_transfer_utils.py (2)

23-76: LGTM on create_ec_transfer_config implementation.

The function properly:
- Parses optional JSON extra config
- Validates required fields for ECExampleConnector
- Provides clear logging of configuration
- Returns a properly constructed ECTransferConfig

79-114: LGTM on engine ID generation functions.

The functions provide clear, predictable engine IDs with appropriate logging. The naming convention {namespace}.{component}.{role}.{instance_id} is intuitive for debugging and identification.
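As an illustration of the naming convention only, the sketch below builds IDs in the {namespace}.{component}.{role}.{instance_id} form. The `make_engine_id` helper, its signature, and the rule that parts may not contain dots are assumptions for this example, not the implementation of get_encoder_engine_id or get_pd_engine_id.

```python
def make_engine_id(namespace: str, component: str, role: str, instance_id: int) -> str:
    """Build an ID of the form {namespace}.{component}.{role}.{instance_id}."""
    if role not in ("encoder", "pd"):
        raise ValueError(f"unknown role: {role!r}")
    for part in (namespace, component):
        # Assumption for this sketch: reject dots so the ID splits back into 4 fields.
        if "." in part:
            raise ValueError(f"part must not contain '.': {part!r}")
    return f"{namespace}.{component}.{role}.{instance_id}"


encoder_id = make_engine_id("dynamo", "encoder", "encoder", 0)
pd_id = make_engine_id("dynamo", "backend", "pd", 1)
fields = pd_id.split(".")
```

Keeping the role as its own field makes it possible to tell producer and consumer engines apart from the ID alone when reading logs.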

components/src/dynamo/vllm/ec_transfer_utils.py

components/src/dynamo/vllm/multimodal_handlers/processor_handler.py

components/src/dynamo/vllm/multimodal_handlers/encode_worker_handler.py

components/src/dynamo/vllm/args.py

examples/backends/vllm/launch/agg_multimodal_ec_connector.sh

ayushag-nv · 2026-01-07T17:55:47Z

/ok to test 7723a39

feat: ec connector handler

488136d

Signed-off-by: ayushag <[email protected]>

ayushag-nv requested review from a team as code owners January 5, 2026 06:16

pull-request-size bot added the size/L label Jan 5, 2026

ayushag-nv marked this pull request as draft January 5, 2026 06:16

github-actions bot added the feat label Jan 5, 2026

coderabbitai bot reviewed Jan 5, 2026

components/src/dynamo/vllm/ec_transfer_utils.py (outdated; resolved)

ayushag-nv added 2 commits January 5, 2026 16:42

chore: fmt

86e5a01

Signed-off-by: ayushag <[email protected]>

chore: added ec processor v0

bc37577

Signed-off-by: ayushag <[email protected]>

pull-request-size bot added size/XL and removed size/L labels Jan 5, 2026

ayushag-nv added 2 commits January 5, 2026 23:33

chore: ec processor handler

74ebf72

Signed-off-by: ayushag <[email protected]>

chore: tested agg flow

e15ae2c

Signed-off-by: ayushag <[email protected]>

ayushag-nv marked this pull request as ready for review January 6, 2026 05:08

ayushag-nv requested a review from a team as a code owner January 6, 2026 05:08

ayushag-nv added 5 commits January 5, 2026 21:08

feat: ec connector handler

6257a14

Signed-off-by: ayushag <[email protected]>

chore: fmt

b83a9fc

Signed-off-by: ayushag <[email protected]>

chore: added ec processor v0

0b7b9c0

Signed-off-by: ayushag <[email protected]>

chore: ec processor handler

8b60760

Signed-off-by: ayushag <[email protected]>

chore: tested agg flow

9f6c506

Signed-off-by: ayushag <[email protected]>

ayushag-nv force-pushed the ayushag/ec-connector-example branch from e15ae2c to 9f6c506 Compare January 6, 2026 05:08

ayushag-nv requested a review from GuanLuo January 6, 2026 05:15

ayushag-nv added 2 commits January 6, 2026 05:30

Merge remote-tracking branch 'refs/remotes/origin/ayushag/ec-connector-example' into ayushag/ec-connector-example

34de22f

chore: merge files

622ef13

Signed-off-by: ayushag <[email protected]>

GuanLuo reviewed Jan 6, 2026

ayushag-nv added 3 commits January 7, 2026 17:36

chore: updated docs

2f8c4c8

Signed-off-by: ayushag <[email protected]>

Merge branch 'main' into ayushag/ec-connector-example

cc37e90

Signed-off-by: Ayush Agarwal <[email protected]>

fix: rebase issues

ddc5b26

Signed-off-by: ayushag <[email protected]>

Merge branch 'main' into ayushag/ec-connector-example

7723a39

ayushag-nv enabled auto-merge (squash) January 7, 2026 18:11

GuanLuo approved these changes Jan 7, 2026

ayushag-nv merged commit 85e0512 into main Jan 7, 2026
28 checks passed

ayushag-nv deleted the ayushag/ec-connector-example branch January 7, 2026 19:27
