[RFC]: vLLM-Omni Multi-Stage CFG Support #1419

@princepride

Motivation.

Currently, in the multi-stage pipeline, Stage-0 (autoregressive) sends only a single conditional KV cache to Stage-1 (diffusion). The BagelPipeline in Stage-1 detects that the multi-branch context is missing and forcibly disables CFG by setting the scales to 1.0. The resulting images are lower quality than those from the standalone DiT pipeline, which uses a 3-branch CFG approach.
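To make concrete why a scale of 1.0 amounts to disabling CFG, here is a minimal sketch using the standard guidance formula `uncond + scale * (cond - uncond)`. The branch names (`text_uncond`, `img_uncond`), the nesting order, and the function names are assumptions for illustration; the actual BagelPipeline combination may differ.

```python
from typing import List


def cfg_combine(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def three_branch_cfg(
    cond: List[float],
    text_uncond: List[float],
    img_uncond: List[float],
    text_scale: float,
    img_scale: float,
) -> List[float]:
    # Hypothetical nesting order: guide against the text-free branch first,
    # then against the image-free branch. The real pipeline may differ.
    text_guided = cfg_combine(cond, text_uncond, text_scale)
    return cfg_combine(text_guided, img_uncond, img_scale)


# Toy per-branch predictions (made-up numbers, not real model outputs).
cond = [1.0, 2.0]
text_uncond = [0.5, 1.0]
img_uncond = [0.0, 0.5]

# With both scales at 1.0 the extra branches cancel out and only the
# conditional prediction remains, which is what "CFG disabled" means here.
disabled = three_branch_cfg(cond, text_uncond, img_uncond, 1.0, 1.0)

# With a text scale above 1.0 the output is pushed away from the text-free branch.
guided = three_branch_cfg(cond, text_uncond, img_uncond, 2.0, 1.0)
```

In this sketch the extra branches only influence the result when a scale differs from 1.0, which is why Stage-1 needs the additional KV caches before it can apply guidance.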

To resolve this without polluting the core vLLM-Omni framework (orchestrator, stage workers, and KV transfer managers) with model-specific CFG logic, this feature introduces generic hooks. Decoupling CFG prompt generation and KV cache reception from the framework lets the current model (Bagel) and future models adopt multi-branch inference without changes to the underlying multi-stage orchestration logic.
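One way the framework could stay generic is to resolve each hook from a dotted path at startup and treat a missing path as "no CFG expansion". The sketch below assumes that approach; `resolve_hook` is a hypothetical helper name, not an existing vLLM-Omni function.

```python
import importlib
from typing import Callable, Optional


def resolve_hook(dotted_path: Optional[str]) -> Optional[Callable]:
    """Import 'pkg.module.func' and return the callable, or None if unset."""
    if not dotted_path:
        # No hook configured: the framework keeps its single-branch behaviour.
        return None
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {dotted_path!r}")
    hook = getattr(importlib.import_module(module_name), attr)
    if not callable(hook):
        raise TypeError(f"{dotted_path} is not callable")
    return hook


# Demonstrated with a standard-library function, since the model-specific
# module path is elided in this RFC.
sqrt = resolve_hook("math.sqrt")
```

With this shape, the framework only ever sees an optional callable, so model-specific logic stays in the model's own module.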

Architecture

flowchart TD
    subgraph framework [Framework - Generic]
        Orch[Orchestrator]
        KVMgr[KV Transfer Manager]
    end
    subgraph modelSpecific [Model-Specific - bagel.py]
        ExpandFn["expand_cfg_prompts()"]
        CollectFn["collect_cfg_kv_caches()"]
    end
    subgraph yaml [YAML Config]
        YamlCfg["prompt_expand_func: ...bagel.expand_cfg_prompts\ncfg_kv_collect_func: ...bagel.collect_cfg_kv_caches"]
    end
    YamlCfg -->|"loaded by"| Orch
    Orch -->|"calls"| ExpandFn
    ExpandFn -->|"returns expanded prompts"| Orch
    Orch -->|"submits all prompts"| Stage0[Stage-0 AR]
    Stage0 -->|"KV caches via SHM"| KVMgr
    KVMgr -->|"calls"| CollectFn
    CollectFn -->|"returns organized KVs"| Stage1[Stage-1 DiT / BagelPipeline]
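
The flow in the diagram can be sketched end to end as follows. This is only an illustration under stated assumptions: the RFC names `expand_cfg_prompts()` and `collect_cfg_kv_caches()` but does not give their signatures, so the arguments, the `__cfg_text` / `__cfg_img` request-id suffixes, the branch names, and the `run_two_stage` driver are all hypothetical. Stage-0 and the SHM transfer are replaced by an in-memory dictionary.

```python
from typing import Any, Callable, Dict, List, Optional

Prompt = Dict[str, Any]


def expand_cfg_prompts(prompt: Prompt) -> List[Prompt]:
    """Hypothetical hook: return the conditional prompt plus one companion per extra branch."""
    rid = prompt["request_id"]
    return [
        {**prompt, "branch": "cond"},
        {**prompt, "request_id": f"{rid}__cfg_text", "text": "", "branch": "cfg_text"},
        {**prompt, "request_id": f"{rid}__cfg_img", "image": None, "branch": "cfg_img"},
    ]


def collect_cfg_kv_caches(request_id: str, fetch_kv: Callable[[str], Any]) -> Dict[str, Any]:
    """Hypothetical hook: gather the per-branch KV caches for one parent request."""
    return {
        "cond": fetch_kv(request_id),
        "cfg_text": fetch_kv(f"{request_id}__cfg_text"),
        "cfg_img": fetch_kv(f"{request_id}__cfg_img"),
    }


def run_two_stage(
    prompt: Prompt,
    expand_fn: Optional[Callable[[Prompt], List[Prompt]]] = None,
    collect_fn: Optional[Callable[..., Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    # Orchestrator: expand only when a hook is configured.
    prompts = expand_fn(prompt) if expand_fn else [prompt]
    # Stand-in for Stage-0: one fake KV cache per submitted prompt.
    kv_store = {p["request_id"]: f"kv:{p.get('branch', 'cond')}" for p in prompts}
    rid = prompt["request_id"]
    # KV transfer manager: without a hook, pass the single conditional cache through.
    if collect_fn is None:
        return {"cond": kv_store[rid]}
    return collect_fn(rid, kv_store.__getitem__)


request = {"request_id": "req-0", "text": "a red fox", "image": None}
with_cfg = run_two_stage(request, expand_cfg_prompts, collect_cfg_kv_caches)
without_cfg = run_two_stage(request)
```

Note that `run_two_stage` contains no CFG-specific logic: omitting both hooks reproduces the current single-cache behaviour, while supplying them yields one cache per branch for Stage-1.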

Proposed Change.

RFC.md

Feedback Period.

No response

CC List.

@hsliuustc0106 @ZJY0516 @natureofnature @nussejzz

Any Other Things.

No response

Labels

enhancement (New feature or request), help wanted (Extra attention is needed), high priority (high priority issue, needs to be done asap)
