[Feature Request] Implement batch inference for multiple prompts in single forward pass #680

@FredyRivera-dev

Description

I would like to request batch inference in LightX2V, so that multiple prompts can be processed in a single forward pass. This would improve performance and make horizontal scaling easier.

Motivation

Currently, LightX2V processes one prompt at a time through the generate() method of LightX2VPipeline. Processing multiple prompts requires running multiple servers in parallel, as shown in post_multi_servers_tv2.py.

Implementing batch inference would provide:

  • Better performance: Reduced overhead by processing multiple prompts in a single forward pass
  • Horizontal scalability: Easier processing of large volumes of requests
  • Resource optimization: Better utilization of GPU memory and compute

Current Behavior

  • LightX2VPipeline.generate() accepts individual parameters: seed, prompt, negative_prompt, etc.
  • Models like WanModel and HunyuanVideo15Model process with batch_size=1
  • The server handles tasks one by one to manage GPU memory effectively

Proposed Solution

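  1. Extend pipeline interface: Modify generate() to accept lists of prompts, or add a separate batch entry point such as the generate_batch() call shown under Possible Implementations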
  2. Batch support in models: Add batch dimension in _infer_cond_uncond() methods
  3. Memory management: Adjust memory handling for larger batches
  4. Maintain compatibility: Preserve current API for individual use
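
To make steps 1, 3 and 4 more concrete, here is a minimal sketch of the input handling they imply. It is not LightX2V code: normalize_prompts, micro_batches and max_batch_size are hypothetical names introduced only for illustration. The idea is that a single string keeps working (compatibility), while a long list is split into chunks small enough to fit in GPU memory.

```python
from typing import Iterator, List, Sequence, Union


def normalize_prompts(prompt: Union[str, Sequence[str]]) -> List[str]:
    """Accept one prompt or a sequence of prompts and always return a list.

    Hypothetical helper: lets a batched entry point keep accepting a plain
    string, so existing single-prompt callers are not broken.
    """
    if isinstance(prompt, str):
        return [prompt]
    prompts = list(prompt)
    if not prompts:
        raise ValueError("at least one prompt is required")
    return prompts


def micro_batches(items: Sequence[str], max_batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most max_batch_size items.

    max_batch_size is an assumed tuning knob; a real value would depend on
    the model, the resolution and the available GPU memory.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be >= 1")
    for start in range(0, len(items), max_batch_size):
        yield list(items[start:start + max_batch_size])
```

Each chunk would then correspond to one forward pass with a batch dimension equal to the chunk length. How _infer_cond_uncond() would consume that batch is left open here.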

Possible Implementations

# Proposed API
pipe.generate_batch(
    prompts=["prompt1", "prompt2", "prompt3"],
    negative_prompts=["neg1", "neg2", "neg3"],
    seeds=[42, 43, 44],
    save_result_paths=["out1.mp4", "out2.mp4", "out3.mp4"]
)
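
As a starting point, the proposed call shape could be prototyped as a thin wrapper that validates the lists and then calls the existing single-prompt path once per entry. This is only a sketch under assumptions: it is a sequential fallback, not a true batched forward pass, and generate_one stands in for any callable that accepts the keyword arguments seed, prompt, negative_prompt and save_result_path (the last keyword name is assumed from the plural form in the proposed API).

```python
from typing import Callable, List, Sequence


def generate_batch(
    generate_one: Callable[..., object],
    prompts: Sequence[str],
    negative_prompts: Sequence[str],
    seeds: Sequence[int],
    save_result_paths: Sequence[str],
) -> List[str]:
    """Run generate_one once per prompt and return the output paths.

    Sequential placeholder for the requested feature: it fixes the call
    shape and the validation rules, so a batched implementation could
    later replace the loop without changing callers.
    """
    n = len(prompts)
    if n == 0:
        raise ValueError("prompts must not be empty")

    # Every per-prompt list must have exactly one entry per prompt.
    fields = {
        "negative_prompts": negative_prompts,
        "seeds": seeds,
        "save_result_paths": save_result_paths,
    }
    for name, values in fields.items():
        if len(values) != n:
            raise ValueError(f"{name} has {len(values)} entries, expected {n}")

    for prompt, negative_prompt, seed, path in zip(
        prompts, negative_prompts, seeds, save_result_paths
    ):
        generate_one(
            seed=seed,
            prompt=prompt,
            negative_prompt=negative_prompt,
            save_result_path=path,
        )
    return list(save_result_paths)
```

If the pipeline's generate() accepts these keywords, it could be passed directly as generate_one. Per-prompt seeds are kept separate on purpose, so that each video stays reproducible regardless of which other prompts share the batch.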

Expected Impact

  • Significant reduction in processing time for multiple videos
  • Better GPU resource utilization
  • Easier high-volume production deployments
