support train qwen2.5-vl-32b eagle3 model #437

Merged
jiapingW merged 3 commits into sgl-project:main from gerayking:feature/support_qwen2.5_vl_32b
Jan 20, 2026

@gerayking
Contributor

Motivation

SpecForge currently supports Qwen2.5-VL-7B through a model-specific Transformers integration. I think we should add systematic VL support through SGLang instead.

Modifications

  • Dataset preprocessing: Use transformers to generate pixel_values and image_grid_thw (already available).

  • Request packing (SGLang): Wrap the data into an sglang Request. When unpacking/splitting into per-request chunks, segment pixel_values by offset based on image_grid_thw, and ensure compatibility with mRoPE.

  • Draft model: During forward(), align mRoPE behavior with the main model.

  • Initialize mmCache: Set up the multimodal cache during initialization.
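
The request-packing step above can be sketched as follows. This is a minimal illustration, not the code from this PR. It assumes the layout produced by the Qwen2.5-VL processor in Transformers, where pixel_values is a flattened buffer with one row per patch and each image_grid_thw row (t, h, w) accounts for t * h * w rows. The function name split_pixel_values and the images_per_request argument are hypothetical, and mRoPE handling is not shown.

```python
from typing import Any, List, Sequence, Tuple

Grid = Tuple[int, int, int]  # (t, h, w) patch grid of one image


def split_pixel_values(
    pixel_values: Any,
    image_grid_thw: Sequence[Grid],
    images_per_request: Sequence[int],
) -> List[Tuple[Any, List[Grid]]]:
    """Slice a flattened patch buffer into per-request chunks.

    pixel_values only needs len() and slicing along its first axis,
    so a list of rows, a NumPy array, or a torch tensor all work.
    """
    if sum(images_per_request) != len(image_grid_thw):
        raise ValueError("images_per_request does not cover image_grid_thw")

    chunks = []
    image_index = 0  # position in image_grid_thw
    offset = 0       # position in pixel_values (rows consumed so far)
    for n_images in images_per_request:
        grids = list(image_grid_thw[image_index:image_index + n_images])
        # Each image contributes t * h * w rows to the flattened buffer.
        n_patches = sum(t * h * w for t, h, w in grids)
        chunks.append((pixel_values[offset:offset + n_patches], grids))
        image_index += n_images
        offset += n_patches

    if offset != len(pixel_values):
        raise ValueError("image_grid_thw does not account for every patch row")
    return chunks


# Hypothetical batch: three images packed into two requests.
demo_grids = [(1, 2, 2), (1, 2, 4), (1, 4, 2)]   # 4 + 8 + 8 = 20 patch rows
demo_pixels = [[float(i)] for i in range(20)]    # stand-in for real patch features
demo_chunks = split_pixel_values(demo_pixels, demo_grids, images_per_request=[1, 2])
print([len(rows) for rows, _ in demo_chunks])    # [4, 16]
```

Tracking a running offset, rather than recomputing it per request, keeps the split linear in the number of images, and the final length check catches a mismatch between the grid metadata and the patch buffer early.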

Related Issues

Fixes #403

Accuracy Test

TODO

Benchmark & Profiling

image

Checklist

@gerayking force-pushed the feature/support_qwen2.5_vl_32b branch 7 times, most recently from cdd81c3 to b0034f6 on January 19, 2026 08:27
@gerayking force-pushed the feature/support_qwen2.5_vl_32b branch from b0034f6 to c3528ec on January 19, 2026 16:07
@gerayking force-pushed the feature/support_qwen2.5_vl_32b branch from a5463b3 to a8d9343 on January 20, 2026 08:12
@jiapingW merged commit 9abff47 into sgl-project:main on Jan 20, 2026
2 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature] Support training Qwen2.5-VL-32B eagle model

2 participants