
Feature Request: BailingMoeV2 Support (Ling Lite 2.0) #15968

@fizzAI

Description


Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Add support for BailingMoeV2ForCausalLM models, such as https://huggingface.co/inclusionAI/Ling-mini-2.0 and https://huggingface.co/inclusionAI/Ring-mini-2.0.

Motivation

They are promising small MoE models for lower-end GPU and CPU users.

Possible Implementation

While there isn't an existing llama.cpp implementation, the authors have published manual patches for vLLM and SGLang, as well as a custom Transformers implementation in the HF repos linked above.

It looks like they mostly implemented it as an extension of the original BailingMoE architecture, so we should be able to do the same here.
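For reference, a minimal sketch of what the conversion-side change might look like, assuming llama.cpp's convert_hf_to_gguf.py already has a converter class for the original BailingMoE architecture that the new one can subclass. All class, decorator, and enum names below are illustrative assumptions and would need to match the actual codebase; any V2-specific hyperparameters would come from the model's config.json.

```python
# Hypothetical addition to convert_hf_to_gguf.py (names are assumptions, not the real API).
# The idea mirrors how other "V2" architectures extend their V1 converter class:
# reuse the existing tensor mapping and only override what changed.

@ModelBase.register("BailingMoeV2ForCausalLM")        # HF architecture string from config.json
class BailingMoeV2Model(BailingMoeModel):             # assumed existing V1 converter class
    model_arch = gguf.MODEL_ARCH.BAILINGMOE_V2        # new arch enum that would need to be added

    def set_gguf_parameters(self):
        # Start from the V1 metadata, then write any V2-specific routing/expert fields.
        super().set_gguf_parameters()
        hparams = self.hparams
        # Example: hypothetical expert-group routing parameters, if present in config.json.
        if (n_group := hparams.get("n_group")) is not None:
            self.gguf_writer.add_expert_group_count(n_group)
```

A matching GGUF architecture definition and graph-build path would also be needed on the C++ side, presumably as a variant of the existing BailingMoE graph rather than a from-scratch implementation.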
