Commit 297a8c0
Add Mistral3 multimodal support with Pixtral vision encoder
This adds support for Mistral3 multimodal models (vision + text):
- `Bumblebee.Vision.Pixtral`: Pixtral vision encoder with RoPE support
- `Bumblebee.Text.Mistral3`: Mistral3 text decoder with interleaved attention
- `Bumblebee.Multimodal.Mistral3`: Vision-language model combining Pixtral
  and Mistral3 with a multimodal projector for image-conditioned generation
Supported architectures:
- PixtralVisionModel
- Mistral3Model, Mistral3ForCausalLM, Mistral3ForSequenceClassification
- Mistral3ForConditionalGeneration (multimodal)
1 parent bae534a commit 297a8c0
File tree
7 files changed, +1622 -0 lines changed
- lib
  - bumblebee
    - multimodal
    - text
    - vision
- test/bumblebee
  - multimodal
  - text
  - vision
| |||
0 commit comments