@@ -94,6 +94,7 @@ This document tracks all model weights available in the `/model-weights` directo
 | Model | Configuration |
 | :------| :-------------|
 | `Llama-4-Scout-17B-16E-Instruct` | ❌ |
+| `Llama-4-Maverick-17B-128E-Instruct` | ❌ |
 
 ### Mistral AI: Mistral
 | Model | Configuration |
@@ -128,6 +129,7 @@ This document tracks all model weights available in the `/model-weights` directo
 | :------| :-------------|
 | `Qwen2.5-0.5B-Instruct` | ✅ |
 | `Qwen2.5-1.5B-Instruct` | ✅ |
+| `Qwen2.5-3B` | ❌ |
 | `Qwen2.5-3B-Instruct` | ✅ |
 | `Qwen2.5-7B-Instruct` | ✅ |
 | `Qwen2.5-14B-Instruct` | ✅ |
@@ -138,12 +140,14 @@ This document tracks all model weights available in the `/model-weights` directo
 | Model | Configuration |
 | :------| :-------------|
 | `Qwen2.5-Math-1.5B-Instruct` | ✅ |
+| `Qwen2.5-Math-7B` | ❌ |
 | `Qwen2.5-Math-7B-Instruct` | ✅ |
 | `Qwen2.5-Math-72B-Instruct` | ✅ |
 
 ### Qwen: Qwen2.5-Coder
 | Model | Configuration |
 | :------| :-------------|
+| `Qwen2.5-Coder-3B-Instruct` | ✅ |
 | `Qwen2.5-Coder-7B-Instruct` | ✅ |
 
 ### Qwen: QwQ
@@ -162,6 +166,12 @@ This document tracks all model weights available in the `/model-weights` directo
 | `Qwen2-Math-72B-Instruct` | ❌ |
 | `Qwen2-VL-7B-Instruct` | ❌ |
 
+### Qwen: Qwen2.5-VL
+| Model | Configuration |
+| :------| :-------------|
+| `Qwen2.5-VL-3B-Instruct` | ❌ |
+| `Qwen2.5-VL-7B-Instruct` | ✅ |
+
 ### Qwen: Qwen3
 | Model | Configuration |
 | :------| :-------------|
@@ -191,27 +201,76 @@ This document tracks all model weights available in the `/model-weights` directo
 | Model | Configuration |
 | :------| :-------------|
 | `gpt-oss-120b` | ✅ |
+| `gpt-oss-20b` | ✅ |
 
-### Other LLM Models
+
+#### AI21: Jamba
 | Model | Configuration |
 | :------| :-------------|
 | `AI21-Jamba-1.5-Mini` | ❌ |
-| `aya-expanse-32b` | ✅ (as Aya-Expanse-32B) |
+
+#### Cohere for AI: Aya
+| Model | Configuration |
+| :------| :-------------|
+| `aya-expanse-32b` | ✅ |
+
+#### OpenAI: GPT-2
+| Model | Configuration |
+| :------| :-------------|
 | `gpt2-large` | ❌ |
 | `gpt2-xl` | ❌ |
-| `gpt-oss-120b` | ❌ |
-| `instructblip-vicuna-7b` | ❌ |
+
+#### InternLM: InternLM2
+| Model | Configuration |
+| :------| :-------------|
 | `internlm2-math-plus-7b` | ❌ |
+
+#### Janus
+| Model | Configuration |
+| :------| :-------------|
 | `Janus-Pro-7B` | ❌ |
+
+#### Moonshot AI: Kimi
+| Model | Configuration |
+| :------| :-------------|
 | `Kimi-K2-Instruct` | ❌ |
+
+#### Mistral AI: Ministral
+| Model | Configuration |
+| :------| :-------------|
 | `Ministral-8B-Instruct-2410` | ❌ |
-| `Molmo-7B-D-0924` | ✅ |
+
+#### AI2: OLMo
+| Model | Configuration |
+| :------| :-------------|
 | `OLMo-1B-hf` | ❌ |
 | `OLMo-7B-hf` | ❌ |
 | `OLMo-7B-SFT` | ❌ |
+
+#### EleutherAI: Pythia
+| Model | Configuration |
+| :------| :-------------|
 | `pythia` | ❌ |
+
+#### Qwen: Qwen1.5
+| Model | Configuration |
+| :------| :-------------|
 | `Qwen1.5-72B-Chat` | ❌ |
+
+#### ReasonFlux
+| Model | Configuration |
+| :------| :-------------|
 | `ReasonFlux-PRM-7B` | ❌ |
+
+#### LMSYS: Vicuna
+| Model | Configuration |
+| :------| :-------------|
+| `vicuna-13b-v1.5` | ❌ |
+
+#### Google: T5 (Encoder-Decoder Models)
+**Note**: These are encoder-decoder (T5) models, not decoder-only LLMs.
+| Model | Configuration |
+| :------| :-------------|
 | `t5-large-lm-adapt` | ❌ |
 | `t5-xl-lm-adapt` | ❌ |
 | `mt5-xl-lm-adapt` | ❌ |
@@ -238,10 +297,10 @@ This document tracks all model weights available in the `/model-weights` directo
 ### Meta: Llama 3.2 Vision
 | Model | Configuration |
 | :------| :-------------|
-| `Llama-3.2-11B-Vision` | ✅ |
-| `Llama-3.2-11B-Vision-Instruct` | ✅ |
-| `Llama-3.2-90B-Vision` | ✅ |
-| `Llama-3.2-90B-Vision-Instruct` | ✅ |
+| `Llama-3.2-11B-Vision` | ❌ |
+| `Llama-3.2-11B-Vision-Instruct` | ✅ (SGLang only) |
+| `Llama-3.2-90B-Vision` | ❌ |
+| `Llama-3.2-90B-Vision-Instruct` | ✅ (SGLang only) |
 
 ### Mistral: Pixtral
 | Model | Configuration |
@@ -266,10 +325,19 @@ This document tracks all model weights available in the `/model-weights` directo
 | `deepseek-vl2` | ✅ |
 | `deepseek-vl2-small` | ✅ |
 
+### Google: MedGemma
+| Model | Configuration |
+| :------| :-------------|
+| `medgemma-4b-it` | ✅ |
+| `medgemma-27b-it` | ✅ |
+| `medgemma-27b-text-it` | ❌ |
+
 ### Other VLM Models
 | Model | Configuration |
 | :------| :-------------|
+| `instructblip-vicuna-7b` | ❌ |
 | `MiniCPM-Llama3-V-2_5` | ❌ |
+| `Molmo-7B-D-0924` | ✅ |
 
 ---
 
@@ -298,6 +366,8 @@ This document tracks all model weights available in the `/model-weights` directo
 | `data2vec` | ❌ |
 | `gte-modernbert-base` | ❌ |
 | `gte-Qwen2-7B-instruct` | ❌ |
+| `KaLM-Embedding-Gemma3-12B-2511` | ❌ |
+| `llama-embed-nemotron-8b` | ❌ |
 | `m2-bert-80M-32k-retrieval` | ❌ |
 | `m2-bert-80M-8k-retrieval` | ❌ |
@@ -313,7 +383,7 @@ This document tracks all model weights available in the `/model-weights` directo
 
 ---
 
-## Multimodal Models
+## Vision Models
 
 ### CLIP
 | Model | Configuration |