Commit d83e760

Sync up the cookbook model lineup with the one from the docs (and add Kimi K2.5) (#371)
1 parent: 281cc06


docs/model-lineup.mdx: 28 additions, 25 deletions
```diff
@@ -7,45 +7,48 @@ The table below shows the models that are currently available in Tinker. We plan
 - In general, use MoE models, which are more cost effective than the dense models.
 - Use Base models only if you're doing research or are running the full post-training pipeline yourself
 - If you want to create a model that is good at a specific task or domain, use an existing post-trained model, and fine-tune it on your own data or environment.
-- If you care about latency, use one of the Instruction models, which will start outputting tokens without a chain-of-thought.
-- If you care about intelligence and robustness, use one of the Hybrid or Reasoning models, which can use long chain-of-thought.
+- If you care about latency, use one of the Instruction models, which will start outputting tokens without a chain-of-thought.
+- If you care about intelligence and robustness, use one of the Hybrid or Reasoning models, which can use long chain-of-thought.

 ## Full Listing

-| Model Name                                                                                       | Training Type | Architecture | Size      |
-| ------------------------------------------------------------------------------------------------ | ------------- | ------------ | --------- |
-| [Qwen/Qwen3-VL-235B-A22B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct)      | Vision        | MoE          | Large     |
-| [Qwen/Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct)          | Vision        | MoE          | Medium    |
-| [Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)  | Instruction   | MoE          | Large     |
-| [Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)      | Instruction   | MoE          | Medium    |
-| [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)                                  | Hybrid        | MoE          | Medium    |
-| [Qwen/Qwen3-30B-A3B-Base](https://huggingface.co/Qwen/Qwen3-30B-A3B-Base)                        | Base          | MoE          | Medium    |
-| [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B)                                          | Hybrid        | Dense        | Medium    |
-| [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)                                            | Hybrid        | Dense        | Small     |
-| [Qwen/Qwen3-8B-Base](https://huggingface.co/Qwen/Qwen3-8B-Base)                                  | Base          | Dense        | Small     |
-| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)                | Instruction   | Dense        | Compact   |
-| [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b)                                | Reasoning     | MoE          | Medium    |
-| [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b)                                  | Reasoning     | MoE          | Small     |
-| [deepseek-ai/DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1)                    | Hybrid        | MoE          | Large     |
-| [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base)          | Base          | MoE          | Large     |
-| [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B)                      | Base          | Dense        | Large     |
-| [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)    | Instruction   | Dense        | Large     |
-| [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)                        | Base          | Dense        | Small     |
-| [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)      | Instruction   | Dense        | Small     |
-| [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)                        | Base          | Dense        | Compact   |
-| [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)                        | Base          | Dense        | Compact   |
-| [moonshotai/Kimi-K2-Thinking](https://huggingface.co/moonshotai/Kimi-K2-Thinking)                | Reasoning     | MoE          | Large     |
+| Model Name                                                                                       | Training Type | Architecture | Size    |
+| ------------------------------------------------------------------------------------------------ | ------------- | ------------ | ------- |
+| [Qwen/Qwen3-VL-235B-A22B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct)      | Vision        | MoE          | Large   |
+| [Qwen/Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct)          | Vision        | MoE          | Medium  |
+| [Qwen/Qwen3-235B-A22B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507)  | Instruction   | MoE          | Large   |
+| [Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)      | Instruction   | MoE          | Medium  |
+| [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)                                  | Hybrid        | MoE          | Medium  |
+| [Qwen/Qwen3-30B-A3B-Base](https://huggingface.co/Qwen/Qwen3-30B-A3B-Base)                        | Base          | MoE          | Medium  |
+| [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B)                                          | Hybrid        | Dense        | Medium  |
+| [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)                                            | Hybrid        | Dense        | Small   |
+| [Qwen/Qwen3-8B-Base](https://huggingface.co/Qwen/Qwen3-8B-Base)                                  | Base          | Dense        | Small   |
+| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)                | Instruction   | Dense        | Compact |
+| [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b)                                | Reasoning     | MoE          | Medium  |
+| [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b)                                  | Reasoning     | MoE          | Small   |
+| [deepseek-ai/DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1)                    | Hybrid        | MoE          | Large   |
+| [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base)          | Base          | MoE          | Large   |
+| [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B)                      | Base          | Dense        | Large   |
+| [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)    | Instruction   | Dense        | Large   |
+| [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)                        | Base          | Dense        | Small   |
+| [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)      | Instruction   | Dense        | Small   |
+| [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)                        | Base          | Dense        | Compact |
+| [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)                        | Base          | Dense        | Compact |
+| [moonshotai/Kimi-K2-Thinking](https://huggingface.co/moonshotai/Kimi-K2-Thinking)                | Reasoning     | MoE          | Large   |
+| [moonshotai/Kimi-K2.5](https://huggingface.co/moonshotai/Kimi-K2.5)                              | Reasoning + Vision | MoE     | Large   |

 ## Legend

 ### Training Types
+
 - **Base**: Foundation models trained on raw text data, suitable for post-training research and custom fine-tuning.
 - **Instruction**: Models fine-tuned for following instructions and chat, optimized for fast inference.
 - **Reasoning**: Models that always use chain-of-thought reasoning before their "visible" output that responds to the prompt.
 - **Hybrid**: Models that can operate in both thinking and non-thinking modes, where the non-thinking mode requires using a special renderer or argument that disables chain-of-thought.
 - **Vision**: Vision-language models (VLMs) that can process images alongside text. See [Vision Inputs](/rendering#vision-inputs) for usage.

 ### Architecture
+
 - **Dense**: Standard transformer architecture with all parameters active
 - **MoE**: Mixture of Experts architecture with sparse activation

```
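The "Model Name" column is the identifier you pass when creating a training client, so the selection guidance at the top of the diff maps directly onto code. A minimal sketch, assuming the `tinker` Python client and its `ServiceClient.create_lora_training_client` entry point; the model choice and LoRA rank are illustrative, not recommendations:

```python
# Minimal sketch: fine-tune a post-trained model from the lineup with Tinker.
# Assumes the `tinker` Python client; model and rank are illustrative choices.
import tinker

service_client = tinker.ServiceClient()

# Use a name exactly as it appears in the "Model Name" column. Per the
# guidance bullets, an Instruction-tuned MoE model is a cost-effective default.
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-30B-A3B-Instruct-2507",
    rank=32,  # LoRA rank; a common starting point, not a tuned value
)
```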

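Per the **Hybrid** entry in the legend, non-thinking mode has to be requested explicitly. For the Qwen3 hybrids in the table, the Hugging Face chat template exposes an `enable_thinking` flag; a sketch assuming the `transformers` tokenizer API (other Hybrid models, such as DeepSeek-V3.1, gate thinking through their own templates):

```python
# Sketch: render a prompt for a Hybrid model (Qwen3) in non-thinking mode.
# Assumes the Hugging Face `transformers` chat-template API; Qwen3's template
# accepts an `enable_thinking` flag. Other Hybrid models may differ.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [{"role": "user", "content": "What is the capital of France?"}]

prompt = tok.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # skip the chain-of-thought block for lower latency
)
print(prompt)
```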
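The Dense/MoE distinction in the Architecture legend comes down to how many parameters participate per token. A toy top-k routing sketch to make "sparse activation" concrete; this is schematic and not the architecture of any listed model:

```python
# Toy illustration of MoE sparse activation: each token is routed to only
# k of the available expert MLPs, so most parameters sit idle per token.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    logits = x @ router                    # one routing score per expert
    chosen = np.argsort(logits)[-top_k:]   # indices of the top-k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only the chosen experts run; the rest contribute nothing for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # -> (8,)
```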