Commit 452795c

docs: refresh model guidance for codex max (#485)
1 parent 7d7a6ec commit 452795c

3 files changed, 39 additions and 29 deletions

docs/cli/user-guides/choosing-your-model.mdx

Lines changed: 27 additions & 19 deletions
@@ -4,20 +4,23 @@ description: Balance accuracy, speed, and cost by picking the right model and re
 keywords: ['model', 'models', 'llm', 'claude', 'sonnet', 'opus', 'haiku', 'gpt', 'openai', 'anthropic', 'choose model', 'switch model']
 ---
 
-Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Wednesday, October 23rd 2025.
+Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Thursday, December 4th 2025.
 
 ---
 
-## 1 · Current stack rank (October 2025)
+## 1 · Current stack rank (December 2025)
 
-| Rank | Model | Why we reach for it |
-| ---- | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- |
-| 1 | **Claude Sonnet 4.5** | Recommended daily driver. Excellent balance of quality, speed, and cost for most development tasks. Current CLI default. |
-| 2 | **GPT-5 Codex** | Fast iteration loops with strong coding performance. Great for implementation-heavy work at lower cost than Sonnet. |
-| 3 | **Claude Haiku 4.5** | Fast and cost-effective for routine tasks, quick iterations, and high-volume automation. Best for speed-sensitive workflows. |
-| 4 | **Droid Core (GLM-4.6)** | Open-source model with 0.25× token multiplier. Lightning-fast and budget-friendly for automation, bulk edits, and air-gapped environments. |
-| 5 | **GPT-5** | Strong generalist from OpenAI. Choose when you prefer OpenAI ergonomics or need specific GPT features. |
-| 6 | **Claude Opus 4.1** | Highest capability for extremely complex work. Use when you need maximum reasoning power for critical architecture decisions or tough problems. |
+| Rank | Model | Why we reach for it |
+| ---- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
+| 1 | **Claude Opus 4.5 (default)** | Highest quality-and-safety balance; current CLI default for both TUI and exec. |
+| 2 | **GPT-5.1-Codex-Max** | Fast coding loops with support up to **Extra High** reasoning; great for heavy implementation and debugging. |
+| 3 | **Claude Sonnet 4.5** | Strong daily driver with balanced cost/quality; great general-purpose choice when you don’t need Opus-level depth. |
+| 4 | **GPT-5.1-Codex** | Quick iteration with solid code quality at lower cost; bump reasoning when you need more depth. |
+| 5 | **GPT-5.1** | Good generalist, especially when you want OpenAI ergonomics with flexible reasoning effort. |
+| 6 | **Claude Haiku 4.5** | Fast, cost-efficient for routine tasks and high-volume automation. |
+| 7 | **Gemini 3 Pro** | Strong at mixed reasoning with Low/High settings; helpful for researchy flows with structured outputs. |
+| 8 | **Claude Opus 4.1** | Highest raw capability for extremely complex work; choose when you need maximum reasoning power despite higher cost. |
+| 9 | **Droid Core (GLM-4.6)** | Open-source, 0.25× multiplier, great for bulk automation or air-gapped environments; note: no image support. |
 
 <Note>
 We ship model updates regularly. When a new release overtakes the list above,
@@ -30,11 +33,11 @@ Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shi
 
 | Scenario | Recommended model |
 | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
-| **Deep planning, architecture reviews, ambiguous product specs** | Start with **Sonnet 4.5** for strong reasoning at practical cost. Use **GPT-5 Codex** for faster iteration or **Haiku 4.5** for lighter tasks. |
-| **Full-feature development, large refactors** | **Sonnet 4.5** is the recommended daily driver. Try **GPT-5 Codex** when you want faster loops or **Droid Core** for high-volume work. |
-| **Repeatable edits, summarization, boilerplate generation** | **Haiku 4.5** or **Droid Core** for speed and cost savings. **GPT-5** or **Sonnet 4.5** when you need higher quality. |
-| **CI/CD or automation loops** | Favor **Haiku 4.5** or **Droid Core** for predictable throughput at low cost. Use **Sonnet 4.5** or **Codex** for complex automation. |
-| **High-volume automation, frequent quick turns** | **Haiku 4.5** for speedy feedback loops. **Droid Core** when cost is critical or you need air-gapped deployment. |
+| **Deep planning, architecture reviews, ambiguous product specs** | Start with **Opus 4.5 (default)** for depth and safety. Use **Sonnet 4.5** when you want balanced cost/quality, or **Codex/Codex-Max** for faster iteration with reasoning. |
+| **Full-feature development, large refactors** | **Opus 4.5** for default depth and safety. **GPT-5.1-Codex-Max** when you need speed plus **Extra High** reasoning; **Sonnet 4.5** for balanced loops. |
+| **Repeatable edits, summarization, boilerplate generation** | **Haiku 4.5** or **Droid Core** for speed and cost. **GPT-5.1 / GPT-5.1-Codex** when you need higher quality or structured outputs. |
+| **CI/CD or automation loops** | Favor **Haiku 4.5** or **Droid Core** for predictable, low-cost throughput. Use **Codex** or **Codex-Max** when automation needs stronger reasoning. |
+| **High-volume automation, frequent quick turns** | **Haiku 4.5** for speedy feedback. **Droid Core** when cost is critical or you need air-gapped deployment. |
 
 <Tip>
 **Claude Opus 4.1** remains available for extremely complex architecture decisions or critical work where you need maximum reasoning capability. Most tasks don't require Opus-level power—start with Sonnet 4.5 and escalate only if needed.
@@ -47,17 +50,22 @@ Tip: you can swap models mid-session with `/model` or by toggling in the setting
 ## 3 · Switching models mid-session
 
 - Use `/model` (or **Shift+Tab → Settings → Model**) to swap without losing your chat history.
-- If you change providers (e.g. Anthropc to OpenAI), the CLI converts the session transcript between Anthropic and OpenAI formats. The translation is lossy—provider-specific metadata is dropped—but we have not seen accuracy regressions in practice.
+- If you change providers (e.g. Anthropic to OpenAI), the CLI converts the session transcript between Anthropic and OpenAI formats. The translation is lossy—provider-specific metadata is dropped—but we have not seen accuracy regressions in practice.
 - For the best context continuity, switch models at natural milestones: after a commit, once a PR lands, or when you abandon a failed approach and reset the plan.
 - If you flip back and forth rapidly, expect the assistant to spend a turn re-grounding itself; consider summarizing recent progress when you switch.
 
 ---
 
 ## 4 · Reasoning effort settings
 
-- Anthropic models (Opus/Sonnet/Haiku) show modest gains between Low and High.
-- GPT models respond much more to higher reasoning effort—bumping **GPT-5** or **GPT-5 Codex** to **High** can materially improve planning and debugging.
-- Reasoning effort increases latency and cost, so start Low for simple work and escalate when you need more depth.
+- **Opus / Sonnet / Haiku**: Off / Low / Medium / High (default: Off)
+- **GPT-5.1**: None / Low / Medium / High (default: None)
+- **GPT-5.1-Codex**: Low / Medium / High (default: Medium)
+- **GPT-5.1-Codex-Max**: Low / Medium / High / **Extra High** (default: Medium)
+- **Gemini 3 Pro**: Low / High (default: High)
+- **Droid Core (GLM-4.6)**: None only (default: None; no image support)
+
+Reasoning effort increases latency and cost—start low for simple work and escalate as needed. **Extra High** is only available on GPT-5.1-Codex-Max.
 
 <Tip>
 Change reasoning effort from `/model`**Reasoning effort**, or via the
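
The "start low and escalate as needed" guidance in the new section 4 can be pictured as a small lookup. The following is only an illustrative sketch: the levels are transcribed from the added bullet list, while `REASONING_LEVELS` and `escalate` are hypothetical names and not part of the CLI. Mapping "Opus / Sonnet / Haiku" onto the specific Claude model names is also an assumption based on the models listed elsewhere in this commit.

```python
# Levels per model, ordered lowest to highest, transcribed from the bullet
# list added in this diff. The names below are hypothetical, not a CLI API.
REASONING_LEVELS = {
    "Claude Opus 4.5": ["Off", "Low", "Medium", "High"],
    "Claude Opus 4.1": ["Off", "Low", "Medium", "High"],
    "Claude Sonnet 4.5": ["Off", "Low", "Medium", "High"],
    "Claude Haiku 4.5": ["Off", "Low", "Medium", "High"],
    "GPT-5.1": ["None", "Low", "Medium", "High"],
    "GPT-5.1-Codex": ["Low", "Medium", "High"],
    "GPT-5.1-Codex-Max": ["Low", "Medium", "High", "Extra High"],
    "Gemini 3 Pro": ["Low", "High"],
    "Droid Core (GLM-4.6)": ["None"],
}


def escalate(model: str, current: str) -> str:
    """Return the next level up for `model`, or `current` if already at the top."""
    levels = REASONING_LEVELS[model]
    position = levels.index(current)  # raises ValueError for an unsupported level
    return levels[min(position + 1, len(levels) - 1)]
```

Under these assumptions, `escalate("GPT-5.1-Codex", "High")` stays at `"High"`, while the same call for `"GPT-5.1-Codex-Max"` moves to `"Extra High"`, which matches the note that Extra High exists only on Codex-Max.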

docs/pricing.mdx

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ Different models have different multipliers applied to calculate Standard Token
 | Claude Haiku 4.5 | `claude-haiku-4-5-20251001` | 0.4× |
 | GPT-5.1 | `gpt-5.1` | 0.5× |
 | GPT-5.1-Codex | `gpt-5.1-codex` | 0.5× |
+| GPT-5.1-Codex-Max | `gpt-5.1-codex-max` | 0.5× |
 | Gemini 3 Pro | `gemini-3-pro-preview` | 0.8× |
 | Claude Sonnet 4.5 | `claude-sonnet-4-5-20250929` | 1.2× |
 | Claude Opus 4.5 | `claude-opus-4-5-20251101` | 1.2× |
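
To make the effect of the new row concrete, here is a rough sketch of how a multiplier might be applied. It assumes that Standard Tokens are simply raw tokens times the model's multiplier. The hunk header is truncated, so that formula is an assumption and not something this diff confirms. Only the rows visible in this hunk are included, and `MULTIPLIERS` and `standard_tokens` are hypothetical names.

```python
from decimal import Decimal

# Multipliers for the rows visible in this hunk only; the full pricing table
# has more entries. Decimal avoids binary floating-point rounding on 0.4, 1.2.
MULTIPLIERS = {
    "claude-haiku-4-5-20251001": Decimal("0.4"),
    "gpt-5.1": Decimal("0.5"),
    "gpt-5.1-codex": Decimal("0.5"),
    "gpt-5.1-codex-max": Decimal("0.5"),  # row added by this commit
    "gemini-3-pro-preview": Decimal("0.8"),
    "claude-sonnet-4-5-20250929": Decimal("1.2"),
    "claude-opus-4-5-20251101": Decimal("1.2"),
}


def standard_tokens(model_id: str, raw_tokens: int) -> Decimal:
    """Assumed formula: raw token count scaled by the model's multiplier."""
    return MULTIPLIERS[model_id] * raw_tokens
```

If that assumption holds, 1,000 raw tokens on `gpt-5.1-codex-max` would count as 500 Standard Tokens, the same as the other GPT-5.1 variants.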

docs/reference/cli-reference.mdx

Lines changed: 11 additions & 10 deletions
@@ -97,16 +97,17 @@ droid exec --auto high "Run tests, commit, and push changes"
 
 ## Available models
 
-| Model ID | Name | Reasoning support | Default reasoning |
-| :---------------------------- | :---------------------- | :---------------- | :---------------- |
-| `claude-opus-4-5-20251101` | Claude Opus 4.5 (default) | Yes | off |
-| `gpt-5.1-codex` | GPT-5.1 Codex | Yes | medium |
-| `gpt-5.1` | GPT-5.1 | Yes | none |
-| `claude-sonnet-4-5-20250929` | Claude Sonnet 4.5 | Yes | off |
-| `claude-opus-4-1-20250805` | Claude Opus 4.1 | Yes | off |
-| `claude-haiku-4-5-20251001` | Claude Haiku 4.5 | Yes | off |
-| `gemini-3-pro-preview` | Gemini 3 Pro | Yes | high |
-| `glm-4.6` | Droid Core (GLM-4.6) | No | none |
+| Model ID | Name | Reasoning support | Default reasoning |
+| :---------------------------- | :--------------------------- | :-------------------------------- | :---------------- |
+| `claude-opus-4-5-20251101` | Claude Opus 4.5 (default) | Yes (Off/Low/Medium/High) | off |
+| `gpt-5.1-codex-max` | GPT-5.1-Codex-Max | Yes (Low/Medium/High/Extra High) | medium |
+| `gpt-5.1-codex` | GPT-5.1-Codex | Yes (Low/Medium/High) | medium |
+| `gpt-5.1` | GPT-5.1 | Yes (None/Low/Medium/High) | none |
+| `claude-sonnet-4-5-20250929` | Claude Sonnet 4.5 | Yes (Off/Low/Medium/High) | off |
+| `claude-opus-4-1-20250805` | Claude Opus 4.1 | Yes (Off/Low/Medium/High) | off |
+| `claude-haiku-4-5-20251001` | Claude Haiku 4.5 | Yes (Off/Low/Medium/High) | off |
+| `gemini-3-pro-preview` | Gemini 3 Pro | Yes (Low/High) | high |
+| `glm-4.6` | Droid Core (GLM-4.6) | None only | none |
 
 Custom models configured via [BYOK](/cli/configuration/byok) use the format: `custom:<alias>`
 
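
The updated table now carries enough detail to check a requested reasoning level before a run. The sketch below shows one way that check might look. The model IDs and defaults come from the table, but `MODELS` and `resolve_reasoning` are hypothetical names, and the lowercase string `"extra high"` is an assumption because the table gives only the display label. Custom `custom:<alias>` models are not covered, since the diff does not say which levels they accept.

```python
from typing import Optional

# (supported levels, default) per model ID, transcribed from the new table.
# Level strings are the table's labels lowercased (an assumption for "extra high").
_CLAUDE_LEVELS = ("off", "low", "medium", "high")
MODELS = {
    "claude-opus-4-5-20251101": (_CLAUDE_LEVELS, "off"),
    "gpt-5.1-codex-max": (("low", "medium", "high", "extra high"), "medium"),
    "gpt-5.1-codex": (("low", "medium", "high"), "medium"),
    "gpt-5.1": (("none", "low", "medium", "high"), "none"),
    "claude-sonnet-4-5-20250929": (_CLAUDE_LEVELS, "off"),
    "claude-opus-4-1-20250805": (_CLAUDE_LEVELS, "off"),
    "claude-haiku-4-5-20251001": (_CLAUDE_LEVELS, "off"),
    "gemini-3-pro-preview": (("low", "high"), "high"),
    "glm-4.6": (("none",), "none"),
}


def resolve_reasoning(model_id: str, requested: Optional[str] = None) -> str:
    """Return the level to use: the table default, or a validated request."""
    if model_id not in MODELS:
        raise KeyError(f"unknown model ID: {model_id}")
    levels, default = MODELS[model_id]
    if requested is None:
        return default
    level = requested.strip().lower()
    if level not in levels:
        raise ValueError(f"{model_id} supports {', '.join(levels)}; got {requested!r}")
    return level
```

With this table, asking for `"Extra High"` succeeds on `gpt-5.1-codex-max` and raises `ValueError` on `gpt-5.1-codex`, which mirrors the distinction the commit adds to the reference.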
