
Conversation

@leonardmq
Collaborator

@leonardmq leonardmq commented Jan 9, 2026

What does this PR do?

Add models and providers:

  • add Cerebras as a provider for GLM 4.7 and GLM 4.6
  • add Minimax M2.1 on OpenRouter and Fireworks
  • add ByteDance Seed series on OpenRouter: Seed 1.6 and Seed 1.6 Flash
  • add Mistral 3 series: Mistral 3 Large 2512, Mistral Small Creative, Ministral 3 (14B, 8B, 3B)

Checklists

  • Tests have been run locally and passed
  • New tests have been added to any work in /lib

Summary by CodeRabbit

Release Notes

  • New Features
    • Added support for 8 new AI models: Mistral 3 Large, Mistral Small Creative, Ministral 3 variants, Minimax M2.1, and ByteDance Seed models
    • Expanded provider availability for existing models through Cerebras and Fireworks AI integration
    • Enhanced structured output and reasoning capabilities across multiple model families


@coderabbitai
Contributor

coderabbitai bot commented Jan 9, 2026

Walkthrough

This PR expands the ML model catalog by adding 8 new model variants to the ModelName enum (Mistral 3 Large 2512, Mistral Small Creative, the Ministral 3 series, Minimax M2.1, and the ByteDance Seed 1.6 variants) and extends provider configurations for existing models with Cerebras GLM support, additional Fireworks AI integration, and OpenRouter/SiliconFlow coverage.

Changes

Cohort: Model Registry Expansion
File(s): libs/core/kiln_ai/adapters/ml_model_list.py
Summary: Added 8 new ModelName enum entries (mistral_3_large_2512, mistral_small_creative, ministral_3_14b_2512, ministral_3_8b_2512, ministral_3_3b_2512, minimax_m2_1, bytedance_seed_1_6, bytedance_seed_1_6_flash) with corresponding KilnModel entries. Extended Cerebras provider coverage for GLM 4.7 and GLM 4.6 models. Added Fireworks AI provider for Minimax M2. Introduced Minimax M2.1 model with OpenRouter and Fireworks AI providers. Added ByteDance Seed OSS 36B, Seed 1.6, and Seed 1.6 Flash models with OpenRouter/SiliconFlow CN provider configurations. Updated friendly_name for ByteDance Seed OSS 36B.
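
For orientation, here is a minimal sketch of the pattern each addition follows in ml_model_list.py: a new ModelName enum member plus a KilnModel entry listing its providers. It is reconstructed from the Mistral Large 3 fields quoted in the review comments below and assumes the module's existing imports and class definitions (the str-based Enum and the built_in_models list name are assumptions); it is not the exact merged code.

class ModelName(str, Enum):
    # ... existing members ...
    mistral_3_large_2512 = "mistral_3_large_2512"

built_in_models = [
    # ... existing models ...
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.mistral_3_large_2512,
        friendly_name="Mistral Large 3 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/mistral-large-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
        ],
    ),
]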

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • sfierro
  • scosman
  • tawnymanticore

Poem

🐰 Hop hop, models bloom and grow,
Mistral, Seed, and Minimax flow,
Cerebras joins the provider parade,
Eight new friends our catalog made! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
  • Description check (❓ Inconclusive): The description covers what the PR does with specific model additions, but it is missing the Related Issues section and the Contributor License Agreement statement required by the template. Resolution: add the Related Issues section (even if empty) and include the CLA confirmation statement to fully comply with the template structure.

✅ Passed checks (2 passed)
  • Title check (✅ Passed): The title accurately summarizes the main change, adding multiple new AI models (Minimax M2.1, ByteDance, and Mistral families), and is concise and clear.
  • Docstring Coverage (✅ Passed): No functions were found in the changed files, so the docstring coverage check was skipped.



@gemini-code-assist

Summary of Changes

Hello @leonardmq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the platform's AI model offerings by integrating several new models and providers. It enhances the system's capabilities by adding support for various Mistral 3 series models, the Minimax M2.1, and ByteDance Seed models, alongside new provider options for existing GLM models. This broadens the range of available language models for users, providing more diverse options for different use cases.

Highlights

  • New Mistral 3 Series Models: Added support for Mistral 3 Large 2512, Mistral Small Creative, and Ministral 3 (14B, 8B, 3B) models, primarily through OpenRouter, with Together AI also supported for Ministral 3 14B 2512.
  • Minimax M2.1 Integration: Integrated the Minimax M2.1 model, available via OpenRouter and Fireworks AI, including specific configurations for structured output and reasoning capabilities.
  • ByteDance Seed Series: Introduced ByteDance Seed 1.6 and Seed 1.6 Flash models, both accessible through OpenRouter.
  • Cerebras Provider for GLM Models: Expanded provider options for GLM 4.7 and GLM 4.6 models by adding Cerebras as a supported provider.



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds support for several new models from providers like Minimax, Bytedance, Mistral, and Cerebras. The changes are mostly additions to the model list and look good overall. I've pointed out a minor naming inconsistency for one of the new Mistral models to ensure clarity and alignment with the provider's naming.

    phi_4_5p6b = "phi_4_5p6b"
    phi_4_mini = "phi_4_mini"
    mistral_large = "mistral_large"
    mistral_3_large_2512 = "mistral_3_large_2512"


Severity: medium

There's an inconsistency in this model's naming. On OpenRouter, the model ID is mistralai/mistral-large-2512 and it's named "Mistral Large 2512", without the "3". For consistency with the provider's naming, it's better to remove the _3_ from the enum member name.

Suggested change
-    mistral_3_large_2512 = "mistral_3_large_2512"
+    mistral_large_2512 = "mistral_large_2512"

Collaborator


disagree with gemini, it is called Mistral Large 3 https://openrouter.ai/mistralai/mistral-large-2512

Comment on lines +2453 to +2454
        name=ModelName.mistral_3_large_2512,
        friendly_name="Mistral Large 3 2512",


Severity: medium

To align with the suggested change in the ModelName enum and the provider's naming ("Mistral Large 2512"), the _3_ and 3 should be removed from name and friendly_name respectively.

Suggested change
-        name=ModelName.mistral_3_large_2512,
-        friendly_name="Mistral Large 3 2512",
+        name=ModelName.mistral_large_2512,
+        friendly_name="Mistral Large 2512",

@github-actions

github-actions bot commented Jan 9, 2026

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/main...HEAD

No lines with coverage information in this diff.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @libs/core/kiln_ai/adapters/ml_model_list.py:
- Around line 2333-2390: Update the OpenRouter Mistral/Ministral model entries
to use the stable structured output mode: for KilnModel instances named
ModelName.ministral_3_14b_2512, ModelName.ministral_3_8b_2512, and
ModelName.ministral_3_3b_2512, change the KilnModelProvider entries that
currently set StructuredOutputMode.json_schema to
StructuredOutputMode.json_instruction_and_object (leave the existing
mistral_small_creative and the Together AI provider for
ModelName.ministral_3_14b_2512 unchanged).
🧹 Nitpick comments (1)
libs/core/kiln_ai/adapters/ml_model_list.py (1)

96-105: Enum additions look fine; keep naming consistent and validate downstream references.
The new ModelName.* entries are straightforward. Minor nit: mistral_small_creative doesn’t need the multiline/parenthesized assignment—consider keeping enum values single-line for consistency.

Also applies to: 219-223

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6431e1 and 609ec63.

📒 Files selected for processing (1)
  • libs/core/kiln_ai/adapters/ml_model_list.py
🧰 Additional context used
📓 Path-based instructions (2)
libs/core/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use Python 3.10+ for the core library (libs/core) and Python 3.13 for the desktop app

Use Python 3.10+ for the core library (libs/core)

Files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation

**/*.py: Use Pydantic v2, not v1
Use asyncio for asynchronous operations in Python

Files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
🧠 Learnings (10)
📓 Common learnings
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/upload_file_dialog.svelte:107-120
Timestamp: 2025-09-10T08:32:18.688Z
Learning: leonardmq prefers to work within the constraints of their SDK codegen for API calls, even when typing is awkward (like casting FormData to match expected types), rather than using alternative approaches like native fetch that would cause compiler errors with their generated types.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/datamodel/chunk.py:138-160
Timestamp: 2025-08-22T11:17:56.862Z
Learning: leonardmq prefers to avoid redundant validation checks when upstream systems already guarantee preconditions are met. He trusts the attachment system to ensure paths are properly formatted and prefers letting the attachment method (resolve_path) handle any edge cases rather than adding defensive precondition checks.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 390
File: libs/core/kiln_ai/adapters/provider_tools.py:494-499
Timestamp: 2025-07-23T08:58:45.769Z
Learning: leonardmq prefers to keep tightly coupled implementation details (like API versions) hardcoded when they are not user-facing and could break other modules if changed. The shared Config class is reserved for user-facing customizable values, not internal implementation details.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 341
File: libs/server/kiln_server/document_api.py:44-51
Timestamp: 2025-06-18T08:22:58.510Z
Learning: leonardmq prefers to defer fixing blocking I/O in async handlers when: the operation is very fast (milliseconds), user-triggered rather than automated, has no concurrency concerns, and would require additional testing to fix properly. He acknowledges such issues as valid but makes pragmatic decisions about timing the fixes.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/datamodel/rag.py:33-35
Timestamp: 2025-09-04T06:45:44.212Z
Learning: leonardmq requires vector_store_config_id to be a mandatory field in RagConfig (similar to extractor_config_id, chunker_config_id, embedding_config_id) for consistency. He prefers fixing dependent code that breaks due to missing required fields rather than making fields optional to accommodate incomplete data.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 654
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:141-152
Timestamp: 2025-09-25T06:38:14.854Z
Learning: leonardmq prefers simple onMount initialization patterns over reactive statements when possible, and is cautious about maintaining internal state for idempotency in Svelte components. He values simplicity and safety in component lifecycle management.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 402
File: libs/core/kiln_ai/adapters/embedding/litellm_embedding_adapter.py:0-0
Timestamp: 2025-07-14T03:43:07.283Z
Learning: leonardmq prefers to keep defensive validation checks even when they're technically redundant, viewing them as useful "quick sanity checks" that provide additional safety nets. He values defensive programming over strict DRY (Don't Repeat Yourself) principles when the redundant code serves as a safeguard.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/lib/api_schema.d.ts:0-0
Timestamp: 2025-09-10T08:27:47.227Z
Learning: leonardmq correctly identifies auto-generated files and prefers not to manually edit them. When suggesting changes to generated TypeScript API schema files, the fix should be made in the source OpenAPI specification in the backend, not in the generated TypeScript file.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:136-150
Timestamp: 2025-09-04T06:34:12.318Z
Learning: leonardmq prefers brute force re-insertion of all chunks when partial chunks exist in the LanceDB vector store, rather than selectively inserting only missing chunks. His reasoning is that documents typically have only dozens of chunks (making overwriting cheap), and partial insertion likely indicates data corruption or conflicts that should be resolved by re-inserting all chunks to ensure consistency.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:98-103
Timestamp: 2025-09-04T07:08:04.248Z
Learning: leonardmq prefers to let TableNotFoundError bubble up in delete_nodes_by_document_id operations rather than catching it, as this error indicates operations are being done in the wrong order on the caller side and should be surfaced as a programming error rather than handled gracefully.
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:19:20.074Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-10-28T05:25:09.283Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 757
File: libs/core/kiln_ai/adapters/reranker_list.py:39-42
Timestamp: 2025-10-28T05:25:09.283Z
Learning: In the Kiln codebase (Python), model family and model name fields in reranker and similar model definition classes (like KilnRerankerModel in libs/core/kiln_ai/adapters/reranker_list.py) should use plain str types rather than Enum types. This is because users receive over-the-air (OTA) updates for new models, and using Enums would break when clients with older builds encounter new model families their Enum definitions don't include. Strings provide forward compatibility for the OTA update mechanism.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:15:20.796Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:2434-2439
Timestamp: 2025-08-08T16:15:20.796Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwen_3_8b_no_thinking with ModelProviderName.siliconflow_cn (model_id "Qwen/Qwen3-8B"), the qwen3_style_no_think formatter is not required: SiliconFlow’s non‑thinking Qwen3‑8B returns clean output without <answer> wrappers. Formatter/parser usage is provider-specific and should be added only when verified by tests/responses.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-09-10T08:06:53.717Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 585
File: libs/core/kiln_ai/adapters/ml_embedding_model_list.py:32-33
Timestamp: 2025-09-10T08:06:53.717Z
Learning: In the Kiln AI codebase, friendly_name fields in embedding model configurations should maintain the original model naming conventions from providers (e.g., "gemini-embedding-001", "text-embedding-3-small") for human readability, rather than following snake_case conventions like the enum values.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-07-16T09:37:39.816Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:0-0
Timestamp: 2025-07-16T09:37:39.816Z
Learning: The `glm_z1_rumination_32b_0414` model was intentionally removed from the built_in_models list due to output formatting issues: output was duplicated in both `output` and `reasoning` fields, and contained random internal JSON in the output. This model should not be re-added without addressing these formatting problems.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-22T11:15:53.584Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/adapters/embedding/embedding_registry.py:14-20
Timestamp: 2025-08-22T11:15:53.584Z
Learning: In the Kiln AI codebase, model_provider_name fields in datamodels like EmbeddingConfig are typed as ModelProviderName enum (which inherits from str, Enum), not as plain strings. Therefore, accessing .value on these fields is valid and returns the string representation of the enum value.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Generate Coverage Report
🔇 Additional comments (3)
libs/core/kiln_ai/adapters/ml_model_list.py (3)

5702-5709: Verify whether json_schema is supported for accounts/fireworks/models/minimax-m2.

The codebase shows inconsistent Fireworks configurations: explicit comments indicate Fireworks doesn't support json_schema for models like llama-v3p1-70b-instruct (which use function_calling_weak), yet other Fireworks models like llama4-maverick-instruct-basic are configured with StructuredOutputMode.json_schema. If minimax-m2 cannot actually support json_schema on Fireworks, this configuration will break structured output for that provider path.
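
If verification shows json_schema is not honored for this model on Fireworks, one option is to fall back to the mode the comment above cites for the Fireworks Llama 3.1 entries. A sketch only: the fireworks_ai enum member name is an assumption, and the right mode should be confirmed against actual responses.

KilnModelProvider(
    name=ModelProviderName.fireworks_ai,  # assumed member name for the Fireworks AI provider
    model_id="accounts/fireworks/models/minimax-m2",
    # function_calling_weak is the mode noted above for Fireworks Llama 3.1; keep json_schema only if verified
    structured_output_mode=StructuredOutputMode.function_calling_weak,
),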


5652-5676: Minimax M2.1 parser requires validation—copy from M2 should be test-backed.

The OpenRouter configuration correctly uses parser=ModelParserID.r1_thinking (provider-specific, not applied to Fireworks), matching the M2 pattern. However, per established learnings, parser selection should rest on observed responses via tests, not model similarity alone. Validate that OpenRouter's Minimax M2.1 actually returns R1-style <thinking> blocks before relying on this parser.
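
For context, a sketch of the shape of the OpenRouter M2.1 provider entry being discussed: only the provider name and the parser field come from the comment above, and the model_id slug is a placeholder rather than the value in the PR.

KilnModelProvider(
    name=ModelProviderName.openrouter,
    model_id="minimax/minimax-m2.1",  # placeholder slug; use the actual OpenRouter model ID
    parser=ModelParserID.r1_thinking,  # keep only if tests show OpenRouter returns R1-style reasoning blocks
),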


5255-5260: Cerebras GLM providers: verify model_id format and JSON schema support against Cerebras API.

The configuration references model_id="zai-glm-4.7" and model_id="zai-glm-4.6" with StructuredOutputMode.json_schema, but these need validation. Unlike other providers (e.g., SiliconFlow uses "zai-org/GLM-4.7"), Cerebras model IDs use a different format. Confirm:

  • Whether Cerebras API accepts these exact model ID strings
  • Whether Cerebras supports JSON schema structured output mode
  • Whether reasoning_capable=True behaves as expected on Cerebras

Also applies to: lines 5280-5285
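
A sketch of the Cerebras entries in question, reconstructed from the fields named in this comment; the cerebras enum member name is an assumption, and each flag should be confirmed against the Cerebras API before relying on it.

KilnModelProvider(
    name=ModelProviderName.cerebras,  # assumed member name for the Cerebras provider
    model_id="zai-glm-4.7",  # Cerebras ID format differs from e.g. SiliconFlow's "zai-org/GLM-4.7"
    structured_output_mode=StructuredOutputMode.json_schema,  # confirm Cerebras supports schema-constrained output
    reasoning_capable=True,  # confirm reasoning output behaves as expected on Cerebras
),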

Comment on lines +2333 to +2390
    # Mistral Small Creative
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.mistral_small_creative,
        friendly_name="Mistral Small Creative",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/mistral-small-creative",
                structured_output_mode=StructuredOutputMode.json_instruction_and_object,
            ),
        ],
    ),
    # Ministral 3 14B 2512
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.ministral_3_14b_2512,
        friendly_name="Ministral 3 14B 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/ministral-14b-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
            KilnModelProvider(
                name=ModelProviderName.together_ai,
                model_id="mistralai/Ministral-3-14B-Instruct-2512",
                structured_output_mode=StructuredOutputMode.json_instruction_and_object,
                supports_function_calling=False,
            ),
        ],
    ),
    # Ministral 3 8B 2512
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.ministral_3_8b_2512,
        friendly_name="Ministral 3 8B 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/ministral-8b-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
        ],
    ),
    # Ministral 3 3B 2512
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.ministral_3_3b_2512,
        friendly_name="Ministral 3 3B 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/ministral-3b-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
        ],
    ),
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find the StructuredOutputMode enum definition and adapter implementations
rg -n "class StructuredOutputMode" --type=py
fd -e py -path "*adapter*" | head -20

Repository: Kiln-AI/Kiln

Length of output: 330


🏁 Script executed:

# Check OpenRouter and Together AI adapter implementations
rg -n "openrouter|together_ai" --type=py libs/core/kiln_ai/adapters/ -l

Repository: Kiln-AI/Kiln

Length of output: 1101


🏁 Script executed:

# Look for how structured_output_mode is handled in adapters
rg -n "structured_output_mode" --type=py libs/core/kiln_ai/adapters/ -A 3

Repository: Kiln-AI/Kiln

Length of output: 50369


🏁 Script executed:

# Get the StructuredOutputMode enum definition
cat -n libs/core/kiln_ai/datamodel/datamodel_enums.py | head -50

Repository: Kiln-AI/Kiln

Length of output: 2659


🏁 Script executed:

# Find OpenRouter and Together AI adapter implementations
find libs/core/kiln_ai/adapters -name "*openrouter*" -o -name "*together*"

Repository: Kiln-AI/Kiln

Length of output: 176


🏁 Script executed:

# Search for how structured_output_mode is processed in adapters
rg "def.*structured_output_mode|json_schema|json_instruction_and_object" libs/core/kiln_ai/adapters/model_adapters/ -A 3 -B 1

Repository: Kiln-AI/Kiln

Length of output: 35341


🏁 Script executed:

# Check lines 2333-2390 and nearby to see what models are being added
sed -n '2333,2390p' libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 2113


🏁 Script executed:

# Check lines 2450-2462 mentioned in the comment
sed -n '2450,2462p' libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 487


🏁 Script executed:

# Search for provider-specific validation or mode handling
rg "openrouter|together_ai" libs/core/kiln_ai/adapters/model_adapters/ -B 2 -A 5 | head -100

Repository: Kiln-AI/Kiln

Length of output: 10068


🏁 Script executed:

# Check if there's any validation that restricts modes per provider
rg "structured_output_mode.*openrouter|openrouter.*structured_output_mode|together_ai.*structured_output_mode" -i

Repository: Kiln-AI/Kiln

Length of output: 38


🏁 Script executed:

# Look for any tests that verify OpenRouter or Together AI support
rg "openrouter|together_ai" libs/core/kiln_ai/adapters/test_ml_model_list.py -B 3 -A 3

Repository: Kiln-AI/Kiln

Length of output: 2317


🏁 Script executed:

# Check for any special handling or documented restrictions for OpenRouter json_schema
rg "json_schema.*openrouter|openrouter.*json_schema" -i libs/core/

Repository: Kiln-AI/Kiln

Length of output: 257


🏁 Script executed:

# Look for Together AI specific mode handling
rg "together_ai" libs/core/kiln_ai/adapters/model_adapters/ -B 3 -A 3

Repository: Kiln-AI/Kiln

Length of output: 871


🏁 Script executed:

# Check the test suite for Together AI structured output mode testing
rg "together_ai" libs/core/kiln_ai/adapters/test_ml_model_list.py -B 3 -A 3 | head -60

Repository: Kiln-AI/Kiln

Length of output: 309


🏁 Script executed:

# Check if there are any integration docs or known limitations
fd "README|readme|INTEGRATION|integration" libs/core/kiln_ai/adapters/ | xargs cat 2>/dev/null

Repository: Kiln-AI/Kiln

Length of output: 3157


🏁 Script executed:

# Find the context of the OpenRouter json_schema limitation comment
rg "json_schema.*sometimes works" -B 5 -A 5 libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 636


🏁 Script executed:

# Check if there are similar limitations for Together AI
rg "together_ai.*json_schema|json_instruction_and_object.*together" -B 3 -A 3 libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 38


🏁 Script executed:

# Look for any existing OpenRouter models to see what modes they use
rg "openrouter" libs/core/kiln_ai/adapters/ml_model_list.py | grep -i "structured_output_mode" | head -20

Repository: Kiln-AI/Kiln

Length of output: 38


🏁 Script executed:

# Check for any tests that validate OpenRouter or Together AI model configurations
rg "test.*openrouter|test.*together_ai" libs/core/kiln_ai/adapters/ -l | xargs grep -l "structured_output_mode"

Repository: Kiln-AI/Kiln

Length of output: 274


🏁 Script executed:

# Look for all OpenRouter models and their structured_output_mode settings
rg "ModelProviderName.openrouter" libs/core/kiln_ai/adapters/ml_model_list.py -A 5 | grep -A 1 "openrouter" | grep "structured_output_mode"

Repository: Kiln-AI/Kiln

Length of output: 1132


🏁 Script executed:

# Check what modes are used across all OpenRouter models
rg "name=ModelProviderName.openrouter" libs/core/kiln_ai/adapters/ml_model_list.py -A 8 | grep "structured_output_mode" | sort | uniq -c

Repository: Kiln-AI/Kiln

Length of output: 497


🏁 Script executed:

# Get the full context of that OpenRouter limitation comment
sed -n '1290,1310p' libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 885


🏁 Script executed:

# Check which Mistral models currently exist with OpenRouter
rg "mistral.*openrouter|openrouter.*mistral" -i libs/core/kiln_ai/adapters/ml_model_list.py -B 2 -A 5

Repository: Kiln-AI/Kiln

Length of output: 38


Use json_instruction_and_object for OpenRouter Mistral/Ministral models to avoid known json_schema instability.

OpenRouter has a documented limitation where json_schema is unreliable ("JSON_schema sometimes works so force that, but not consistently so still not recommended" — see existing Llama 3.3 model config). The newly added OpenRouter models should follow the safer pattern already established in the codebase:

  • mistral_small_creative (OpenRouter) correctly uses json_instruction_and_object
  • ministral_3_14b_2512, ministral_3_8b_2512, ministral_3_3b_2512, and mistral_3_large_2512 (OpenRouter) should switch from json_schema to json_instruction_and_object to align with known provider stability
  • ministral_3_14b_2512 (Together AI) correctly uses json_instruction_and_object

The adapter explicitly supports json_instruction_and_object mode, which pairs JSON instructions in the prompt with API-level json_object response format—a pattern already used in 24 other OpenRouter model configurations.
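
Concretely, the suggested fix for each of the three Ministral OpenRouter providers is a one-line change, shown here as a sketch for the 14B entry (the surrounding fields match the diff above):

KilnModelProvider(
    name=ModelProviderName.openrouter,
    model_id="mistralai/ministral-14b-2512",
    # was: structured_output_mode=StructuredOutputMode.json_schema,
    structured_output_mode=StructuredOutputMode.json_instruction_and_object,
),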


@leonardmq leonardmq merged commit 811194a into main Jan 11, 2026
12 checks passed
@leonardmq leonardmq deleted the leonard/add-models-0109 branch January 11, 2026 02:57