
Conversation

@leonardmq
Collaborator

@leonardmq leonardmq commented Jan 9, 2026

What does this PR do?

Add models and providers:

  • add Cerebras as a provider for GLM 4.7 and GLM 4.6
  • add Minimax M2.1 on OpenRouter and Fireworks
  • add ByteDance Seed series on OpenRouter: Seed 1.6 and Seed 1.6 Flash
  • add Mistral 3 series: Mistral 3 Large 2512, Mistral Small Creative, Ministral 3 (14B, 8B, 3B)

Checklists

  • Tests have been run locally and passed
  • New tests have been added to any work in /lib

Summary by CodeRabbit

Release Notes

  • New Features
    • Added support for 8 new AI models: Mistral 3 Large, Mistral Small Creative, Ministral 3 variants, Minimax M2.1, and ByteDance Seed models
    • Expanded provider availability for existing models through Cerebras and Fireworks AI integration
    • Enhanced structured output and reasoning capabilities across multiple model families


@coderabbitai
Contributor

coderabbitai bot commented Jan 9, 2026

Walkthrough

This PR expands the ML model catalog by adding 8 new model variants to the ModelName enum (Mistral 3 Large 2512, Mistral Small Creative, the Ministral 3 series, Minimax M2.1, and the ByteDance Seed 1.6 variants) and extends provider configurations for existing models with Cerebras GLM support, additional Fireworks AI integration, and OpenRouter/SiliconFlow coverage.

Changes

Cohort: Model Registry Expansion
File(s): libs/core/kiln_ai/adapters/ml_model_list.py
Summary: Added 8 new ModelName enum entries (mistral_3_large_2512, mistral_small_creative, ministral_3_14b_2512, ministral_3_8b_2512, ministral_3_3b_2512, minimax_m2_1, bytedance_seed_1_6, bytedance_seed_1_6_flash) with corresponding KilnModel entries. Extended Cerebras provider coverage for GLM 4.7 and GLM 4.6 models. Added Fireworks AI provider for Minimax M2. Introduced Minimax M2.1 model with OpenRouter and Fireworks AI providers. Added ByteDance Seed OSS 36B, Seed 1.6, and Seed 1.6 Flash models with OpenRouter/SiliconFlow CN provider configurations. Updated friendly_name for ByteDance Seed OSS 36B.
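
For orientation, here is a minimal sketch of the pattern each addition follows in ml_model_list.py: a new ModelName enum member plus a KilnModel entry listing its providers. It is reconstructed from the Mistral Large 3 fields quoted in the review comments below and assumes the module's existing imports and class definitions (the str-based Enum and the built_in_models list name are assumptions); it is not the exact merged code.

class ModelName(str, Enum):
    # ... existing members ...
    mistral_3_large_2512 = "mistral_3_large_2512"

built_in_models = [
    # ... existing models ...
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.mistral_3_large_2512,
        friendly_name="Mistral Large 3 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/mistral-large-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
        ],
    ),
]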

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • sfierro
  • scosman
  • tawnymanticore

Poem

🐰 Hop hop, models bloom and grow,
Mistral, Seed, and Minimax flow,
Cerebras joins the provider parade,
Eight new friends our catalog made! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
  • Description check (❓ Inconclusive): The description covers what the PR does with specific model additions, but it is missing the Related Issues section and the Contributor License Agreement statement required by the template. Resolution: add the Related Issues section (even if empty) and include the CLA confirmation statement to fully comply with the template structure.

✅ Passed checks (2 passed)
  • Title check (✅ Passed): The title accurately summarizes the main change, adding multiple new AI models (Minimax M2.1, ByteDance, and Mistral families), and is concise and clear.
  • Docstring Coverage (✅ Passed): No functions were found in the changed files, so the docstring coverage check was skipped.



@gemini-code-assist

Summary of Changes

Hello @leonardmq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the platform's AI model offerings by integrating several new models and providers. It enhances the system's capabilities by adding support for various Mistral 3 series models, the Minimax M2.1, and ByteDance Seed models, alongside new provider options for existing GLM models. This broadens the range of available language models for users, providing more diverse options for different use cases.

Highlights

  • New Mistral 3 Series Models: Added support for Mistral 3 Large 2512, Mistral Small Creative, and Ministral 3 (14B, 8B, 3B) models, primarily through OpenRouter, with Together AI also supported for Ministral 3 14B 2512.
  • Minimax M2.1 Integration: Integrated the Minimax M2.1 model, available via OpenRouter and Fireworks AI, including specific configurations for structured output and reasoning capabilities.
  • ByteDance Seed Series: Introduced ByteDance Seed 1.6 and Seed 1.6 Flash models, both accessible through OpenRouter.
  • Cerebras Provider for GLM Models: Expanded provider options for GLM 4.7 and GLM 4.6 models by adding Cerebras as a supported provider.



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds support for several new models from providers like Minimax, Bytedance, Mistral, and Cerebras. The changes are mostly additions to the model list and look good overall. I've pointed out a minor naming inconsistency for one of the new Mistral models to ensure clarity and alignment with the provider's naming.

    phi_4_5p6b = "phi_4_5p6b"
    phi_4_mini = "phi_4_mini"
    mistral_large = "mistral_large"
    mistral_3_large_2512 = "mistral_3_large_2512"


Severity: medium

There's an inconsistency in this model's naming. On OpenRouter, the model ID is mistralai/mistral-large-2512 and it's named "Mistral Large 2512", without the "3". For consistency with the provider's naming, it's better to remove the _3_ from the enum member name.

Suggested change
-    mistral_3_large_2512 = "mistral_3_large_2512"
+    mistral_large_2512 = "mistral_large_2512"

Collaborator


disagree with gemini, it is called Mistral Large 3 https://openrouter.ai/mistralai/mistral-large-2512

Comment on lines +2453 to +2454
        name=ModelName.mistral_3_large_2512,
        friendly_name="Mistral Large 3 2512",


Severity: medium

To align with the suggested change in the ModelName enum and the provider's naming ("Mistral Large 2512"), the _3_ and 3 should be removed from name and friendly_name respectively.

Suggested change
-        name=ModelName.mistral_3_large_2512,
-        friendly_name="Mistral Large 3 2512",
+        name=ModelName.mistral_large_2512,
+        friendly_name="Mistral Large 2512",

@github-actions

github-actions bot commented Jan 9, 2026

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/main...HEAD

No lines with coverage information in this diff.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @libs/core/kiln_ai/adapters/ml_model_list.py:
- Around line 2333-2390: Update the OpenRouter Mistral/Ministral model entries
to use the stable structured output mode: for KilnModel instances named
ModelName.ministral_3_14b_2512, ModelName.ministral_3_8b_2512, and
ModelName.ministral_3_3b_2512, change the KilnModelProvider entries that
currently set StructuredOutputMode.json_schema to
StructuredOutputMode.json_instruction_and_object (leave the existing
mistral_small_creative and the Together AI provider for
ModelName.ministral_3_14b_2512 unchanged).
🧹 Nitpick comments (1)
libs/core/kiln_ai/adapters/ml_model_list.py (1)

96-105: Enum additions look fine; keep naming consistent and validate downstream references.
The new ModelName.* entries are straightforward. Minor nit: mistral_small_creative doesn’t need the multiline/parenthesized assignment—consider keeping enum values single-line for consistency.

Also applies to: 219-223

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6431e1 and 609ec63.

📒 Files selected for processing (1)
  • libs/core/kiln_ai/adapters/ml_model_list.py
🧰 Additional context used
📓 Path-based instructions (2)
libs/core/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use Python 3.10+ for the core library (libs/core) and Python 3.13 for the desktop app

Use Python 3.10+ for the core library (libs/core)

Files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation

**/*.py: Use Pydantic v2, not v1
Use asyncio for asynchronous operations in Python

Files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
🧠 Learnings (10)
📓 Common learnings
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/upload_file_dialog.svelte:107-120
Timestamp: 2025-09-10T08:32:18.688Z
Learning: leonardmq prefers to work within the constraints of their SDK codegen for API calls, even when typing is awkward (like casting FormData to match expected types), rather than using alternative approaches like native fetch that would cause compiler errors with their generated types.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/datamodel/chunk.py:138-160
Timestamp: 2025-08-22T11:17:56.862Z
Learning: leonardmq prefers to avoid redundant validation checks when upstream systems already guarantee preconditions are met. He trusts the attachment system to ensure paths are properly formatted and prefers letting the attachment method (resolve_path) handle any edge cases rather than adding defensive precondition checks.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 390
File: libs/core/kiln_ai/adapters/provider_tools.py:494-499
Timestamp: 2025-07-23T08:58:45.769Z
Learning: leonardmq prefers to keep tightly coupled implementation details (like API versions) hardcoded when they are not user-facing and could break other modules if changed. The shared Config class is reserved for user-facing customizable values, not internal implementation details.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 341
File: libs/server/kiln_server/document_api.py:44-51
Timestamp: 2025-06-18T08:22:58.510Z
Learning: leonardmq prefers to defer fixing blocking I/O in async handlers when: the operation is very fast (milliseconds), user-triggered rather than automated, has no concurrency concerns, and would require additional testing to fix properly. He acknowledges such issues as valid but makes pragmatic decisions about timing the fixes.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/datamodel/rag.py:33-35
Timestamp: 2025-09-04T06:45:44.212Z
Learning: leonardmq requires vector_store_config_id to be a mandatory field in RagConfig (similar to extractor_config_id, chunker_config_id, embedding_config_id) for consistency. He prefers fixing dependent code that breaks due to missing required fields rather than making fields optional to accommodate incomplete data.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 654
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:141-152
Timestamp: 2025-09-25T06:38:14.854Z
Learning: leonardmq prefers simple onMount initialization patterns over reactive statements when possible, and is cautious about maintaining internal state for idempotency in Svelte components. He values simplicity and safety in component lifecycle management.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 402
File: libs/core/kiln_ai/adapters/embedding/litellm_embedding_adapter.py:0-0
Timestamp: 2025-07-14T03:43:07.283Z
Learning: leonardmq prefers to keep defensive validation checks even when they're technically redundant, viewing them as useful "quick sanity checks" that provide additional safety nets. He values defensive programming over strict DRY (Don't Repeat Yourself) principles when the redundant code serves as a safeguard.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/lib/api_schema.d.ts:0-0
Timestamp: 2025-09-10T08:27:47.227Z
Learning: leonardmq correctly identifies auto-generated files and prefers not to manually edit them. When suggesting changes to generated TypeScript API schema files, the fix should be made in the source OpenAPI specification in the backend, not in the generated TypeScript file.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:136-150
Timestamp: 2025-09-04T06:34:12.318Z
Learning: leonardmq prefers brute force re-insertion of all chunks when partial chunks exist in the LanceDB vector store, rather than selectively inserting only missing chunks. His reasoning is that documents typically have only dozens of chunks (making overwriting cheap), and partial insertion likely indicates data corruption or conflicts that should be resolved by re-inserting all chunks to ensure consistency.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:98-103
Timestamp: 2025-09-04T07:08:04.248Z
Learning: leonardmq prefers to let TableNotFoundError bubble up in delete_nodes_by_document_id operations rather than catching it, as this error indicates operations are being done in the wrong order on the caller side and should be surfaced as a programming error rather than handled gracefully.
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:19:20.074Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-10-28T05:25:09.283Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 757
File: libs/core/kiln_ai/adapters/reranker_list.py:39-42
Timestamp: 2025-10-28T05:25:09.283Z
Learning: In the Kiln codebase (Python), model family and model name fields in reranker and similar model definition classes (like KilnRerankerModel in libs/core/kiln_ai/adapters/reranker_list.py) should use plain str types rather than Enum types. This is because users receive over-the-air (OTA) updates for new models, and using Enums would break when clients with older builds encounter new model families their Enum definitions don't include. Strings provide forward compatibility for the OTA update mechanism.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:15:20.796Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:2434-2439
Timestamp: 2025-08-08T16:15:20.796Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwen_3_8b_no_thinking with ModelProviderName.siliconflow_cn (model_id "Qwen/Qwen3-8B"), the qwen3_style_no_think formatter is not required: SiliconFlow’s non‑thinking Qwen3‑8B returns clean output without <answer> wrappers. Formatter/parser usage is provider-specific and should be added only when verified by tests/responses.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-09-10T08:06:53.717Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 585
File: libs/core/kiln_ai/adapters/ml_embedding_model_list.py:32-33
Timestamp: 2025-09-10T08:06:53.717Z
Learning: In the Kiln AI codebase, friendly_name fields in embedding model configurations should maintain the original model naming conventions from providers (e.g., "gemini-embedding-001", "text-embedding-3-small") for human readability, rather than following snake_case conventions like the enum values.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-07-16T09:37:39.816Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:0-0
Timestamp: 2025-07-16T09:37:39.816Z
Learning: The `glm_z1_rumination_32b_0414` model was intentionally removed from the built_in_models list due to output formatting issues: output was duplicated in both `output` and `reasoning` fields, and contained random internal JSON in the output. This model should not be re-added without addressing these formatting problems.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-22T11:15:53.584Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/adapters/embedding/embedding_registry.py:14-20
Timestamp: 2025-08-22T11:15:53.584Z
Learning: In the Kiln AI codebase, model_provider_name fields in datamodels like EmbeddingConfig are typed as ModelProviderName enum (which inherits from str, Enum), not as plain strings. Therefore, accessing .value on these fields is valid and returns the string representation of the enum value.

Applied to files:

  • libs/core/kiln_ai/adapters/ml_model_list.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Generate Coverage Report
🔇 Additional comments (3)
libs/core/kiln_ai/adapters/ml_model_list.py (3)

5702-5709: Verify whether json_schema is supported for accounts/fireworks/models/minimax-m2.

The codebase shows inconsistent Fireworks configurations: explicit comments indicate Fireworks doesn't support json_schema for models like llama-v3p1-70b-instruct (which use function_calling_weak), yet other Fireworks models like llama4-maverick-instruct-basic are configured with StructuredOutputMode.json_schema. If minimax-m2 cannot actually support json_schema on Fireworks, this configuration will break structured output for that provider path.
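
If verification shows json_schema is not honored for this model on Fireworks, one option is to fall back to the mode the comment above cites for the Fireworks Llama 3.1 entries. A sketch only: the fireworks_ai enum member name is an assumption, and the right mode should be confirmed against actual responses.

KilnModelProvider(
    name=ModelProviderName.fireworks_ai,  # assumed member name for the Fireworks AI provider
    model_id="accounts/fireworks/models/minimax-m2",
    # function_calling_weak is the mode noted above for Fireworks Llama 3.1; keep json_schema only if verified
    structured_output_mode=StructuredOutputMode.function_calling_weak,
),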


5652-5676: Minimax M2.1 parser requires validation—copy from M2 should be test-backed.

The OpenRouter configuration correctly uses parser=ModelParserID.r1_thinking (provider-specific, not applied to Fireworks), matching the M2 pattern. However, per established learnings, parser selection should rest on observed responses via tests, not model similarity alone. Validate that OpenRouter's Minimax M2.1 actually returns R1-style <thinking> blocks before relying on this parser.
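
For context, a sketch of the shape of the OpenRouter M2.1 provider entry being discussed: only the provider name and the parser field come from the comment above, and the model_id slug is a placeholder rather than the value in the PR.

KilnModelProvider(
    name=ModelProviderName.openrouter,
    model_id="minimax/minimax-m2.1",  # placeholder slug; use the actual OpenRouter model ID
    parser=ModelParserID.r1_thinking,  # keep only if tests show OpenRouter returns R1-style reasoning blocks
),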


5255-5260: Cerebras GLM providers: verify model_id format and JSON schema support against Cerebras API.

The configuration references model_id="zai-glm-4.7" and model_id="zai-glm-4.6" with StructuredOutputMode.json_schema, but these need validation. Unlike other providers (e.g., SiliconFlow uses "zai-org/GLM-4.7"), Cerebras model IDs use a different format. Confirm:

  • Whether Cerebras API accepts these exact model ID strings
  • Whether Cerebras supports JSON schema structured output mode
  • Whether reasoning_capable=True behaves as expected on Cerebras

Also applies to: lines 5280-5285
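
A sketch of the Cerebras entries in question, reconstructed from the fields named in this comment; the cerebras enum member name is an assumption, and each flag should be confirmed against the Cerebras API before relying on it.

KilnModelProvider(
    name=ModelProviderName.cerebras,  # assumed member name for the Cerebras provider
    model_id="zai-glm-4.7",  # Cerebras ID format differs from e.g. SiliconFlow's "zai-org/GLM-4.7"
    structured_output_mode=StructuredOutputMode.json_schema,  # confirm Cerebras supports schema-constrained output
    reasoning_capable=True,  # confirm reasoning output behaves as expected on Cerebras
),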

Comment on lines +2333 to +2390
    # Mistral Small Creative
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.mistral_small_creative,
        friendly_name="Mistral Small Creative",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/mistral-small-creative",
                structured_output_mode=StructuredOutputMode.json_instruction_and_object,
            ),
        ],
    ),
    # Ministral 3 14B 2512
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.ministral_3_14b_2512,
        friendly_name="Ministral 3 14B 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/ministral-14b-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
            KilnModelProvider(
                name=ModelProviderName.together_ai,
                model_id="mistralai/Ministral-3-14B-Instruct-2512",
                structured_output_mode=StructuredOutputMode.json_instruction_and_object,
                supports_function_calling=False,
            ),
        ],
    ),
    # Ministral 3 8B 2512
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.ministral_3_8b_2512,
        friendly_name="Ministral 3 8B 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/ministral-8b-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
        ],
    ),
    # Ministral 3 3B 2512
    KilnModel(
        family=ModelFamily.mistral,
        name=ModelName.ministral_3_3b_2512,
        friendly_name="Ministral 3 3B 2512",
        providers=[
            KilnModelProvider(
                name=ModelProviderName.openrouter,
                model_id="mistralai/ministral-3b-2512",
                structured_output_mode=StructuredOutputMode.json_schema,
            ),
        ],
    ),
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find the StructuredOutputMode enum definition and adapter implementations
rg -n "class StructuredOutputMode" --type=py
fd -e py -path "*adapter*" | head -20

Repository: Kiln-AI/Kiln

Length of output: 330


🏁 Script executed:

# Check OpenRouter and Together AI adapter implementations
rg -n "openrouter|together_ai" --type=py libs/core/kiln_ai/adapters/ -l

Repository: Kiln-AI/Kiln

Length of output: 1101


🏁 Script executed:

# Look for how structured_output_mode is handled in adapters
rg -n "structured_output_mode" --type=py libs/core/kiln_ai/adapters/ -A 3

Repository: Kiln-AI/Kiln

Length of output: 50369


🏁 Script executed:

# Get the StructuredOutputMode enum definition
cat -n libs/core/kiln_ai/datamodel/datamodel_enums.py | head -50

Repository: Kiln-AI/Kiln

Length of output: 2659


🏁 Script executed:

# Find OpenRouter and Together AI adapter implementations
find libs/core/kiln_ai/adapters -name "*openrouter*" -o -name "*together*"

Repository: Kiln-AI/Kiln

Length of output: 176


🏁 Script executed:

# Search for how structured_output_mode is processed in adapters
rg "def.*structured_output_mode|json_schema|json_instruction_and_object" libs/core/kiln_ai/adapters/model_adapters/ -A 3 -B 1

Repository: Kiln-AI/Kiln

Length of output: 35341


🏁 Script executed:

# Check lines 2333-2390 and nearby to see what models are being added
sed -n '2333,2390p' libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 2113


🏁 Script executed:

# Check lines 2450-2462 mentioned in the comment
sed -n '2450,2462p' libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 487


🏁 Script executed:

# Search for provider-specific validation or mode handling
rg "openrouter|together_ai" libs/core/kiln_ai/adapters/model_adapters/ -B 2 -A 5 | head -100

Repository: Kiln-AI/Kiln

Length of output: 10068


🏁 Script executed:

# Check if there's any validation that restricts modes per provider
rg "structured_output_mode.*openrouter|openrouter.*structured_output_mode|together_ai.*structured_output_mode" -i

Repository: Kiln-AI/Kiln

Length of output: 38


🏁 Script executed:

# Look for any tests that verify OpenRouter or Together AI support
rg "openrouter|together_ai" libs/core/kiln_ai/adapters/test_ml_model_list.py -B 3 -A 3

Repository: Kiln-AI/Kiln

Length of output: 2317


🏁 Script executed:

# Check for any special handling or documented restrictions for OpenRouter json_schema
rg "json_schema.*openrouter|openrouter.*json_schema" -i libs/core/

Repository: Kiln-AI/Kiln

Length of output: 257


🏁 Script executed:

# Look for Together AI specific mode handling
rg "together_ai" libs/core/kiln_ai/adapters/model_adapters/ -B 3 -A 3

Repository: Kiln-AI/Kiln

Length of output: 871


🏁 Script executed:

# Check the test suite for Together AI structured output mode testing
rg "together_ai" libs/core/kiln_ai/adapters/test_ml_model_list.py -B 3 -A 3 | head -60

Repository: Kiln-AI/Kiln

Length of output: 309


🏁 Script executed:

# Check if there are any integration docs or known limitations
fd "README|readme|INTEGRATION|integration" libs/core/kiln_ai/adapters/ | xargs cat 2>/dev/null

Repository: Kiln-AI/Kiln

Length of output: 3157


🏁 Script executed:

# Find the context of the OpenRouter json_schema limitation comment
rg "json_schema.*sometimes works" -B 5 -A 5 libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 636


🏁 Script executed:

# Check if there are similar limitations for Together AI
rg "together_ai.*json_schema|json_instruction_and_object.*together" -B 3 -A 3 libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 38


🏁 Script executed:

# Look for any existing OpenRouter models to see what modes they use
rg "openrouter" libs/core/kiln_ai/adapters/ml_model_list.py | grep -i "structured_output_mode" | head -20

Repository: Kiln-AI/Kiln

Length of output: 38


🏁 Script executed:

# Check for any tests that validate OpenRouter or Together AI model configurations
rg "test.*openrouter|test.*together_ai" libs/core/kiln_ai/adapters/ -l | xargs grep -l "structured_output_mode"

Repository: Kiln-AI/Kiln

Length of output: 274


🏁 Script executed:

# Look for all OpenRouter models and their structured_output_mode settings
rg "ModelProviderName.openrouter" libs/core/kiln_ai/adapters/ml_model_list.py -A 5 | grep -A 1 "openrouter" | grep "structured_output_mode"

Repository: Kiln-AI/Kiln

Length of output: 1132


🏁 Script executed:

# Check what modes are used across all OpenRouter models
rg "name=ModelProviderName.openrouter" libs/core/kiln_ai/adapters/ml_model_list.py -A 8 | grep "structured_output_mode" | sort | uniq -c

Repository: Kiln-AI/Kiln

Length of output: 497


🏁 Script executed:

# Get the full context of that OpenRouter limitation comment
sed -n '1290,1310p' libs/core/kiln_ai/adapters/ml_model_list.py

Repository: Kiln-AI/Kiln

Length of output: 885


🏁 Script executed:

# Check which Mistral models currently exist with OpenRouter
rg "mistral.*openrouter|openrouter.*mistral" -i libs/core/kiln_ai/adapters/ml_model_list.py -B 2 -A 5

Repository: Kiln-AI/Kiln

Length of output: 38


Use json_instruction_and_object for OpenRouter Mistral/Ministral models to avoid known json_schema instability.

OpenRouter has a documented limitation where json_schema is unreliable ("JSON_schema sometimes works so force that, but not consistently so still not recommended" — see existing Llama 3.3 model config). The newly added OpenRouter models should follow the safer pattern already established in the codebase:

  • mistral_small_creative (OpenRouter) correctly uses json_instruction_and_object
  • ministral_3_14b_2512, ministral_3_8b_2512, ministral_3_3b_2512, and mistral_3_large_2512 (OpenRouter) should switch from json_schema to json_instruction_and_object to align with known provider stability
  • ministral_3_14b_2512 (Together AI) correctly uses json_instruction_and_object

The adapter explicitly supports json_instruction_and_object mode, which pairs JSON instructions in the prompt with API-level json_object response format—a pattern already used in 24 other OpenRouter model configurations.
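
Concretely, the suggested fix for each of the three Ministral OpenRouter providers is a one-line change, shown here as a sketch for the 14B entry (the surrounding fields match the diff above):

KilnModelProvider(
    name=ModelProviderName.openrouter,
    model_id="mistralai/ministral-14b-2512",
    # was: structured_output_mode=StructuredOutputMode.json_schema,
    structured_output_mode=StructuredOutputMode.json_instruction_and_object,
),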


@leonardmq leonardmq merged commit 811194a into main Jan 11, 2026
12 checks passed
@leonardmq leonardmq deleted the leonard/add-models-0109 branch January 11, 2026 02:57