add models (Minimax M2.1, Bytedance, Mistral) #926
Conversation
Walkthrough
This PR expands the ML model catalog by adding 8 new model variants to the ModelName enum (Mistral Small Creative, the Ministral 3.x series, Minimax M2.1, and ByteDance Seed 1.6 variants) and extends provider configurations for existing models with Cerebras GLM support, additional Fireworks AI integration, and OpenRouter/SiliconFlow coverage.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)
Summary of Changes
Hello @leonardmq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request expands the platform's AI model offerings by integrating several new models and providers: it adds support for the Mistral 3 series models, Minimax M2.1, and ByteDance Seed models, alongside new provider options for existing GLM models, broadening the range of language models available to users.
Code Review
This pull request adds support for several new models from providers like Minimax, Bytedance, Mistral, and Cerebras. The changes are mostly additions to the model list and look good overall. I've pointed out a minor naming inconsistency for one of the new Mistral models to ensure clarity and alignment with the provider's naming.
```python
    phi_4_5p6b = "phi_4_5p6b"
    phi_4_mini = "phi_4_mini"
    mistral_large = "mistral_large"
    mistral_3_large_2512 = "mistral_3_large_2512"
```
There's an inconsistency in this model's naming. On OpenRouter, the model ID is mistralai/mistral-large-2512 and it's named "Mistral Large 2512", without the "3". For consistency with the provider's naming, it's better to remove the _3_ from the enum member name.
```diff
-    mistral_3_large_2512 = "mistral_3_large_2512"
+    mistral_large_2512 = "mistral_large_2512"
```
disagree with gemini, it is called Mistral Large 3 https://openrouter.ai/mistralai/mistral-large-2512
```python
            name=ModelName.mistral_3_large_2512,
            friendly_name="Mistral Large 3 2512",
```
To align with the suggested change in the ModelName enum and the provider's naming ("Mistral Large 2512"), the _3_ and 3 should be removed from name and friendly_name respectively.
```diff
-            name=ModelName.mistral_3_large_2512,
-            friendly_name="Mistral Large 3 2512",
+            name=ModelName.mistral_large_2512,
+            friendly_name="Mistral Large 2512",
```
📊 Coverage Report
Overall Coverage: 91%
Diff: origin/main...HEAD
No lines with coverage information in this diff.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @libs/core/kiln_ai/adapters/ml_model_list.py:
- Around line 2333-2390: Update the OpenRouter Mistral/Ministral model entries
to use the stable structured output mode: for KilnModel instances named
ModelName.ministral_3_14b_2512, ModelName.ministral_3_8b_2512, and
ModelName.ministral_3_3b_2512, change the KilnModelProvider entries that
currently set StructuredOutputMode.json_schema to
StructuredOutputMode.json_instruction_and_object (leave the existing
mistral_small_creative and the Together AI provider for
ModelName.ministral_3_14b_2512 unchanged).
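For illustration, a minimal sketch of one affected entry after applying this change (the ministral_3_8b_2512 OpenRouter provider from the diff further down); only structured_output_mode changes, everything else stays as in the PR:

```python
# Sketch only: KilnModelProvider entry inside the ministral_3_8b_2512 providers list
# after switching to the stable mode suggested above.
KilnModelProvider(
    name=ModelProviderName.openrouter,
    model_id="mistralai/ministral-8b-2512",
    structured_output_mode=StructuredOutputMode.json_instruction_and_object,
),
```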
🧹 Nitpick comments (1)
libs/core/kiln_ai/adapters/ml_model_list.py (1)
96-105: Enum additions look fine; keep naming consistent and validate downstream references.
The new ModelName.* entries are straightforward. Minor nit: mistral_small_creative doesn't need the multiline/parenthesized assignment; consider keeping enum values single-line for consistency. Also applies to: 219-223
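A single-line form would look like this (sketch only; ModelName is assumed here to be a str-valued Enum, mirroring ModelProviderName per the learnings below):

```python
from enum import Enum


class ModelName(str, Enum):  # base classes assumed for this sketch
    mistral_large = "mistral_large"
    # single-line assignment, consistent with the neighboring entries
    mistral_small_creative = "mistral_small_creative"
```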
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
libs/core/kiln_ai/adapters/ml_model_list.py
🧰 Additional context used
📓 Path-based instructions (2)
libs/core/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Python 3.10+ for the core library (libs/core) and Python 3.13 for the desktop app
Use Python 3.10+ for the core library (libs/core)
Files:
libs/core/kiln_ai/adapters/ml_model_list.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
**/*.py: Use Pydantic v2, not v1
Use asyncio for asynchronous operations in Python
Files:
libs/core/kiln_ai/adapters/ml_model_list.py
🧠 Learnings (10)
📓 Common learnings
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/upload_file_dialog.svelte:107-120
Timestamp: 2025-09-10T08:32:18.688Z
Learning: leonardmq prefers to work within the constraints of their SDK codegen for API calls, even when typing is awkward (like casting FormData to match expected types), rather than using alternative approaches like native fetch that would cause compiler errors with their generated types.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/datamodel/chunk.py:138-160
Timestamp: 2025-08-22T11:17:56.862Z
Learning: leonardmq prefers to avoid redundant validation checks when upstream systems already guarantee preconditions are met. He trusts the attachment system to ensure paths are properly formatted and prefers letting the attachment method (resolve_path) handle any edge cases rather than adding defensive precondition checks.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 390
File: libs/core/kiln_ai/adapters/provider_tools.py:494-499
Timestamp: 2025-07-23T08:58:45.769Z
Learning: leonardmq prefers to keep tightly coupled implementation details (like API versions) hardcoded when they are not user-facing and could break other modules if changed. The shared Config class is reserved for user-facing customizable values, not internal implementation details.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 341
File: libs/server/kiln_server/document_api.py:44-51
Timestamp: 2025-06-18T08:22:58.510Z
Learning: leonardmq prefers to defer fixing blocking I/O in async handlers when: the operation is very fast (milliseconds), user-triggered rather than automated, has no concurrency concerns, and would require additional testing to fix properly. He acknowledges such issues as valid but makes pragmatic decisions about timing the fixes.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/datamodel/rag.py:33-35
Timestamp: 2025-09-04T06:45:44.212Z
Learning: leonardmq requires vector_store_config_id to be a mandatory field in RagConfig (similar to extractor_config_id, chunker_config_id, embedding_config_id) for consistency. He prefers fixing dependent code that breaks due to missing required fields rather than making fields optional to accommodate incomplete data.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 654
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:141-152
Timestamp: 2025-09-25T06:38:14.854Z
Learning: leonardmq prefers simple onMount initialization patterns over reactive statements when possible, and is cautious about maintaining internal state for idempotency in Svelte components. He values simplicity and safety in component lifecycle management.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 402
File: libs/core/kiln_ai/adapters/embedding/litellm_embedding_adapter.py:0-0
Timestamp: 2025-07-14T03:43:07.283Z
Learning: leonardmq prefers to keep defensive validation checks even when they're technically redundant, viewing them as useful "quick sanity checks" that provide additional safety nets. He values defensive programming over strict DRY (Don't Repeat Yourself) principles when the redundant code serves as a safeguard.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/lib/api_schema.d.ts:0-0
Timestamp: 2025-09-10T08:27:47.227Z
Learning: leonardmq correctly identifies auto-generated files and prefers not to manually edit them. When suggesting changes to generated TypeScript API schema files, the fix should be made in the source OpenAPI specification in the backend, not in the generated TypeScript file.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:136-150
Timestamp: 2025-09-04T06:34:12.318Z
Learning: leonardmq prefers brute force re-insertion of all chunks when partial chunks exist in the LanceDB vector store, rather than selectively inserting only missing chunks. His reasoning is that documents typically have only dozens of chunks (making overwriting cheap), and partial insertion likely indicates data corruption or conflicts that should be resolved by re-inserting all chunks to ensure consistency.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:98-103
Timestamp: 2025-09-04T07:08:04.248Z
Learning: leonardmq prefers to let TableNotFoundError bubble up in delete_nodes_by_document_id operations rather than catching it, as this error indicates operations are being done in the wrong order on the caller side and should be surfaced as a programming error rather than handled gracefully.
📚 Learning: 2025-10-28T05:25:09.283Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 757
File: libs/core/kiln_ai/adapters/reranker_list.py:39-42
Timestamp: 2025-10-28T05:25:09.283Z
Learning: In the Kiln codebase (Python), model family and model name fields in reranker and similar model definition classes (like KilnRerankerModel in libs/core/kiln_ai/adapters/reranker_list.py) should use plain str types rather than Enum types. This is because users receive over-the-air (OTA) updates for new models, and using Enums would break when clients with older builds encounter new model families their Enum definitions don't include. Strings provide forward compatibility for the OTA update mechanism.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:15:20.796Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:2434-2439
Timestamp: 2025-08-08T16:15:20.796Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwen_3_8b_no_thinking with ModelProviderName.siliconflow_cn (model_id "Qwen/Qwen3-8B"), the qwen3_style_no_think formatter is not required: SiliconFlow’s non‑thinking Qwen3‑8B returns clean output without <answer> wrappers. Formatter/parser usage is provider-specific and should be added only when verified by tests/responses.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-09-10T08:06:53.717Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 585
File: libs/core/kiln_ai/adapters/ml_embedding_model_list.py:32-33
Timestamp: 2025-09-10T08:06:53.717Z
Learning: In the Kiln AI codebase, friendly_name fields in embedding model configurations should maintain the original model naming conventions from providers (e.g., "gemini-embedding-001", "text-embedding-3-small") for human readability, rather than following snake_case conventions like the enum values.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-07-16T09:37:39.816Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:0-0
Timestamp: 2025-07-16T09:37:39.816Z
Learning: The `glm_z1_rumination_32b_0414` model was intentionally removed from the built_in_models list due to output formatting issues: output was duplicated in both `output` and `reasoning` fields, and contained random internal JSON in the output. This model should not be re-added without addressing these formatting problems.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-22T11:15:53.584Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/adapters/embedding/embedding_registry.py:14-20
Timestamp: 2025-08-22T11:15:53.584Z
Learning: In the Kiln AI codebase, model_provider_name fields in datamodels like EmbeddingConfig are typed as ModelProviderName enum (which inherits from str, Enum), not as plain strings. Therefore, accessing .value on these fields is valid and returns the string representation of the enum value.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Generate Coverage Report
🔇 Additional comments (3)
libs/core/kiln_ai/adapters/ml_model_list.py (3)
5702-5709: Verify whether json_schema is supported for accounts/fireworks/models/minimax-m2.
The codebase shows inconsistent Fireworks configurations: explicit comments indicate Fireworks doesn't support json_schema for models like llama-v3p1-70b-instruct (which use function_calling_weak), yet other Fireworks models like llama4-maverick-instruct-basic are configured with StructuredOutputMode.json_schema. If minimax-m2 cannot actually support json_schema on Fireworks, this configuration will break structured output for that provider path.
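If json_schema does turn out to be unsupported for minimax-m2 on Fireworks, the fallback would presumably mirror the function_calling_weak pattern mentioned above. A sketch, with the Fireworks provider enum member name assumed:

```python
# Assumption: Fireworks rejects json_schema for minimax-m2, so fall back to the
# weaker function-calling mode used by other Fireworks entries.
# ModelProviderName.fireworks_ai is an assumed member name, not confirmed here.
KilnModelProvider(
    name=ModelProviderName.fireworks_ai,
    model_id="accounts/fireworks/models/minimax-m2",
    structured_output_mode=StructuredOutputMode.function_calling_weak,
),
```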
5652-5676: Minimax M2.1 parser requires validation; copying the parser from M2 should be test-backed.
The OpenRouter configuration correctly uses parser=ModelParserID.r1_thinking (provider-specific, not applied to Fireworks), matching the M2 pattern. However, per established learnings, parser selection should rest on observed responses via tests, not model similarity alone. Validate that OpenRouter's Minimax M2.1 actually returns R1-style <thinking> blocks before relying on this parser.
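One way to spot-check this is a raw call against OpenRouter's OpenAI-compatible endpoint and a search for the thinking tags; the model slug below is a placeholder and would need to be confirmed against OpenRouter's catalog:

```python
# Manual spot-check (not part of the PR): does OpenRouter's Minimax M2.1 emit
# R1-style <thinking> blocks in the raw content?
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",  # placeholder
)
resp = client.chat.completions.create(
    model="minimax/minimax-m2.1",  # assumed slug; verify before relying on it
    messages=[{"role": "user", "content": "Briefly: what is 2 + 2?"}],
)
content = resp.choices[0].message.content or ""
print("contains <thinking> block:", "<thinking>" in content)
```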
5255-5260: Cerebras GLM providers: verify model_id format and JSON schema support against the Cerebras API.
The configuration references model_id="zai-glm-4.7" and model_id="zai-glm-4.6" with StructuredOutputMode.json_schema, but these need validation. Unlike other providers (e.g., SiliconFlow uses "zai-org/GLM-4.7"), Cerebras model IDs use a different format. Confirm:
- Whether the Cerebras API accepts these exact model ID strings
- Whether Cerebras supports JSON schema structured output mode
- Whether reasoning_capable=True behaves as expected on Cerebras
Also applies to: lines 5280-5285
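For reference, a sketch of the two entries in question, annotated with the points to verify; field values come from the review comment above, and the Cerebras provider enum member name is assumed:

```python
# Sketch of the Cerebras GLM entries under review, not a confirmed configuration.
KilnModelProvider(
    name=ModelProviderName.cerebras,  # assumed member name
    model_id="zai-glm-4.7",  # verify the exact ID the Cerebras API accepts
    structured_output_mode=StructuredOutputMode.json_schema,  # verify JSON schema support
    reasoning_capable=True,  # verify reasoning behavior on Cerebras
),
KilnModelProvider(
    name=ModelProviderName.cerebras,
    model_id="zai-glm-4.6",
    structured_output_mode=StructuredOutputMode.json_schema,
    reasoning_capable=True,
),
```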
```python
# Mistral Small Creative
KilnModel(
    family=ModelFamily.mistral,
    name=ModelName.mistral_small_creative,
    friendly_name="Mistral Small Creative",
    providers=[
        KilnModelProvider(
            name=ModelProviderName.openrouter,
            model_id="mistralai/mistral-small-creative",
            structured_output_mode=StructuredOutputMode.json_instruction_and_object,
        ),
    ],
),
# Ministral 3 14B 2512
KilnModel(
    family=ModelFamily.mistral,
    name=ModelName.ministral_3_14b_2512,
    friendly_name="Ministral 3 14B 2512",
    providers=[
        KilnModelProvider(
            name=ModelProviderName.openrouter,
            model_id="mistralai/ministral-14b-2512",
            structured_output_mode=StructuredOutputMode.json_schema,
        ),
        KilnModelProvider(
            name=ModelProviderName.together_ai,
            model_id="mistralai/Ministral-3-14B-Instruct-2512",
            structured_output_mode=StructuredOutputMode.json_instruction_and_object,
            supports_function_calling=False,
        ),
    ],
),
# Ministral 3 8B 2512
KilnModel(
    family=ModelFamily.mistral,
    name=ModelName.ministral_3_8b_2512,
    friendly_name="Ministral 3 8B 2512",
    providers=[
        KilnModelProvider(
            name=ModelProviderName.openrouter,
            model_id="mistralai/ministral-8b-2512",
            structured_output_mode=StructuredOutputMode.json_schema,
        ),
    ],
),
# Ministral 3 3B 2512
KilnModel(
    family=ModelFamily.mistral,
    name=ModelName.ministral_3_3b_2512,
    friendly_name="Ministral 3 3B 2512",
    providers=[
        KilnModelProvider(
            name=ModelProviderName.openrouter,
            model_id="mistralai/ministral-3b-2512",
            structured_output_mode=StructuredOutputMode.json_schema,
        ),
    ],
),
```
🧩 Analysis chain
🏁 Script executed:
# Find the StructuredOutputMode enum definition and adapter implementations
rg -n "class StructuredOutputMode" --type=py
fd -e py -path "*adapter*" | head -20
Length of output: 330
🏁 Script executed:
# Check OpenRouter and Together AI adapter implementations
rg -n "openrouter|together_ai" --type=py libs/core/kiln_ai/adapters/ -lRepository: Kiln-AI/Kiln
Length of output: 1101
🏁 Script executed:
# Look for how structured_output_mode is handled in adapters
rg -n "structured_output_mode" --type=py libs/core/kiln_ai/adapters/ -A 3Repository: Kiln-AI/Kiln
Length of output: 50369
🏁 Script executed:
# Get the StructuredOutputMode enum definition
cat -n libs/core/kiln_ai/datamodel/datamodel_enums.py | head -50
Length of output: 2659
🏁 Script executed:
# Find OpenRouter and Together AI adapter implementations
find libs/core/kiln_ai/adapters -name "*openrouter*" -o -name "*together*"
Length of output: 176
🏁 Script executed:
# Search for how structured_output_mode is processed in adapters
rg "def.*structured_output_mode|json_schema|json_instruction_and_object" libs/core/kiln_ai/adapters/model_adapters/ -A 3 -B 1Repository: Kiln-AI/Kiln
Length of output: 35341
🏁 Script executed:
# Check lines 2333-2390 and nearby to see what models are being added
sed -n '2333,2390p' libs/core/kiln_ai/adapters/ml_model_list.py
Length of output: 2113
🏁 Script executed:
# Check lines 2450-2462 mentioned in the comment
sed -n '2450,2462p' libs/core/kiln_ai/adapters/ml_model_list.py
Length of output: 487
🏁 Script executed:
# Search for provider-specific validation or mode handling
rg "openrouter|together_ai" libs/core/kiln_ai/adapters/model_adapters/ -B 2 -A 5 | head -100Repository: Kiln-AI/Kiln
Length of output: 10068
🏁 Script executed:
# Check if there's any validation that restricts modes per provider
rg "structured_output_mode.*openrouter|openrouter.*structured_output_mode|together_ai.*structured_output_mode" -iRepository: Kiln-AI/Kiln
Length of output: 38
🏁 Script executed:
# Look for any tests that verify OpenRouter or Together AI support
rg "openrouter|together_ai" libs/core/kiln_ai/adapters/test_ml_model_list.py -B 3 -A 3Repository: Kiln-AI/Kiln
Length of output: 2317
🏁 Script executed:
# Check for any special handling or documented restrictions for OpenRouter json_schema
rg "json_schema.*openrouter|openrouter.*json_schema" -i libs/core/Repository: Kiln-AI/Kiln
Length of output: 257
🏁 Script executed:
# Look for Together AI specific mode handling
rg "together_ai" libs/core/kiln_ai/adapters/model_adapters/ -B 3 -A 3Repository: Kiln-AI/Kiln
Length of output: 871
🏁 Script executed:
# Check the test suite for Together AI structured output mode testing
rg "together_ai" libs/core/kiln_ai/adapters/test_ml_model_list.py -B 3 -A 3 | head -60Repository: Kiln-AI/Kiln
Length of output: 309
🏁 Script executed:
# Check if there are any integration docs or known limitations
fd "README|readme|INTEGRATION|integration" libs/core/kiln_ai/adapters/ | xargs cat 2>/dev/nullRepository: Kiln-AI/Kiln
Length of output: 3157
🏁 Script executed:
# Find the context of the OpenRouter json_schema limitation comment
rg "json_schema.*sometimes works" -B 5 -A 5 libs/core/kiln_ai/adapters/ml_model_list.pyRepository: Kiln-AI/Kiln
Length of output: 636
🏁 Script executed:
# Check if there are similar limitations for Together AI
rg "together_ai.*json_schema|json_instruction_and_object.*together" -B 3 -A 3 libs/core/kiln_ai/adapters/ml_model_list.pyRepository: Kiln-AI/Kiln
Length of output: 38
🏁 Script executed:
# Look for any existing OpenRouter models to see what modes they use
rg "openrouter" libs/core/kiln_ai/adapters/ml_model_list.py | grep -i "structured_output_mode" | head -20Repository: Kiln-AI/Kiln
Length of output: 38
🏁 Script executed:
# Check for any tests that validate OpenRouter or Together AI model configurations
rg "test.*openrouter|test.*together_ai" libs/core/kiln_ai/adapters/ -l | xargs grep -l "structured_output_mode"Repository: Kiln-AI/Kiln
Length of output: 274
🏁 Script executed:
# Look for all OpenRouter models and their structured_output_mode settings
rg "ModelProviderName.openrouter" libs/core/kiln_ai/adapters/ml_model_list.py -A 5 | grep -A 1 "openrouter" | grep "structured_output_mode"Repository: Kiln-AI/Kiln
Length of output: 1132
🏁 Script executed:
# Check what modes are used across all OpenRouter models
rg "name=ModelProviderName.openrouter" libs/core/kiln_ai/adapters/ml_model_list.py -A 8 | grep "structured_output_mode" | sort | uniq -cRepository: Kiln-AI/Kiln
Length of output: 497
🏁 Script executed:
# Get the full context of that OpenRouter limitation comment
sed -n '1290,1310p' libs/core/kiln_ai/adapters/ml_model_list.py
Length of output: 885
🏁 Script executed:
# Check which Mistral models currently exist with OpenRouter
rg "mistral.*openrouter|openrouter.*mistral" -i libs/core/kiln_ai/adapters/ml_model_list.py -B 2 -A 5Repository: Kiln-AI/Kiln
Length of output: 38
Use json_instruction_and_object for OpenRouter Mistral/Ministral models to avoid known json_schema instability.
OpenRouter has a documented limitation where json_schema is unreliable ("JSON_schema sometimes works so force that, but not consistently so still not recommended" — see existing Llama 3.3 model config). The newly added OpenRouter models should follow the safer pattern already established in the codebase:
- mistral_small_creative (OpenRouter) correctly uses json_instruction_and_object ✓
- ministral_3_14b_2512, ministral_3_8b_2512, ministral_3_3b_2512, and mistral_3_large_2512 (OpenRouter) should switch from json_schema to json_instruction_and_object to align with known provider stability
- ministral_3_14b_2512 (Together AI) correctly uses json_instruction_and_object ✓
The adapter explicitly supports json_instruction_and_object mode, which pairs JSON instructions in the prompt with API-level json_object response format—a pattern already used in 24 other OpenRouter model configurations.
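As a rough illustration of what that pairing means at the request level (conceptual only, not Kiln's adapter code; assumes an OpenAI-compatible chat completions payload):

```python
# json_instruction_and_object, conceptually: schema instructions live in the prompt,
# while the API-level response_format only requests a generic JSON object, avoiding
# OpenRouter's less reliable json_schema mode.
payload = {
    "model": "mistralai/ministral-8b-2512",
    "messages": [
        {
            "role": "user",
            "content": (
                "Extract the fields as JSON matching this schema: "
                '{"name": "string", "age": "number"}. Return only valid JSON.\n\n'
                "Input: Ada, 36"
            ),
        }
    ],
    "response_format": {"type": "json_object"},
}
```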
🤖 Prompt for AI Agents
In @libs/core/kiln_ai/adapters/ml_model_list.py around lines 2333-2390, update
the OpenRouter Mistral/Ministral model entries to use the stable structured
output mode: for KilnModel instances named ModelName.ministral_3_14b_2512,
ModelName.ministral_3_8b_2512, and ModelName.ministral_3_3b_2512, change the
KilnModelProvider entries that currently set StructuredOutputMode.json_schema to
StructuredOutputMode.json_instruction_and_object (leave the existing
mistral_small_creative and the Together AI provider for
ModelName.ministral_3_14b_2512 unchanged).
What does this PR do?
Add models and providers:
Checklists
Summary by CodeRabbit
Release Notes