
feat: configurable auto-route via system configs #187

Merged

think-in-universe merged 3 commits into main from feat/configurable-auto-route on Feb 23, 2026

Conversation

@henrypark133
Contributor

Summary

  • Make the model: "auto" routing for chat completions configurable via system configs instead of hardcoded constants
  • Admins can change model and default parameters via PATCH /v1/admin/configs without code changes or redeployment
  • Follows up on PR #165 per Robert's comment to move auto-route settings into system configs

Behavior

| Scenario | Model | Params |
| --- | --- | --- |
| No config set | GLM-5-FP8 (hardcoded fallback) | All defaults injected (temp=1.0, top_p=0.95, max_tokens=4096) |
| Config: `{"model": "X"}` | X | None injected (upstream decides) |
| Config: `{"model": "X", "temperature": 0.7}` | X | Only temperature injected |
| Config with all fields | Configured model | All configured params injected |
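The decision table above reduces to a small resolution function. This is an illustrative sketch, not the actual code in crates/api/src/routes/api.rs; apart from GLM-5-FP8 and the listed defaults, the names below are assumptions:

```rust
// Hardcoded fallback defaults (values from the PR description).
const FALLBACK_MODEL: &str = "GLM-5-FP8";
const FALLBACK_TEMPERATURE: f64 = 1.0;
const FALLBACK_TOP_P: f64 = 0.95;
const FALLBACK_MAX_TOKENS: u64 = 4096;

#[derive(Debug, Clone, Default)]
struct AutoRouteConfig {
    model: String,
    temperature: Option<f64>,
    top_p: Option<f64>,
    max_tokens: Option<u64>,
}

/// Resolve (model, temperature, top_p, max_tokens) for an auto-routed
/// request. With no config, all fallback defaults are injected; with a
/// config, only explicitly set params are injected (None = upstream decides).
fn resolve_auto_route(
    cfg: Option<&AutoRouteConfig>,
) -> (String, Option<f64>, Option<f64>, Option<u64>) {
    match cfg {
        None => (
            FALLBACK_MODEL.to_string(),
            Some(FALLBACK_TEMPERATURE),
            Some(FALLBACK_TOP_P),
            Some(FALLBACK_MAX_TOKENS),
        ),
        Some(c) => (c.model.clone(), c.temperature, c.top_p, c.max_tokens),
    }
}
```

Note the asymmetry: partial configs do not backfill the hardcoded defaults, so a config that sets only `model` delegates all sampling parameters upstream.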

Example

```bash
curl -X PATCH /v1/admin/configs \
  -H "Authorization: Bearer <admin-token>" \
  -d '{"auto_route": {"model": "new-org/NewModel", "temperature": 0.7}}'
```

Changes

  • crates/services/src/system_configs/ports.rs — Added AutoRouteConfig struct with required model and optional temperature, top_p, max_tokens
  • crates/api/src/models.rs — Added auto_route to public/admin config responses and upsert request
  • crates/api/src/routes/admin.rs — Validation for auto_route fields (non-empty model, valid ranges)
  • crates/api/src/routes/api.rs — Read config from DB on auto-route, graceful fallback to constants on DB failure

No DB migration needed — JSONB storage with #[serde(default)].
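The admin-side validation described above might look like the following sketch. The PR only states "non-empty model, valid ranges", so the exact bounds here are assumptions (and per the review below, the merged code only checks temperature for a lower bound):

```rust
/// Validate an auto_route config fragment before persisting it.
/// Bounds are illustrative: temperature lower-bounded only, top_p in
/// (0.0, 1.0], max_tokens strictly positive.
fn validate_auto_route(
    model: &str,
    temperature: Option<f64>,
    top_p: Option<f64>,
    max_tokens: Option<u64>,
) -> Result<(), String> {
    if model.trim().is_empty() {
        return Err("auto_route.model must be non-empty".to_string());
    }
    if temperature.is_some_and(|t| t < 0.0) {
        return Err("auto_route.temperature must be >= 0.0".to_string());
    }
    if top_p.is_some_and(|p| !(p > 0.0 && p <= 1.0)) {
        return Err("auto_route.top_p must be in (0.0, 1.0]".to_string());
    }
    if max_tokens.is_some_and(|m| m == 0) {
        return Err("auto_route.max_tokens must be > 0".to_string());
    }
    Ok(())
}
```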

Test plan

  • cargo build compiles
  • cargo clippy clean
  • cargo test --features test — unit tests pass (50/50), existing integration tests unaffected
  • Manual: PATCH /v1/admin/configs with auto_route, send model: "auto" request, verify configured model is used
  • Manual: verify fallback behavior when no auto_route config is set

🤖 Generated with Claude Code

Allow admins to change the model and default parameters for `model: "auto"`
chat completion requests via `PATCH /v1/admin/configs` without code changes
or redeployment. Falls back to hardcoded defaults (GLM-5-FP8) when no
config is set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist

Summary of Changes

Hello @henrypark133, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement by making the model: "auto" routing mechanism for chat completions fully configurable through system settings. Previously hardcoded, these parameters can now be managed by administrators via an API endpoint, allowing for dynamic adjustments to the default model and its associated parameters without requiring application redeployment. This change improves operational flexibility and aligns with best practices for system configuration management.

Highlights

  • Configurable Auto-Routing: The model: "auto" routing for chat completions is now configurable via system configs, replacing hardcoded constants.
  • Admin Control: Administrators can dynamically adjust the auto-route model and its default parameters (temperature, top_p, max_tokens) through the PATCH /v1/admin/configs API endpoint without requiring code changes or redeployment.
  • System Config Integration: Introduced a new AutoRouteConfig struct and integrated auto_route settings into the SystemConfigs structure across API models and service ports.
  • Robust Fallback: The system now attempts to load auto-route configurations from the database, gracefully falling back to hardcoded defaults if the configuration is not set or fails to load.
  • Input Validation: Added comprehensive validation for auto_route fields (non-empty model, valid ranges for temperature, top_p, and max_tokens) during system config updates.
Changelog
  • crates/api/src/models.rs
    • Imported AutoRouteConfig from services.
    • Added auto_route field to PublicSystemConfigsResponse, SystemConfigsResponse, and UpsertSystemConfigsRequest structs.
    • Updated From and TryFrom implementations to include the new auto_route field.
  • crates/api/src/routes/admin.rs
    • Implemented validation logic for the auto_route configuration within the upsert_system_configs function, ensuring the model is not empty and parameters like temperature, top_p, and max_tokens are within valid ranges.
  • crates/api/src/routes/api.rs
    • Modified the prepare_chat_completions_body function to fetch auto_route settings from the system configurations service when model: "auto" is requested.
    • Applied the configured model and parameters, with a fallback to existing hardcoded defaults if no auto_route configuration is found or an error occurs during retrieval.
    • Updated the comment for the hardcoded auto-routing constants to reflect their new role as fallback defaults.
  • crates/services/src/system_configs/ports.rs
    • Defined a new AutoRouteConfig struct to encapsulate the target model and optional parameters (temperature, top_p, max_tokens) for auto-routing.
    • Added an auto_route field to the SystemConfigs and PartialSystemConfigs structs.
    • Updated the Default implementation for SystemConfigs to include auto_route: None.
    • Modified the merge method of SystemConfigs to correctly handle the merging of auto_route configurations.
Activity
  • This pull request is a follow-up to PR #165 (feat: route model "auto" to zai-org/GLM-5-FP8), addressing a comment from Robert to move auto-route settings into system configurations.
  • The project successfully compiles (cargo build), passes lint checks (cargo clippy), and all unit tests are passing (cargo test --features test).
  • Manual testing for PATCH /v1/admin/configs and verification of fallback behavior is pending completion.

@claude

claude bot commented Feb 23, 2026

PR Review: feat: configurable auto-route via system configs

The approach is clean and the fallback behavior is well-designed. Two critical issues before merge:


1. ⚠️ DB query on every auto-route request — no caching

SystemConfigsServiceImpl::get_configs() has no caching layer — it calls the repository directly on every invocation (crates/services/src/system_configs/service.rs:20-24). This PR now calls it inline for every model: "auto" request:

```rust
// crates/api/src/routes/api.rs
let auto_config = state
    .system_configs_service
    .get_configs()   // ← uncached DB round-trip on every auto-route request
    .await
    ...
```

Previously the auto-route path used compile-time constants with zero overhead. Now every auto-route request pays a DB round-trip on the hot path. Consider one of:

  • Adding a TTL cache in AppState (analogous to model_settings_cache / near_balance_cache already present in the file) with something like MODEL_SETTINGS_CACHE_TTL_SECS = 60
  • Or caching inside the service implementation
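A minimal version of the suggested TTL cache could look like this. The type and method names are assumptions (the follow-up commit adds a `SystemConfigsCache` along these lines):

```rust
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Single-entry TTL cache: holds one value plus the instant it was stored.
struct TtlCache<T: Clone> {
    ttl: Duration,
    entry: RwLock<Option<(Instant, T)>>,
}

impl<T: Clone> TtlCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entry: RwLock::new(None) }
    }

    /// Return the cached value if it is still fresh, else None.
    fn get(&self) -> Option<T> {
        let guard = self.entry.read().ok()?;
        match &*guard {
            Some((at, v)) if at.elapsed() < self.ttl => Some(v.clone()),
            _ => None,
        }
    }

    /// Store a fresh value with the current timestamp.
    fn put(&self, v: T) {
        if let Ok(mut guard) = self.entry.write() {
            *guard = Some((Instant::now(), v));
        }
    }

    /// Drop the cached value, e.g. after an admin config update,
    /// so the next read goes to the DB.
    fn invalidate(&self) {
        if let Ok(mut guard) = self.entry.write() {
            *guard = None;
        }
    }
}
```

On a cache miss the caller fetches from the DB and calls `put`; the admin update path calls `invalidate` so changes take effect immediately rather than after the TTL expires.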

2. ⚠️ tracing::info! floods production logs on every auto-route request

get_configs() unconditionally logs at info level (crates/services/src/system_configs/service.rs:21):

```rust
async fn get_configs(&self) -> anyhow::Result<Option<SystemConfigs>> {
    tracing::info!("Getting system configs");  // ← fires on every auto-route request now
    self.repository.get_configs().await
}
```

Previously this was only triggered on admin/config endpoints. With this PR it fires for every model: "auto" chat completion. Production runs at info level per CLAUDE.md — this will produce high-volume noise in logs and makes it harder to identify real events.

The log in api.rs (tracing::info!("Auto-routing model: user_id={}", ...)) is fine; the issue is the service-level info log that fires on every get_configs() call. Drop it to debug or remove it.


Minor (non-blocking)

  • Cannot unset auto_route via API once set: partial.auto_route.or(self.auto_route) means sending {"auto_route": null} has no effect — admins need direct DB access to revert to hardcoded defaults. This is consistent with other config fields but worth documenting.
  • No upper bound on temperature: validation only checks >= 0.0. Most providers cap at 2.0; values like 100.0 will be accepted and likely fail downstream.
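The unset problem in the first minor point comes down to `Option` semantics: serde deserializes both a JSON `null` and an absent field to `None`, so the merge cannot distinguish "unset this" from "leave unchanged". A stripped-down illustration (function name is mine, not from the PR):

```rust
/// Merge semantics from the PR: `partial.auto_route.or(self.auto_route)`.
/// A partial of None (whether the field was null or absent) keeps the
/// current value, so the config can never be cleared via the API.
fn merge_auto_route(current: Option<String>, partial: Option<String>) -> Option<String> {
    partial.or(current)
}
```

A common workaround is the double-Option pattern: make the patch type's field `Option<Option<T>>` so an explicit JSON `null` maps to `Some(None)` (clear) while an absent field maps to `None` (keep).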

⚠️ Issues found

@gemini-code-assist bot left a comment


Code Review

This pull request introduces configurable auto-routing for chat completions, allowing administrators to define the target model and default parameters via system configurations. However, there are two concerns. First, the endpoint for updating the auto-route configuration does not validate that the provided model ID actually exists in the proxy service, so a typo could effectively deny service for the auto-routing feature; the validation should be strengthened to check model existence. Second, fetching system configurations from the database on every chat completion request raises a performance concern.

henrypark133 and others added 2 commits February 22, 2026 23:48
- Add SystemConfigsCache (RwLock<Option<CacheEntry>>) with 60s TTL
- Auto-route reads from cache instead of hitting DB per request
- Admin config updates invalidate cache for immediate effect
- Downgrade service-level get_configs log from info to debug

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@think-in-universe think-in-universe merged commit abe4b72 into main Feb 23, 2026
3 checks passed
@think-in-universe think-in-universe deleted the feat/configurable-auto-route branch February 23, 2026 12:47