Final Requirements Document: Model Routing & Proxy Service (OpenAI/Anthropic Compatible) + Admin Dashboard
Develop a proxy service to handle client requests from OpenAI or Anthropic. The system matches the request based on the user's requested model, provided specifications, and internal rules. Without modifying other fields in the request body, the system replaces only the model name in the request body with the target model, forwards the request to the configured upstream Provider/Vendor, and returns the result to the client. The system must support automatic retries and failover, log key request information and performance metrics to a database, and provide an admin dashboard for configuration and query.
- Transparent Proxy: Compatible with OpenAI/Anthropic client invocation methods, returning compatible format responses.
- Model Name Modification Only: Only allowed to modify the
modelfield during request forwarding; other content of the request body must remain unchanged. - Rule Engine Matching: Process all rules through a rule engine to output a set of available providers and their target models.
- Round Robin Selection: Use a Round Robin strategy to select the current node from the matched provider nodes.
- Reliable Retry & Failover: Execute retries on the same provider or switch to the next provider node based on status codes.
- Token Statistics: Count input/output tokens according to "Standard Large Model Interface Token Counting Methods" and log them.
- Full Observability: Record detailed request logs, including time, model, provider, retries, latency, tokens, request/response, errors, etc.
- Configurable & Manageable: Provide a modern admin panel to support CRUD operations for Providers, Models/Rules, API Cases, and log queries/filtering.
- Multi-Database Engine: Support SQLite and PostgreSQL, defaulting to SQLite; abstract data access to facilitate switching.
- Engineering Quality: All code must have unit tests, and all unit tests must pass during execution.
- Access and proxy forwarding of OpenAI/Anthropic style requests (only modify model name).
- Rule Engine: Match based on context and output candidate providers and target models.
- Round Robin strategy for selecting provider nodes.
- Retry and failover logic.
- Token statistics and database logging.
- Database storage: Providers, Model Mappings/Rules, Strategies, API Cases (API Keys), Request Logs.
- Admin Panel: CRUD for Providers/Models/Rules/API Cases; Request log query and multi-condition filtering.
- No Rate Limiting.
- No Account-Level Control (Account/Tenant quotas, tiered permissions, usage control, etc.).
- Rewriting request/response content other than the
modelfield (Not done). - Permission systems (Admin background login/role permissions) are not defined in this requirement (can be added later).
- Requested Model: The model name carried in the client request body.
- Target Model: The model name replaced into the request body after system matching.
- Provider/Vendor: Upstream provider node, possessing interface address, protocol, and API type.
- Vendor Node: A callable provider node (usually corresponding to a provider configuration record).
- API Case: API Key entity used for authentication (containing
key_nameandkey_value/token). - Strategy: Currently only supports Round Robin.
- Rule Engine Context: The input data set used for rule matching.
- Backend: Python + FastAPI
- Proxy Interface (OpenAI/Anthropic compatible)
- Admin Interface (called by frontend panel)
- Rule Engine
- Provider Client (Multi-protocol/Multi-API adapter)
- Retry & Failover Handler
- Token Counter
- Log Recording & Sanitization Module
- Data Access Abstraction Layer (Repository/DAO) + Multi-DB Adapter (SQLite/PG)
- Frontend: Next.js + TypeScript
- Provider Management
- Model & Rule Management (including Provider-Target Model differentiated configuration)
- API Case Management
- Request Log View & Multi-condition Filtering
- Client initiates request (OpenAI/Anthropic style).
- Backend authentication (API Case token), obtains
api_key_id. - Parse request body and headers, extract
requested_model. - Calculate input Tokens (Standard Token counting method).
- Rule engine processes all rules, outputs candidate provider set based on context (each provider corresponds to a target model).
- Round Robin strategy selects the current provider node from candidate providers.
- Only replace the model field in the request body with the target model corresponding to that provider.
- Forward request to that provider; execute retries on the same provider or switch to the next provider node according to failure strategy.
- Return final response to client.
- Calculate output Tokens, record request logs (sanitize sensitive fields before storage).
- FR-REQ-1: Support receiving OpenAI or Anthropic client requests (compatible with their invocation methods).
- FR-REQ-2: Keep request headers and body unchanged during forwarding, only allowing modification of the model name field.
- FR-REQ-3: Forward requests to the upstream provider interface according to configuration and return the response to the client.
- FR-MDL-1: Get
requested_modelfrom request body. - FR-MDL-2: Determine
target_model(bound to provider) based on rules and strategy. - FR-MDL-3: Only replace the
modelfield in the request body, do not modify other fields (messages/tools/temperature/max_tokens, etc., remain consistent).
- FR-RULE-1: Rule engine must "process all rules" (full evaluation) for every request and output matching results.
- FR-RULE-2: Rule engine context must include:
current_model: Current requested model (requested_model)headers: Request headers (structured object)request_body: Request body (structured object)token_usage: Current Token consumption (at least includes this input Token; extensible)
- FR-RULE-3: Rule engine output must include:
- List of candidate provider nodes (based on matching results)
- Target model corresponding to each candidate provider (Because: the same requested model maps to different target models under different providers)
- Meta-information required for strategy selection (such as priority, weight, availability, etc., extensible)
- FR-STR-1: System must implement strategy mechanism; currently only supports Round Robin.
- FR-STR-2: Round Robin rotates selection among candidate provider nodes.
- FR-STR-3: Round Robin state must be concurrency-safe (can be implemented via DB/Cache/Atomic Counter, specific implementation decided by design).
- FR-RT-1: When upstream response status code ≥ 500:
- Retry on same provider
- Interval 1000ms each time
- Max 3 retries
- FR-RT-2: If 3 retries on the same provider still fail, switch to the next matched provider node and continue trying.
- FR-RT-3: When upstream response status code < 500:
- Do not retry on the same provider
- Directly switch to the next matched provider node and try
- FR-RT-4: When all candidate providers fail, the system should return a failure result to the client (suggest returning the status and error information of the last failure; specific response encapsulation can be defined uniformly during implementation).
- FR-TOK-1: Input Token: Use "Standard Large Model Interface Token Counting Method" to count user requests.
- FR-TOK-2: Output Token: Count output Tokens for upstream responses.
- FR-TOK-3: Input/Output Tokens must be written to request logs.
Note: Token counting method needs to be consistent with the target protocol (OpenAI/Anthropic); implementation can use pluggable counters, selecting counting method by endpoint/protocol.
- FR-LOG-1: All requests must record detailed logs, fields at least include:
- Request time (
request_time) - API Case:
api_key_id/api_key_name requested_model,target_model- Provider:
provider_id/provider_name retry_countfirst_byte_delay(TTFB)total_timeinput_tokens,output_tokensrequest_headers(structured, after sanitization)request_body(structured)response_statusresponse_bodyerror_info(Error information: structured or text)
- Request time (
- FR-LOG-2: Logs must support multi-condition filtering queries (for frontend log page use).
- FR-SEC-LOG-1:
request_headersrecorded in logs must sanitize/mask theauthorizationfield before storage.- Example requirement: Keep field name, mask the value part (e.g.,
Bearer *****or only keep first/last few characters, mask the middle).
- Example requirement: Keep field name, mask the value part (e.g.,
- FR-SEC-LOG-2: Sanitization should be handled uniformly before storage, ensuring plain text authorization is not saved in the database.
- FR-KEY-1: System provides API Case table, containing:
- API key name
- API key value (token)
- FR-KEY-2: API key value generated by random algorithm.
- FR-KEY-3: The ID of the API Case used for each request must be logged (
api_key_id).
- FR-DB-1: Support SQLite and PostgreSQL (PG) storage engines.
- FR-DB-2: Default to SQLite.
- FR-DB-3: Switch database engine via configuration (e.g., environment variable or configuration file).
- FR-DB-4: Database access must be abstracted (Repository/DAO Interface), business logic must not couple with specific database implementation.
- FR-DB-5: Common data access patterns, transaction management, pagination queries, etc., should be consolidated in a common package to avoid duplicate implementation.
8. Data Models (Recommended Structure, Supports "Same Model + Different Provider = Different Target Model")
Specific field types subject to implementation; suggest unifying designs compatible with SQLite/PG and maintaining via migration tools.
id(PK)namebase_url(Interface Address)protocol(Protocol/Compatibility Type)api_type/api_nameis_active(Suggested)created_at/updated_at(Suggested)
requested_model(PK)strategy(Currently fixed: Round Robin)matching_rules(Model layer rules, optional; format defined by rule engine)capabilities/functionality(Optional)created_at/updated_at(Suggested)
8.3 Model-Provider Mapping Table: model_mapping_providers (Key: Each provider can have different target_model)
id(PK)requested_model(FK -> model_mappings.requested_model)provider_id(FK -> service_providers.id)target_model_name(Target model name corresponding to this provider)provider_rules(Optional: Provider-level rules for finer-grained control; format same as rule engine definition)priority(Optional)weight(Optional)is_active(Suggested)created_at/updated_at(Suggested)
id(PK)key_name(unique)key_value(randomly generated token)is_active(Suggested)created_at/last_used_at(Suggested)
id(PK)request_timeapi_key_id(FK -> api_keys.id)api_key_name(Redundant allowed)requested_modeltarget_modelprovider_id(FK -> service_providers.id)retry_countfirst_byte_delay_mstotal_time_msinput_tokensoutput_tokensrequest_headers(JSON / JSONB; Sanitized)request_body(JSON / JSONB)response_statusresponse_body(JSON/TEXT)error_info(JSON/TEXT)trace_id(Suggested)
- Core interfaces compatible with OpenAI/Anthropic (Specific path set determined by implementation).
- Behavior: Auth -> Rule Match -> Round Robin Provider Selection -> Replace Model -> Forward -> Retry/Switch -> Return -> Log.
/admin/providers: Provider CRUD/admin/models: Model Mapping CRUD (including rule fields)/admin/model-providers: Model-Provider Mapping CRUD (requested_model + provider_id + target_model + provider_rules)/admin/api-keys: API Case CRUD/admin/logs: Log Query (Pagination + Multi-condition Filtering)/admin/logs/{id}: Log Detail
- FE-UI-1: Modern design style (clear hierarchy, unified spacing, responsive layout, good interaction feedback).
- FE-UI-2: Common component consolidation: Table, Form, Modal, Pagination, Filter, JSON Display/Edit components, etc., avoiding duplicate code.
- FE-UI-3: List pages support pagination, sorting, search; operations require confirmation and result notification.
- List: Display ID, name, base_url, protocol, api_type/api_name, status, update time, etc.
- Add/Edit: Form validation (Required, URL format, etc.).
- Delete: Double confirmation; Prompt if referenced (specific constraints decided by backend).
- Model Mapping (
model_mappings) CRUD:requested_model, strategy (Round Robin), model layer rules (if enabled), functionality description, etc.
- Model-Provider Mapping (
model_mapping_providers) CRUD (Key):- Under the same
requested_model, can configure multiple providers - Each provider can configure a different
target_model_name - Can configure
provider_rules(if enabled) to support finer matching logic - Can configure priority/weight (Optional)
- Under the same
- Rule Editor:
- Needs to support rule setting and validation, rules can reference context:
model,headers,request_body,token_usage. - Suggested Form (Implementation Selection):
- Structured Rule Editor (Preferred, reduces manual errors)
- JSON Rule Editor (Fallback)
- Suggest evaluating open-source components: Monaco Editor (JSON + Schema validation), or JSON Schema-based form editor, etc. (Selected during implementation phase)
- Needs to support rule setting and validation, rules can reference context:
- List default sorted by time descending.
- Must support filtering conditions (at least):
- Time range (Start/End)
requested_model/target_model(Fuzzy)provider(Dropdown)response_status(Exact/Range, e.g., 2xx/4xx/5xx or >=500)- Has Error (
error_infois not empty) api_key_id/api_key_nameretry_count(=0 / >0)- Token range (Optional)
- Total time range (Optional)
- Log Detail: Display sanitized
headers, structuredrequest_body,response_body,error_info, and support copy/collapse.
- List: id, key_name, key_value (Hidden by default, copyable display strategy), status, create time, last used time, etc.
- Add: Input
key_name,key_valuegenerated randomly by backend and returned; provide copy entry after successful creation. - Edit: Allow modifying name/status (Whether to support resetting key_value can be an extension).
- Delete: Double confirmation.
- NFR-ARCH-1: Adopt clear layering, avoid writing business logic or SQL directly in the route layer.
- NFR-ARCH-2: Business logic (Service) depends on Repository Interface rather than concrete database implementation.
- NFR-ARCH-3: Common capabilities extracted to common package (
common) to avoid duplicate code:- Retrier, HTTP Client Wrapper, Timer, Token Counter, Sanitizer, Error Wrapper, Config Loading, etc.
- NFR-ARCH-4: Module responsibility single, naming standard, testable, extensible.
app/api/(Route Layer)services/(Business Orchestration: Match/RoundRobin/Forward/Retry/Log)rules/(Rule Engine: Rule Definition, Context, Executor)providers/(Upstream Adapter: openai-like / anthropic-like)repositories/(Repository Interface)repositories/sqlalchemy/(SQLite/PG Implementation)db/(Connection, Session, Migration)domain/(Domain Model/DTO)common/(Common Capabilities)tests/
- NFR-TEST-1: All code must have unit tests.
- NFR-TEST-2: All unit tests must pass during execution/delivery (Failure blocks).
- NFR-TEST-3: Key coverage scope (at least):
- Rule Engine: Matching logic with context containing headers/request_body/token_usage/model
- Model Replacement: Only modify model, do not modify other fields
- Round Robin Strategy: Multi-node rotation correctness and concurrency consistency (Select testable scheme by design)
- Retry/Switch: ≥500 Same provider 1000ms * 3; < 500 Direct switch; Node exhaustion behavior
- Provider Forwarding: Request pass-through, Response pass-through, Error handling
- Token Counting: Input/Output statistics and logging
- Repository: Basic read/write and consistency under SQLite (default); Contract testing for extensible PG adapter
- Sanitization: Verification of authorization field masking before storage
- NFR-TEST-4: External dependencies injectable/Mockable (DB, Upstream HTTP, Time, Random Number), ensuring stable and repeatable tests.
- Proxy link available: Request access, Rule matching, Round Robin selection, Only replace model, Forward, Retry/Switch per rules, Return response.
- Rule engine context includes: model, headers, request_body, token_usage; and can output candidate providers and their
target_model. - Log full storage: Fields complete; authorization sanitized; Queryable by conditions.
- Database support: Default SQLite runnable; Switch to PG without changing business code (Only config switch + implementation layer adapter).
- Admin Panel available: Provider CRUD, Model/Rule/Provider Target Model Configuration CRUD, API Case CRUD, Log Query & Multi-condition Filtering.
- Unit test coverage meets requirements, and all unit tests pass during project execution.
- Clarify the specific endpoint list and field difference handling strategies for OpenAI/Anthropic compatibility in the first phase (Maintain "Only Change Model" principle).
- Define rule format (JSON/DSL) and validation mechanism, and synchronize frontend rule editor selection (Structured preferred + JSON editor fallback).
- Determine Multi-DB technical scheme (e.g., Unified ORM/Migration Tool), implement Repository interface and implementation separation.
- Establish test baseline: Test framework, Mock standards, Contract test templates, and CI gate process.