- Fix compat models not registering in LiteLLM by resolving mapped provider/model
- Make _build_litellm_params async to query database for compat model mappings
- Add session parameter to push_model_to_litellm and update_model_in_litellm
- Compat models now correctly resolve to their mapped provider's api_base and model string
- Set docker-compose.yml to use frontend.api:create_app instead of legacy web
- Clean up legacy documentation from CLAUDE.md
- Remove obsolete planning documents
Fixes issue where code-davinci-002, gpt-4-code, gpt-4-turbo-vision, and
gpt-4o-vision compat models were not appearing in LiteLLM after push.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
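The compat-model fix described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Provider` and `Model` dataclasses, `lookup_mapping`, and the `type/model` string format are assumptions made for the example, and the real `_build_litellm_params` takes a database session whose shape is not shown here.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provider:
    name: str
    type: str  # e.g. "openai", "ollama", "compat"
    api_base: Optional[str] = None


@dataclass
class Model:
    model_id: str
    provider: Provider
    # Only set for compat models: the real provider/model the alias maps to.
    mapped_provider: Optional[Provider] = None
    mapped_model_id: Optional[str] = None


async def lookup_mapping(model: Model) -> tuple:
    """Hypothetical stand-in for the async database query."""
    await asyncio.sleep(0)  # yield control, as an awaited DB call would
    if model.mapped_provider is None or model.mapped_model_id is None:
        raise ValueError(f"compat model {model.model_id!r} has no mapping")
    return model.mapped_provider, model.mapped_model_id


async def build_litellm_params(model: Model) -> dict:
    """Build params from the mapped target when the model is a compat alias."""
    provider, model_id = model.provider, model.model_id
    if provider.type == "compat":
        # Resolve the alias so api_base and model string come from the target.
        provider, model_id = await lookup_mapping(model)
    params = {"model": f"{provider.type}/{model_id}"}
    if provider.api_base:
        params["api_base"] = provider.api_base
    return params
```

The point of the sketch is the ordering: the alias is resolved before any parameters are derived, so a compat model never reaches LiteLLM with an empty `api_base`.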
@@ -4,22 +4,21 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ## Project Overview

-LiteLLM Updater now runs as two FastAPI services built from shared code:
+LiteLLM Updater runs as two FastAPI services built from shared code:
 - `backend/` → headless sync worker (`backend/sync_worker.py`) that fetches provider models and can push them into LiteLLM on a schedule.
 - `frontend/` → UI + API (`frontend/api.py`) for manual fetch/push/sync and CRUD over providers/models.

 Data lives in `./data/models.db` (SQLite) mounted into both services. Docker Compose also brings up LiteLLM (`http://localhost:4000`, API key `sk-1234` by default) and the UI on `http://localhost:4001`.

 ## Terminology (keep consistent in UI + API)
-- Fetch: only pull models from providers into the local database.
-- Push: send database models to LiteLLM (deduped, no new fetch).
-- Sync: fetch + push in one operation.
+- **Fetch**: only pull models from providers into the local database.
+- **Push**: send database models to LiteLLM (deduped, no new fetch).
+- **Sync**: fetch + push in one operation.

-## Current layout
+## Current Layout
 - `shared/`: database models/CRUD, source fetchers, normalization, tags, and config helpers shared by both services.
 - `backend/`: provider sync pipeline, LiteLLM client, and the scheduler entrypoint (`python -m backend.sync_worker`).
 - `frontend/`: FastAPI UI, templates, and the provider/model routes that call into `backend.provider_sync`.
-- `litellm_updater/`: legacy entrypoint kept for compatibility; most logic now lives under `shared/` + `backend/`.
-- `model-updater-web`: UI/API on `http://localhost:4001`.
+- `model-updater-backend`: sync worker (no HTTP). Runs `python -m backend.sync_worker`.
+- `model-updater-web`: UI/API on `http://localhost:4001`. Runs `frontend.api:create_app` via uvicorn.
 - `litellm`: target proxy on `http://localhost:4000` (`Authorization: Bearer sk-1234`).
 - `db`: Postgres backing LiteLLM.
 - `watchtower`: optional image updater (labelled).

+**IMPORTANT:** The `model-updater-web` service MUST use `command: uvicorn frontend.api:create_app --factory --host 0.0.0.0 --port 8000` in docker-compose.yml to run the correct application.
-- Fetch = load models from providers into the database, Push = register existing DB models into LiteLLM, Sync = fetch + push. UI buttons and routes follow this naming (`/api/providers/fetch-all`, `/api/providers/sync-all`, per-provider Fetch/Sync/Push).
+## Operational Notes
+- Fetch = load models from providers into the database, Push = register existing DB models into LiteLLM, Sync = fetch + push.
+- UI buttons and routes follow this naming: `/api/providers/fetch-all`, `/api/providers/sync-all`, per-provider Fetch/Sync/Push.
 - LiteLLM pushes dedupe by lowercasing `unique_id` and pruning duplicates before registration; per-provider Push and Push All avoid re-adding existing models.
 - Ollama details: by default only `/api/tags` is fetched. Set `FETCH_OLLAMA_DETAILS=true` to pull `/api/show` per model; heavy fields (tensors/modelfile/license/etc.) are stripped before storing to keep memory usage low.
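The push-time dedupe mentioned in the operational notes above could be approximated like this. It is a sketch under assumptions: models are represented as plain dicts with a `unique_id` key, and `dedupe_by_unique_id` is a hypothetical helper name rather than a function from the repository.

```python
def dedupe_by_unique_id(models, existing_ids=()):
    """Return the models worth pushing, comparing unique_id case-insensitively.

    `existing_ids` holds the unique_ids already registered in LiteLLM;
    anything matching one of them (ignoring case) is skipped, as are
    later duplicates within `models` itself.
    """
    seen = {uid.lower() for uid in existing_ids}
    to_push = []
    for model in models:
        key = model["unique_id"].lower()
        if key in seen:
            continue  # already registered, or a duplicate earlier in this batch
        seen.add(key)
        to_push.append(model)
    return to_push
```

Keeping the first occurrence and dropping later ones makes the result stable across repeated Push and Push All runs.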
> The flows below describe the legacy `litellm_updater` entrypoint. The Docker services now route through `backend/provider_sync.py` and `frontend/api.py`, but the database behaviors (upserts, orphan handling, effective params) remain the same.
+**Compatibility Models API:**
+- `GET /api/compat/models` - List all compat models
+- `POST /api/compat/models` - Create new compat model mapping
+- `PUT /api/compat/models/{id}` - Update compat model
+- `DELETE /api/compat/models/{id}` - Delete compat model
+- `POST /api/compat/register-defaults` - Register default OpenAI model mappings

 ### Key Data Flow

 **Initial Setup:**
 1. User adds providers in `/admin` (stored in database)
-# Use UI at /sources to refresh, edit, and push models
+# 4. Test via UI
+open http://localhost:4001/sources
 ```

 ## Recent Changes & Gotchas
+- The **frontend service must run `frontend.api:create_app`** (not the legacy `litellm_updater.web`). This is configured via `command:` in docker-compose.yml.
 - Ollama `/api/tags` responses that return a bare list (instead of `{ "models": [...] }`) are now parsed correctly; this fixes empty syncs from some servers.
 - `mode:*` tags are only generated for Ollama providers. OpenAI/compat providers should no longer get `mode:ollama` attached to their models.
 - Duplicate detection when pushing to LiteLLM now reads tags from both `litellm_params` and `model_info`, so older LiteLLM entries without top-level tags are still de-duped.
-- The Providers page uses a **Fetch** button that runs even if sync is disabled for that provider; the sync flag only controls scheduled syncs.
+- The Providers page shows **Fetch**, **Sync**, and **Push** buttons for each provider. The sync flag only controls scheduled syncs from the backend worker.
 - On the Admin page, adding and editing providers happen in modals; the inline add form is gone.
+- Compat models page loads models from all available providers (filtered to exclude type='compat'), not just a hardcoded provider.
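The bare-list handling for Ollama `/api/tags` noted in the gotchas above might look something like the following. This is only a sketch: `extract_ollama_models` is a hypothetical name, and the actual parser in the repository may validate entries differently.

```python
def extract_ollama_models(payload):
    """Accept both `{"models": [...]}` and a bare `[...]` tags response."""
    if isinstance(payload, dict):
        # Documented shape: the model list sits under the "models" key.
        models = payload.get("models") or []
    elif isinstance(payload, list):
        # Some servers return the list directly.
        models = payload
    else:
        models = []
    # Ignore anything that is not a model object.
    return [entry for entry in models if isinstance(entry, dict)]
```

Normalizing both shapes to one list at the boundary means the rest of the fetch pipeline never needs to know which server variant it is talking to.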