Conversation
- Removed litellm/model_prices_and_context_window_backup.json
- Updated get_model_cost_map() to read from a single source
- Refactored _load_local_model_cost_map() to support:
  * Package resources (production/pip install)
  * Project root (development)
- Updated CI/CD workflows:
  * .circleci/config.yml: copy to litellm/ before publishing
  * .github/workflows/simple_pypi_publish.yml: same approach
  * ci_cd/check_files_match.py: updated file paths
- Updated test_get_model_file.py to use the project root
- Renamed test_get_backup_model_cost_map → test_get_local_model_cost_map

Benefits:
- Eliminates file duplication
- Single source of truth for model pricing
- No need for PRs like #16460 to sync backup files
- Simpler maintenance

The backup was unnecessary because CI copies the file to the package before publishing.
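The two-tier loading described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual litellm code: the function signature, the `project_root` parameter, and the path computation are assumptions for the example; the real loader lives in `litellm/litellm_core_utils/get_model_cost_map.py`.

```python
import importlib.resources
import json
from pathlib import Path
from typing import Optional

COST_MAP_FILENAME = "model_prices_and_context_window.json"


def load_local_model_cost_map(
    package: str = "litellm",
    project_root: Optional[Path] = None,
) -> dict:
    """Load the model cost map from package resources (pip install),
    falling back to the project root for development checkouts."""
    try:
        # Production path: CI copies the JSON into the package
        # directory before publishing, so it ships with the wheel.
        resource = importlib.resources.files(package).joinpath(COST_MAP_FILENAME)
        with resource.open("r") as f:
            return json.load(f)
    except (FileNotFoundError, ModuleNotFoundError):
        # Development path: read the single source of truth at the
        # repository root instead of a duplicated backup file.
        root = project_root if project_root is not None else Path(__file__).resolve().parent
        with open(root / COST_MAP_FILENAME) as f:
            return json.load(f)
```

With this shape, a missing package resource transparently falls back to the checkout's root copy, which is what makes the separate backup file unnecessary.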
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
The hook kept the backup JSON in sync with the root file — no longer needed now that the backup file has been removed.
- Update load_local_model_cost_map to use a project-root fallback for dev
- Keep main's validation, aliases, and source-info tracking
- Remove the backup JSON (the purpose of this PR)
Prevent accidental commits of the CI-generated copy.
…or/remove-backup-file-dry-principle

# Conflicts:
#	litellm/model_prices_and_context_window_backup.json
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Remove early return in get_llm_provider_logic.py that prevented the 'openrouter/' prefix from being stripped. The early return was intended for 'native OpenRouter models' like 'openrouter/free', but no such models exist in the model registry — all OpenRouter models are multi-segment (e.g. 'openrouter/anthropic/claude-3.5-sonnet') and need the prefix stripped before being sent to the OpenRouter API. This regression was introduced in v1.82.3 and caused 400 Bad Request errors for all OpenRouter models.
Native models like openrouter/auto, openrouter/free have no extra '/' after the prefix. Provider-routed models like openrouter/anthropic/claude have extra segments and need prefix stripping. Condition: only early-return when model after 'openrouter/' has no '/'.
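The condition above can be sketched as a small helper. This is illustrative only (the function name is hypothetical); the actual fix lives in `get_llm_provider_logic.py`:

```python
def strip_openrouter_prefix(model: str) -> str:
    """Strip 'openrouter/' from provider-routed models while leaving
    single-segment native models (e.g. 'openrouter/auto') intact."""
    prefix = "openrouter/"
    if not model.startswith(prefix):
        return model
    remainder = model[len(prefix):]
    if "/" not in remainder:
        # Native OpenRouter model such as 'openrouter/auto' or
        # 'openrouter/free': no extra segment, keep the prefix.
        return model
    # Provider-routed model such as 'openrouter/anthropic/claude-3.5-sonnet':
    # strip the prefix before sending the name to the OpenRouter API.
    return remainder
```

The key point is that the early return is now conditional on the remainder containing no further `/`, rather than applying to every `openrouter/` model.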
fix: strip 'openrouter/' prefix from model names (#24234)
…24209)

When multiple inputs were passed to the Gemini embedding endpoint and any contained multimodal data (images, audio, etc.), LiteLLM incorrectly used the `embedContent` endpoint, which combines all inputs into a single aggregated embedding. LiteLLM now uses `batchEmbedContents` with each input as a separate request, returning N embeddings for N inputs as expected. Also fixes a hardcoded index=0 in batch embedding responses.
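The fan-out and the index fix described above can be sketched as below. The helper names and request shapes are simplified assumptions for illustration, not litellm's actual transformation code:

```python
from typing import Any, Dict, List


def build_batch_requests(inputs: List[str], model: str) -> List[Dict[str, Any]]:
    """Build one batchEmbedContents entry per input, so N inputs yield
    N separate embeddings (instead of one aggregated embedContent call)."""
    return [
        {"model": model, "content": {"parts": [{"text": item}]}}
        for item in inputs
    ]


def transform_batch_response(embeddings: List[List[float]]) -> List[Dict[str, Any]]:
    """Give each embedding its real position in the response, instead
    of the previously hardcoded index=0 for every item."""
    return [
        {"object": "embedding", "index": i, "embedding": vec}
        for i, vec in enumerate(embeddings)  # fix: enumerate, not a constant 0
    ]
```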
Add zai.glm-5 to model_prices_and_context_window.json with Bedrock Converse routing. Pricing: $1.00/1M input, $3.20/1M output tokens.
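The new entry might look roughly like this. The exact key and fields are assumptions based on the existing cost-map schema; the per-token values are derived from the stated per-million prices ($1.00/1M = 1e-06, $3.20/1M = 3.2e-06):

```json
"zai.glm-5": {
    "input_cost_per_token": 1e-06,
    "output_cost_per_token": 3.2e-06,
    "litellm_provider": "bedrock_converse",
    "mode": "chat"
}
```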
feat(bedrock): add Z.AI GLM-5 model support
…beddings-24209 fix(gemini): return separate embeddings for multimodal inputs
…e-backup-file-dry-principle

# Conflicts:
#	litellm/model_prices_and_context_window_backup.json
…y-principle refactor: Remove redundant backup file
Greptile Summary

This staging PR bundles three independent improvements: (1) removal of the redundant model-prices backup JSON in favor of a single canonical file, (2) a fix for stripping the 'openrouter/' prefix from provider-routed models, and (3) separate embeddings for multimodal Gemini inputs via `batchEmbedContents`.

Key changes:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py | Adds combined multimodal embedding support (nested list syntax) for Gemini batch path; removes multimodal routing to embedContent for gemini/ provider — a backwards-incompatible behavior change. Contains one unused import (_is_multimodal_input). |
| litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py | Refactors part-building logic into _build_part_for_input, adds nested-list support for combined embeddings in transform_openai_input_gemini_content, fixes index tracking in process_response (was always 0), and improves token counting for multimodal inputs. Minor docstring inaccuracy in _is_multimodal_input. |
| litellm/litellm_core_utils/get_model_cost_map.py | Renames backup file references to use the canonical model_prices_and_context_window.json, adds a fallback to the project root for development environments. Silent FileNotFoundError handling may obscure misconfigured installs. |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Fixes GH#24234: OpenRouter provider-routed models like openrouter/anthropic/claude-3.5-sonnet now correctly strip the openrouter/ prefix, while single-segment native models like openrouter/auto preserve it. |
| litellm/types/llms/openai.py | Extends EmbeddingInput type to support nested lists (List[Union[str, List[str]]]) for combined multimodal embeddings — a backwards-compatible type broadening. |
| pyproject.toml | Adds litellm/model_prices_and_context_window.json to the wheel and sdist package includes so it is bundled during pip install. |
| tests/test_litellm/llms/vertex_ai/gemini_embeddings/test_batch_embed_content_transformation.py | Comprehensive mock-only unit tests covering nested list embeddings, multimodal input transformation, index regression fix, and token counting. All tests are mock-based with no network calls. |
| tests/test_litellm/test_deepseek_model_metadata.py | Updated to use GetModelCostMap.load_local_model_cost_map() instead of loading the now-deleted backup JSON directly. Test coverage and assertions are unchanged. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[embedding call] --> B{provider?}
    B -->|vertex_ai| C[use_embed_content = True\nembedContent path]
    B -->|gemini provider| D[use_embed_content = False\nbatchEmbedContents path]
    D --> E{input shape?}
    E -->|flat string or list| F[One request per element\nN separate embeddings]
    E -->|nested list element| G[One request with multiple parts\n1 combined embedding]
    E -->|mixed nested and flat| H[Combined plus separate requests]
    D --> I{has file refs?}
    I -->|yes and api_key present| J[resolve file references]
    I -->|yes but no api_key| K[raise ValueError]
    J --> F
    J --> G
    C --> L[transform_openai_input_gemini_embed_content]
    F --> M[transform_openai_input_gemini_content]
    G --> M
    H --> M
```
Comments Outside Diff (4)
- `litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py`, line 26 (link): **Unused import after refactor**

  `_is_multimodal_input` is imported here but no longer used anywhere in this file after the removal of the `is_multimodal = _is_multimodal_input(input)` call (which was replaced by `use_embed_content = custom_llm_provider == "vertex_ai"`). This unused import should be removed.

- `litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py`, lines 124-126 (link): **Inaccurate docstring: nested text-only lists return False**

  The docstring states the function returns `True` if any element is "a nested list", but the implementation returns `True` only if a nested list contains multimodal elements. A nested list of plain-text strings (e.g., `[["text_a", "text_b"]]`) returns `False`, as confirmed by the `test_nested_text_is_not_multimodal` test.

- `litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py`, line 428 (link): **Backwards-incompatible behavior change for Gemini multimodal users**

  Before this PR, multimodal input (data URIs, GCS URLs, file references) to a `gemini/` provider would route through `use_embed_content = True` (the `embedContent` path), producing a single combined embedding for all inputs. After this change, `use_embed_content` is `False` for all non-`vertex_ai` providers, so multimodal input via `gemini/` now goes through `batchEmbedContents`, producing separate embeddings per input.

  Any existing user who passed multiple multimodal inputs with a `gemini/` provider and expected a single combined embedding will silently get multiple separate embeddings instead, which is a breaking behavioral change with no opt-out flag. Per the project's rule on avoiding backwards-incompatible changes without user-controlled flags, this change should either be gated behind a flag or documented with a migration note.

  Rule Used: What: avoid backwards-incompatible changes without... (source)

- `litellm/litellm_core_utils/get_model_cost_map.py`, lines 53-63 (link): **Silent fallback on `FileNotFoundError` may hide misconfigured installs**

  When `importlib.resources` raises `FileNotFoundError` (e.g., a pip install where the JSON was not bundled correctly), the code silently falls through to the project-root path. If that path also fails (e.g., in a Docker container or any environment where the repo root is not present), an unhandled `FileNotFoundError` is raised with no diagnostic message, making it hard to understand why litellm failed to start.

  Consider adding a warning log on the `FileNotFoundError` path (similar to the `ModuleNotFoundError` path) and wrapping the project-root `open()` in a `try/except` with a clear error message:

  ```python
  except FileNotFoundError:
      verbose_logger.warning(
          "LiteLLM: model_prices_and_context_window.json not found in package resources. "
          "Falling back to project root."
      )
  ```
Last reviewed commit: "refactor: extract _f..."
Allows wrapping multiple inputs in a nested list to produce a single combined embedding (text + image = 1 vector). Flat lists continue to produce separate embeddings per input (the OpenAI-compatible default).

Examples:
- input=["text", "image"] → 2 separate embeddings
- input=[["text", "image"]] → 1 combined embedding
- input=[["text", "image"], "x"] → 2 embeddings (1 combined + 1 separate)
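The input-shape routing the commit message describes can be sketched as follows (an illustrative classifier, not litellm's actual transformation code; the helper name is hypothetical):

```python
from typing import List, Union

EmbeddingInput = List[Union[str, List[str]]]


def plan_requests(inputs: EmbeddingInput) -> List[List[str]]:
    """Each flat element becomes its own single-part request (one
    embedding each); each nested list becomes one multi-part request
    (one combined embedding for all of its parts)."""
    requests: List[List[str]] = []
    for item in inputs:
        if isinstance(item, list):
            requests.append(item)    # combined: many parts -> 1 vector
        else:
            requests.append([item])  # separate: 1 part -> 1 vector
    return requests
```

The number of planned requests equals the number of returned embeddings, which matches the three examples above.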
…l-embeddings feat(gemini): support combined multimodal embeddings via nested input
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Add testing in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement - see details)
- Pass all unit tests via `make test-unit`
- Tag `@greptileai` and receive a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes