Conversation
…ndices
- [Error fixed] isdigit() rejected negative string indices like "-1"; use try/except int()
- Add Case 1: when both role and index are set, filter by role first, then apply index within that subset (e.g., last assistant message). This seems to be the obvious desired behavior if someone sets both 'message type' and 'message index'.
- Add unit tests in tests/litellm/integrations/ covering both fixes (12 tests)
Fixes #23757 When the Anthropic pass-through endpoint converts messages to OpenAI format, content fields can be lists of content blocks rather than plain strings. map_system_message_pt assumed string content and crashed with TypeError on concatenation. Add _get_content_as_str() helper that normalizes both str and list content to a string before merging, using the existing convert_content_list_to_str utility. Tests: 3 new test cases covering list content, mixed str/list, and list content as last message.
When content is None (e.g. assistant messages with only tool_calls), the function previously fell through to str(None) producing the literal string 'None'. Now returns empty string for None content. Added test case for this scenario.
…o_handler for model "anthropic/deepseek-reasoner" (#1) When using model: anthropic/deepseek-reasoner in config.yaml (using DeepSeek's Anthropic-compatible API), `litellm/llms/anthropic/chat/handler.py` is called, but it assumes it will receive valid JSON; DeepSeek returns `data: [DONE]`, which triggers a JSONDecodeError.
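The shape of the fix can be sketched as a guarded SSE line parser (the function name `parse_sse_data_line` is illustrative; the actual fix wraps the existing parse site in the handler):

```python
import json


def parse_sse_data_line(line: str):
    """Parse one SSE "data:" line, returning None for non-JSON payloads.

    Illustrative sketch: Anthropic-compatible endpoints such as DeepSeek's
    emit a "data: [DONE]" sentinel that is not valid JSON, so the parse is
    wrapped in try/except instead of letting JSONDecodeError crash the stream."""
    payload = line.removeprefix("data:").strip()
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        # e.g. the "[DONE]" sentinel: ignore rather than crash
        return None
```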
…ing space Address review feedback: - Shallow-copy next_m before mutating to avoid side-effects on caller's dict - Use ' '.join(filter(None, ...)) to prevent trailing space when next_text is empty
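Both review fixes in one hypothetical sketch (the function name and exact shape are illustrative, not the PR's code): shallow-copy before mutating, and `' '.join(filter(None, ...))` so an empty part contributes no trailing space.

```python
def merge_consecutive_user_text(prev_text: str, next_m: dict) -> dict:
    """Illustrative sketch of the two review fixes:
    - dict(next_m) shallow-copies so the caller's message dict is untouched
    - filter(None, ...) drops empty strings, so joining never leaves a
      leading/trailing space when one part is empty."""
    next_m = dict(next_m)  # shallow copy: no side effects on caller's dict
    next_text = next_m.get("content") or ""
    next_m["content"] = " ".join(filter(None, [prev_text, next_text]))
    return next_m
```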
- Return empty string instead of str(content) for unexpected types in _get_content_as_str - Add role assertion in test_supports_system_message_none_content - Apply black formatting to factory.py and test_optional_params.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Adding pricing and model data for Bedrock Z.AI GLM 5 model.
* Adding pricing and model data for Bedrock Z.AI GLM 5 model. @greptileai
* Adding pricing and model data for Bedrock Z.AI GLM 5 model. @greptileai
#24112)
* fix(otel): use completion_tokens_details for Chat Completions API reasoning tokens

  The OTEL callback looked up output_tokens_details, which only exists on ResponseAPIUsage. Chat Completions API uses completion_tokens_details. Also add prompt_tokens_details extraction for cached_tokens. Fixes #23990
* fix: use getattr instead of .get() for typed token details objects

  CompletionTokensDetailsWrapper and PromptTokensDetailsWrapper don't have a .get() method, unlike OutputTokensDetails. Use getattr() to safely access attributes on both dict-like and typed objects. Addresses Greptile review feedback.
* style: run black formatter on _utils.py
* test: add tests for completion_tokens_details and prompt_tokens_details extraction

  Adds three tests to verify _set_usage_outputs correctly handles:
  - Chat Completions API: completion_tokens_details (reasoning_tokens) and prompt_tokens_details (cached_tokens)
  - Responses API: fallback to output_tokens_details for reasoning_tokens
  - Basic usage: no token details present, no spurious attributes set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: RheagalFire <arishalam121@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
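The extraction logic described in the commit can be sketched as follows (the function name `extract_reasoning_tokens` is illustrative; the real change lives inside `_set_usage_outputs`):

```python
def extract_reasoning_tokens(usage: dict):
    """Illustrative sketch: prefer Chat Completions' completion_tokens_details,
    fall back to the Responses API's output_tokens_details, and read the
    field with getattr() so typed wrapper objects (which lack .get()) work."""
    details = usage.get("completion_tokens_details")
    if details is None:
        details = usage.get("output_tokens_details")
    if details is None:
        return None
    # getattr works on both typed wrappers and attribute-bearing objects
    return getattr(details, "reasoning_tokens", None)
```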
…n across pods (#24246)
* fix(key_rotation): add distributed lock to prevent concurrent rotation across pods
* address greptile review: refresh redis_cache on late init, document lock TTL constraint
* Apply suggestion from @greptile-apps[bot]

  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* style: run black 23.x formatter on changed files

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Aarish Alam <arishalam121@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
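The acquire/rotate/release-in-finally pattern can be illustrated with a toy in-process lock (the `InMemoryPodLock` class is a hypothetical stand-in for the Redis-backed PodLockManager; only the control flow is the point):

```python
import time
import uuid


class InMemoryPodLock:
    """Toy stand-in for a Redis-backed pod lock: acquire with a TTL,
    rotate only while holding, release in a finally block."""

    def __init__(self):
        self._holder = None
        self._expires = 0.0

    def acquire(self, pod_id: str, ttl: float) -> bool:
        now = time.time()
        if self._holder is None or now >= self._expires:
            self._holder, self._expires = pod_id, now + ttl
            return True
        return self._holder == pod_id  # re-entrant for the same pod

    def release(self, pod_id: str) -> None:
        if self._holder == pod_id:
            self._holder = None


lock = InMemoryPodLock()
pod = uuid.uuid4().hex


def process_rotations():
    # skip when another pod already holds the rotation lock
    if not lock.acquire(pod, ttl=60):
        return "skipped"
    try:
        return "rotated"  # ... perform the actual key rotation here ...
    finally:
        lock.release(pod)  # released on success and on failure alike
```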
…24312) The Tab label for 'Price Data Reload' was already guarded with all_admin_roles.includes(userRole), but the corresponding TabPanel was rendered unconditionally, causing the PriceDataManagementTab component to mount for all users. This component polls admin-only routes every ~30s: - GET /schedule/model_cost_map_reload/status - GET /model/cost_map/source For internal_user sessions, each poll produced a 401 ERROR log. Align the TabPanel render condition with the existing Tab guard. Fixes #24308
Greptile Summary

This staging PR bundles several independent fixes and features, including configurable CORS settings.

Key changes:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/proxy_cli.py | Adds early CORS configuration loading from config.yaml before proxy_server is imported; includes _normalize_cors_value, _apply_cors_settings_from_general_settings, _process_config_includes, and _resolve_os_environ_refs helpers. Include-file merge precedence (scalar overwrite) is undocumented. asyncio.run() inside _load_general_settings_for_early_cors for GCS path noted in prior threads. |
| litellm/proxy/proxy_server.py | Replaces hardcoded origins=["*"] / allow_credentials=True with env-var-driven CORS configuration; fixes a silent bug in _check_for_os_environ_vars where list items were never actually updated (reassigned loop variable was discarded). The _check_for_os_environ_vars fix is a behaviour change for existing deployments using os.environ/ refs inside YAML lists. |
| litellm/proxy/common_utils/callback_utils.py | Eagerly instantiates known custom-logger callbacks (e.g. "otel") at startup via _add_custom_logger_callback_to_specific_event, then still appends the string to imported_list/litellm.callbacks. The dual-registration pattern (instance in success/failure lists + string in callbacks) risks double-registration if any code path re-processes the string on hot-reload or async model load. |
| litellm/proxy/common_utils/key_rotation_manager.py | Adds optional distributed-lock support to KeyRotationManager.process_rotations() via PodLockManager; lock is acquired before rotation and released in a finally block. Lock TTL caveat is documented. New tests cover acquire, skip-when-held, release-on-success, release-on-failure, and no-lock-manager paths. |
| litellm/litellm_core_utils/prompt_templates/factory.py | Fixes map_system_message_pt to handle list-typed message content (Anthropic pass-through format) and None content (tool-call messages) by introducing _get_content_as_str; avoids in-place mutation of the caller's message dicts by shallow-copying before modification. Remaining formatting-only changes are cosmetic (black/ruff reformatting). |
| litellm/integrations/anthropic_cache_control_hook.py | Adds a new Case 1 for combined role+index targeting (e.g. "last assistant message") with correct IndexError handling; existing index-only and role-only cases are preserved as Cases 2 and 3 respectively. |
| litellm/integrations/arize/_utils.py | Extends _set_usage_outputs to read completion_tokens_details/output_tokens_details and prompt_tokens_details/input_tokens_details for reasoning and cache-read token counts. Uses or short-circuit for field lookup — already flagged in prior thread; no new change needed here. |
| litellm/llms/anthropic/chat/handler.py | Wraps SSE chunk JSON parsing in a try/except to gracefully ignore non-JSON data lines (e.g. data: [DONE] from DeepSeek-compatible endpoints) instead of crashing. Safe and well-scoped fix. |
| tests/test_litellm/proxy/common_utils/test_callback_utils.py | Adds tests for early callback instantiation; test_initialize_callbacks_on_proxy_instantiates_otel creates a real OpenTelemetry instance without mocking in the mock-only tests/test_litellm/ folder, potentially violating the CI/CD mock-only rule (Rule 07fb23f7). |
| tests/proxy_unit_tests/test_cors_config.py | Comprehensive unit tests for all CORS helper functions using mocks, temp files, and monkeypatching; no real network calls. Good coverage of edge cases (empty list, env var precedence, invalid types). |
Sequence Diagram
```mermaid
sequenceDiagram
    participant CLI as proxy_cli.py (run_server)
    participant EarlyLoad as _load_general_settings_for_early_cors
    participant Env as os.environ
    participant Server as proxy_server.py (module import)
    participant CORS as CORSMiddleware
    CLI->>EarlyLoad: config file path / bucket name
    EarlyLoad->>EarlyLoad: yaml.safe_load + _process_config_includes
    EarlyLoad->>EarlyLoad: _resolve_os_environ_refs (os.environ/ refs)
    EarlyLoad-->>CLI: general_settings dict
    CLI->>CLI: _apply_cors_settings_from_general_settings(general_settings)
    CLI->>CLI: _normalize_cors_value(cors_allow_origins)
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_ORIGINS) [only if unset]
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_CREDENTIALS)
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_METHODS)
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_HEADERS)
    Note over CLI,Server: First import of proxy_server reads env vars at module level
    CLI->>Server: import app (triggers module-level CORS init)
    Server->>Env: os.getenv(LITELLM_CORS_ALLOW_ORIGINS)
    Server->>Server: _get_cors_allow_list() → cors_allow_origins
    Server->>Server: _get_cors_allow_credentials() → cors_allow_credentials
    Server->>Server: wildcard + credentials guard check
    Server->>CORS: app.add_middleware(CORSMiddleware, allow_origins=..., allow_credentials=...)
```
Reviews (2): Last reviewed commit: "Merge pull request #23782 from voidborne..."
```python
if raw_value.strip() == "":
    return False
return raw_value.strip().lower() in {"1", "true", "yes"}
```
Credentials default flips for users with explicit wildcard origins
_get_cors_allow_credentials returns not origins_were_configured, meaning that as soon as any value is set for LITELLM_CORS_ALLOW_ORIGINS — including "*" — allow_credentials silently flips from True to False unless the user also sets LITELLM_CORS_ALLOW_CREDENTIALS=true.
For a user who migrates to the new config and writes:

```yaml
general_settings:
  cors_allow_origins: "*"
```

they will see allow_credentials=False (CORS credentialed requests blocked) even though their intent was "same as default". While wildcard + credentials is rejected by browsers and was technically a broken setting, the silent default-flip may surprise users who were relying on the permissive default and are now pinning the origin to * explicitly. The PR description and docs should call this out clearly, and users who set explicit origins via the new env vars should be encouraged to also set cors_allow_credentials explicitly.
Rule Used: What: avoid backwards-incompatible changes without... (source)
```python
if value is None:
    return None
if isinstance(value, str):
    return ",".join([item.strip() for item in value.split(",") if item.strip()])
if isinstance(value, list):
    if not all(isinstance(item, str) for item in value):
        raise ValueError(
            f"Invalid CORS setting for '{setting_name}': expected a list of strings."
        )
    return ",".join([item.strip() for item in value if item.strip()])
raise ValueError(
```
Empty list normalizes to "" — sets env var, may surprise users
When a user passes cors_allow_origins: [], _normalize_cors_value returns "" (an empty string), and _set_env_var_if_unset writes that to LITELLM_CORS_ALLOW_ORIGINS. _get_cors_allow_list then interprets the empty string as [], which blocks all cross-origin traffic.
This is intentional (as confirmed by test_normalize_cors_value_empty_list), but the asymmetry with None (which leaves the env var unset and falls back to the wildcard default) is easy to misread. Consider adding an explicit note in the ConfigGeneralSettings field description explaining that an empty list explicitly disables all CORS (rather than falling back to the wildcard default).
```python
completion_tokens_details = usage.get("completion_tokens_details") or usage.get(
    "output_tokens_details"
)
if completion_tokens_details is not None:
    reasoning_tokens = getattr(completion_tokens_details, "reasoning_tokens", None)
```
`or` short-circuits on falsy objects, may skip valid completion_tokens_details
```python
completion_tokens_details = usage.get("completion_tokens_details") or usage.get(
    "output_tokens_details"
)
```

If completion_tokens_details is present but evaluates as falsy (e.g. a CompletionTokensDetailsWrapper() instance with all-zero fields where __bool__ returns False, or an empty dict {}), Python's or will silently fall through to output_tokens_details. This is unlikely in practice today, but a safer pattern would be an explicit None check:
```python
completion_tokens_details = usage.get("completion_tokens_details")
if completion_tokens_details is None:
    completion_tokens_details = usage.get("output_tokens_details")
if completion_tokens_details is not None:
    reasoning_tokens = getattr(completion_tokens_details, "reasoning_tokens", None)
```
The same applies to prompt_tokens_details / input_tokens_details a few lines below.
```python
self.db_spend_update_writer.pod_lock_manager.redis_cache = redis_cache
from litellm.proxy.proxy_server import key_rotation_pod_lock_manager

if key_rotation_pod_lock_manager is not None:
```

Circular import risk: importing from proxy_server inside a function in utils.py

```python
from litellm.proxy.proxy_server import key_rotation_pod_lock_manager
```

proxy_server.py already imports heavily from proxy/utils.py. Adding a reverse import — even inside a function body — creates a potential circular dependency that can silently resolve to a stale module reference at the time the function executes. Consider storing key_rotation_pod_lock_manager in a shared lightweight module (or passing it as a parameter to ProxyLogging) rather than pulling it back from proxy_server at call time.
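The suggested shared-module pattern can be sketched in one file (the module/function names here are hypothetical; in practice the namespace would be a small standalone module that both proxy_server.py and utils.py import):

```python
import types

# stands in for a tiny module like litellm/proxy/shared_state.py that
# neither proxy_server.py nor utils.py depends on transitively
shared_state = types.SimpleNamespace(key_rotation_pod_lock_manager=None)


def set_lock_manager(manager):
    """Called once by proxy_server at startup."""
    shared_state.key_rotation_pod_lock_manager = manager


def get_lock_manager():
    """Called by utils.py at rotation time; no reverse import needed."""
    return shared_state.key_rotation_pod_lock_manager
```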
```python
    import. This early pass supports literal values, os.environ/ references,
    and local file-path include files. Secret-manager-backed values and include
    directives inside S3/GCS configs are only resolved later in startup and
    therefore cannot affect the initial middleware config.
    """
    config: Optional[dict] = None

    if os.environ.get("LITELLM_CONFIG_BUCKET_NAME") is not None:
        import asyncio

        from litellm.proxy.common_utils.load_config_utils import (
            get_config_file_contents_from_gcs,
```
asyncio.run() fails inside a running event loop
The GCS bucket path calls asyncio.run(get_config_file_contents_from_gcs(...)). This raises RuntimeError if an event loop is already running (e.g. in pytest-asyncio or if startup is ever refactored to be async). The S3 path is synchronous and unaffected.
Consider using nest_asyncio or restructuring this call to use a sync GCS wrapper already available in the codebase, so the function behaves consistently across all config sources.
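A loop-aware wrapper is one way to make the call safe in both contexts. The helper below is a sketch under stated assumptions (the name `run_coro_sync` is hypothetical; it is not code from this PR):

```python
import asyncio
import concurrent.futures


def run_coro_sync(coro):
    """Run a coroutine to completion from synchronous code.

    Hypothetical sketch: uses asyncio.run() when no event loop is running;
    otherwise hands the coroutine to a short-lived worker thread (which gets
    its own fresh loop), so asyncio.run() is never called inside an active
    loop and no RuntimeError is raised."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)  # no loop running: plain asyncio.run is safe
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```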
```python
    Regression test for: when store_model_in_db=true, OTEL callback was
    never instantiated because initialize_callbacks_on_proxy() only added
    the string "otel" to litellm.callbacks without creating the instance.
    """
    import litellm
    from litellm.proxy.common_utils.callback_utils import (
        initialize_callbacks_on_proxy,
    )
    from litellm.integrations.opentelemetry import OpenTelemetry
    from litellm.litellm_core_utils import litellm_logging
    from litellm.proxy import proxy_server

    # Save original state
    original_callbacks = litellm.callbacks[:]
    original_success = litellm.success_callback[:]
    original_async_success = litellm._async_success_callback[:]
    original_failure = litellm.failure_callback[:]
    original_async_failure = litellm._async_failure_callback[:]
    original_otel_logger = getattr(proxy_server, "open_telemetry_logger", None)
    original_in_memory_loggers = litellm_logging._in_memory_loggers[:]

    try:
        # Clear callbacks
        litellm.callbacks = []
        litellm.success_callback = []
        litellm._async_success_callback = []
        litellm.failure_callback = []
        litellm._async_failure_callback = []

        initialize_callbacks_on_proxy(
            value=["otel"],
            premium_user=False,
            config_file_path="",
            litellm_settings={},
        )

        # Verify an OpenTelemetry instance exists in success callbacks
        otel_in_success = any(
            isinstance(cb, OpenTelemetry)
            for cb in litellm.success_callback + litellm._async_success_callback
        )
        assert otel_in_success, (
            "OpenTelemetry instance not found in success callbacks. "
            f"success_callback={litellm.success_callback}, "
            f"_async_success_callback={litellm._async_success_callback}"
        )

        # Verify an OpenTelemetry instance exists in failure callbacks
        otel_in_failure = any(
            isinstance(cb, OpenTelemetry)
            for cb in litellm.failure_callback + litellm._async_failure_callback
        )
        assert otel_in_failure, "OpenTelemetry instance not found in failure callbacks."

        # Verify open_telemetry_logger is set on proxy_server
        assert proxy_server.open_telemetry_logger is not None, (
            "proxy_server.open_telemetry_logger should be set after "
```
Real OpenTelemetry instance created in mock-only test folder
test_initialize_callbacks_on_proxy_instantiates_otel calls initialize_callbacks_on_proxy(value=["otel"], ...) without mocking _add_custom_logger_callback_to_specific_event or the OpenTelemetry class itself. This causes a real OpenTelemetry instance to be constructed inside tests/test_litellm/, which is supposed to contain only mock-based tests to ensure reliable CI/CD execution.
While the OpenTelemetry.__init__ in LiteLLM doesn't typically make network connections immediately, it does set up real span processors and exporters that may have initialization-time side effects (e.g., spawning background threads, attempting OTLP endpoint connections if OTEL_EXPORTER_OTLP_ENDPOINT is set in the environment). This can cause flaky failures in CI if any OTEL env vars are set.
The complementary test test_initialize_callbacks_on_proxy_calls_instantiation_for_known_callbacks (lines 154–176) correctly mocks this path. Consider aligning the first test to the same pattern:
```python
@patch("litellm.utils._add_custom_logger_callback_to_specific_event")
def test_initialize_callbacks_on_proxy_instantiates_otel(mock_add_callback):
    from litellm.proxy.common_utils.callback_utils import initialize_callbacks_on_proxy

    initialize_callbacks_on_proxy(
        value=["otel"],
        premium_user=False,
        config_file_path="",
        litellm_settings={},
    )
    # Verify early-instantiation path is taken
    assert mock_add_callback.call_count == 2
    mock_add_callback.assert_any_call("otel", "success")
    mock_add_callback.assert_any_call("otel", "failure")
```

Rule Used: What: prevent any tests from being added here that... (source)
```python
        isinstance(callback, str)
        and callback in litellm._known_custom_logger_compatible_callbacks
    ):
        # Eagerly instantiate the callback class (e.g. OpenTelemetry)
        # so that proxy-level globals like open_telemetry_logger are
        # set at startup. Without this, store_model_in_db=true causes
        # the async model-loading path to skip callback instantiation
        # entirely, leaving the logger as None.
        #
        # Each event is registered independently so a failure in one
        # does not leave the callback half-registered.
        from litellm.utils import (
            _add_custom_logger_callback_to_specific_event,
        )

        for event in ("success", "failure"):
            try:
                _add_custom_logger_callback_to_specific_event(callback, event)
            except Exception as e:
                verbose_proxy_logger.error(
                    f"Failed to initialize callback '{callback}' "
                    f"for {event} event at startup: {e}. "
                    "Check that the required environment variables "
                    "are set."
                )
        # Always add to imported_list so litellm.callbacks stays in
        # sync — it is read by health endpoints, hot-reload, and
        # spend tracking. On success the instance is already in the
        # success/failure lists; the string here keeps the canonical
        # config list complete.
        imported_list.append(callback)
    elif isinstance(callback, str) and callback == "presidio":
        from litellm.proxy.guardrails.guardrail_hooks.presidio import (
```
Callback string in `litellm.callbacks` may cause double-registration on hot-reload
The new code eagerly calls _add_custom_logger_callback_to_specific_event(callback, event) to add the live instance to success_callback/failure_callback, and then also appends the raw string (e.g. "otel") to imported_list. imported_list later becomes litellm.callbacks.
If any downstream code path (e.g., hot-reload, store_model_in_db async load, or spend-tracking startup) re-processes litellm.callbacks and calls _add_custom_logger_callback_to_specific_event for each string entry it finds, the callback instance will be registered a second time — leading to duplicate log events or double-counting in spend tracking.
The comment explains the intent ("the string keeps the canonical config list complete"), but without an explicit guard that prevents the string from being re-instantiated on subsequent passes, this is fragile. Consider either:
- Replacing the string in `imported_list` with the already-created instance (so `litellm.callbacks` contains the object, not the string), or
- Adding a dedup check in `_add_custom_logger_callback_to_specific_event` that skips registration when an instance of the same class already exists in the target list.
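The second option can be sketched as a small guard (the function `add_callback_once` is illustrative; the real dedup would live inside `_add_custom_logger_callback_to_specific_event`):

```python
def add_callback_once(callback_instance, target_list) -> bool:
    """Illustrative dedup guard: skip registration when an instance of the
    same class is already present in the target callback list. Returns True
    if the instance was added, False if it was a duplicate."""
    if any(type(cb) is type(callback_instance) for cb in target_list):
        return False  # already registered: avoid duplicate log events
    target_list.append(callback_instance)
    return True
```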
```python
            config=value, depth=depth + 1, max_depth=max_depth
        )
    elif isinstance(value, list):
        resolved_list = []
        for item in value:
            if isinstance(item, dict):
                resolved_list.append(
                    self._check_for_os_environ_vars(
                        config=item, depth=depth + 1, max_depth=max_depth
                    )
                )
            elif isinstance(item, str) and item.startswith("os.environ/"):
                resolved_list.append(get_secret(item))
            else:
                resolved_list.append(item)
        config[key] = resolved_list
    # if the value is a string and starts with "os.environ/" - then it's an environment variable
```
Silent bug fix in `_check_for_os_environ_vars` — original list items were never mutated
The original code reassigned the local loop variable item without writing it back to the list:
```python
# BEFORE (broken): item reassignment is lost
for item in value:
    if isinstance(item, dict):
        item = self._check_for_os_environ_vars(config=item, ...)
# config[key] is never updated — all resolved dict items are dropped
```

The new code correctly accumulates results into `resolved_list` and reassigns `config[key]`. Additionally, `os.environ/` references inside list string elements are now resolved (they were silently skipped before).
This is a meaningful correctness fix, but it represents a behaviour change for existing deployments that have lists of strings or dicts with os.environ/ references in their config. Any YAML such as:
```yaml
model_list:
  - os.environ/MY_MODEL_ALIAS
```

will now be resolved where it was previously a no-op. Make sure this change is called out clearly in release notes so users can verify their configs still behave as expected.
```python
    del config["include"]
    return config


def _resolve_os_environ_refs(config: dict) -> dict:
    for key, value in config.items():
        if isinstance(value, dict):
            config[key] = _resolve_os_environ_refs(value)
        elif isinstance(value, list):
            resolved_list = []
            for item in value:
                if isinstance(item, dict):
                    resolved_list.append(_resolve_os_environ_refs(item))
                elif isinstance(item, str) and item.startswith("os.environ/"):
                    from litellm import get_secret_str
```
_process_config_includes does not resolve os.environ/ refs in included files before merging
_process_config_includes merges the raw YAML from included files into the main config dict, and then _resolve_os_environ_refs is called on the merged result. This ordering is fine for the current scope.
However, included files are loaded with yaml.safe_load and their values are merged directly. If an included file contains a top-level key that already exists in the main config as a non-list (e.g., a string), the included value silently overwrites the main config value (line 169: config[key] = value). There is no warning to the operator. This mirrors the behaviour of ProxyConfig._load_and_merge_include_configs but is worth documenting via an inline comment so future maintainers understand the intentional merge precedence (included file wins for scalar values, lists are extended).
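The merge precedence described above (included file wins for scalars, lists are extended) can be sketched as follows. This is an illustrative reimplementation, not the code from `_process_config_includes` or `ProxyConfig._load_and_merge_include_configs`:

```python
def merge_include(config: dict, included: dict) -> dict:
    """Illustrative sketch of the include-merge precedence flagged above:
    - lists in both configs are extended (main values kept, include appended)
    - any other type from the included file silently overwrites the main
      config's value, with no warning to the operator."""
    for key, value in included.items():
        if isinstance(value, list) and isinstance(config.get(key), list):
            config[key].extend(value)  # lists are extended
        else:
            config[key] = value  # scalars: included file wins, silently
    return config
```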
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Add testing in the `tests/test_litellm/` directory; adding at least 1 test is a hard requirement - see details
- Run `make test-unit`
- Review with `@greptileai` and receive a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes