
Litellm oss staging 03 21 2026 #24351

Open
RheagalFire wants to merge 37 commits into main from
litellm_oss_staging_03_21_2026

Conversation

@RheagalFire
Collaborator

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

muneerusman25 and others added 30 commits February 11, 2026 15:26
…ndices

- [Error fixed] isdigit() rejected negative string indices like "-1"; use try/except int()
- Add Case 1: when both role and index are set, filter by role first then apply index within that subset (e.g., last assistant message)
  This seems to be the obvious desired behavior if someone sets both 'message type' and 'message index'.
- Add unit tests in tests/litellm/integrations/ covering both fixes (12 tests)
Fixes #23757
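The isdigit() pitfall fixed above is easy to reproduce in isolation. A minimal sketch (the helper name `parse_index` is illustrative, not the PR's actual code):

```python
def parse_index(raw: str):
    """Parse a message index that may be negative, e.g. "-1" for the last message.

    str.isdigit() returns False for "-1", so an isdigit() guard wrongly
    rejects negative indices; try/except int() accepts them.
    """
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None

print("-1".isdigit())      # False: why the old guard rejected "-1"
print(parse_index("-1"))   # -1
print(parse_index("abc"))  # None
```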

When the Anthropic pass-through endpoint converts messages to OpenAI
format, content fields can be lists of content blocks rather than
plain strings. map_system_message_pt assumed string content and
crashed with TypeError on concatenation.

Add _get_content_as_str() helper that normalizes both str and list
content to a string before merging, using the existing
convert_content_list_to_str utility.

Tests: 3 new test cases covering list content, mixed str/list, and
list content as last message.
When content is None (e.g. assistant messages with only tool_calls),
the function previously fell through to str(None) producing the
literal string 'None'. Now returns empty string for None content.

Added test case for this scenario.
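The str/list/None normalization described in these commits can be sketched as follows. This is an illustrative approximation only: the real `_get_content_as_str` delegates list handling to LiteLLM's existing `convert_content_list_to_str` utility.

```python
def get_content_as_str(content) -> str:
    """Illustrative approximation of the _get_content_as_str helper
    described above; not the actual implementation."""
    if content is None:
        # e.g. assistant messages carrying only tool_calls
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        # Anthropic-style content blocks: [{"type": "text", "text": "..."}]
        return "".join(
            block.get("text", "") for block in content if isinstance(block, dict)
        )
    return ""
```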
…o_handler for model "anthropic/deepseek-reasoner" (#1)

When using model: anthropic/deepseek-reasoner in config.yaml (using DeepSeek's Anthropic-compatible API),
`litellm/llms/anthropic/chat/handler.py` is called, but it assumes each SSE chunk is valid JSON; DeepSeek returns "data: [DONE]", which triggers a JSONDecodeError.
…ing space

Address review feedback:
- Shallow-copy next_m before mutating to avoid side-effects on caller's dict
- Use ' '.join(filter(None, ...)) to prevent trailing space when next_text is empty
- Return empty string instead of str(content) for unexpected types in _get_content_as_str
- Add role assertion in test_supports_system_message_none_content
- Apply black formatting to factory.py and test_optional_params.py
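The trailing-space fix in the second bullet is a one-liner worth seeing in isolation:

```python
# Naive concatenation leaves a trailing space when next_text is empty:
content, next_text = "system prompt", ""
assert content + " " + next_text == "system prompt "

# filter(None, ...) drops empty strings before joining:
assert " ".join(filter(None, [content, next_text])) == "system prompt"
assert " ".join(filter(None, ["a", "b"])) == "a b"
```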
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
krrishdholakia and others added 6 commits March 21, 2026 20:11
* Adding pricing and model data for Bedrock Z.AI GLM 5 model.

* Adding pricing and model data for Bedrock Z.AI GLM 5 model.
@greptileai

* Adding pricing and model data for Bedrock Z.AI GLM 5 model.
@greptileai
#24112)

* fix(otel): use completion_tokens_details for Chat Completions API reasoning tokens

The OTEL callback looked up output_tokens_details which only exists on
ResponseAPIUsage. Chat Completions API uses completion_tokens_details.
Also add prompt_tokens_details extraction for cached_tokens.

Fixes #23990

* fix: use getattr instead of .get() for typed token details objects

CompletionTokensDetailsWrapper and PromptTokensDetailsWrapper don't
have a .get() method unlike OutputTokensDetails. Use getattr() to
safely access attributes on both dict-like and typed objects.

Addresses Greptile review feedback.

* style: run black formatter on _utils.py

* test: add tests for completion_tokens_details and prompt_tokens_details extraction

Adds three tests to verify _set_usage_outputs correctly handles:
- Chat Completions API: completion_tokens_details (reasoning_tokens) and
  prompt_tokens_details (cached_tokens)
- Responses API: fallback to output_tokens_details for reasoning_tokens
- Basic usage: no token details present, no spurious attributes set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: RheagalFire <arishalam121@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…n across pods (#24246)

* fix(key_rotation): add distributed lock to prevent concurrent rotation across pods

* address greptile review: refresh redis_cache on late init, document lock TTL constraint

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* style: run black 23.x formatter on changed files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Aarish Alam <arishalam121@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
)

* docs(minimax): update model descriptions and add high-speed variants in documentation

* fix(minimax): update input costs and add configuration for MiniMax high-speed models

* revert minimax lighting model prices

* revert minimax model prices of lighting in json
…24312)

The Tab label for 'Price Data Reload' was already guarded with
all_admin_roles.includes(userRole), but the corresponding TabPanel
was rendered unconditionally, causing the PriceDataManagementTab
component to mount for all users.

This component polls admin-only routes every ~30s:
- GET /schedule/model_cost_map_reload/status
- GET /model/cost_map/source

For internal_user sessions, each poll produced a 401 ERROR log.
Align the TabPanel render condition with the existing Tab guard.

Fixes #24308
@RheagalFire RheagalFire requested a review from Sameerlite March 22, 2026 10:35
@vercel

vercel bot commented Mar 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Mar 22, 2026 3:09pm

Request Review

@CLAassistant

CLAassistant commented Mar 22, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
9 out of 11 committers have signed the CLA.

✅ muneerusman25
✅ krrishdholakia
✅ emerzon
✅ FBIKKIBF
✅ michelligabriele
✅ xykong
✅ bugparty
✅ Harshit28j
✅ AtharvaJaiswal005
❌ voidborne-d
❌ cmbaatz
You have signed the CLA already but the status is still pending? Let us recheck it.

@codspeed-hq
Contributor

codspeed-hq bot commented Mar 22, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_oss_staging_03_21_2026 (c49ef4d) with main (c89496f)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Mar 22, 2026

Greptile Summary

This staging PR bundles several independent fixes and features: configurable CORS settings via general_settings in config.yaml or env vars, distributed locking for key rotation across pods, eager instantiation of known custom-logger callbacks at startup, bug fixes for map_system_message_pt with Anthropic-format list content, graceful handling of non-JSON SSE chunks in the Anthropic handler, updated MiniMax model names, Arize integration improvements for token detail fields, and UI role-gating for the price data management tab.

Key changes:

  • CORS configuration (proxy_cli.py, proxy_server.py, _types.py): general_settings.cors_allow_origins/credentials/methods/headers are now read early (before proxy_server is imported) and written to env vars; proxy_server picks them up at module level to configure CORSMiddleware. Env vars always take precedence over config values.
  • _check_for_os_environ_vars silent bug fix (proxy_server.py): List items were reassigned to a local variable but never written back to the config dict — os.environ/ refs inside YAML lists were silently skipped. This is fixed and is a behaviour change for existing configs.
  • Distributed key-rotation lock (key_rotation_manager.py): PodLockManager is now threaded through KeyRotationManager to prevent concurrent rotation across pods; lock is always released in a finally block.
  • Eager callback instantiation (callback_utils.py): Known callbacks (e.g. "otel") are instantiated at startup via _add_custom_logger_callback_to_specific_event; the raw string is still appended to litellm.callbacks for bookkeeping, which could risk double-registration on hot-reload paths.
  • map_system_message_pt fix (factory.py): Adds _get_content_as_str to handle list/None message content; avoids mutating the caller's message dicts by shallow-copying before modification.
  • Non-JSON SSE chunk handling (handler.py): Wraps json.loads in a try/except to ignore [DONE] and other non-JSON lines without crashing.
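The non-JSON SSE chunk handling in the last bullet follows a simple pattern; a minimal sketch (function name and shape are illustrative, not the PR's actual handler code):

```python
import json
from typing import Optional

def parse_sse_chunk(line: str) -> Optional[dict]:
    """Return the parsed JSON payload of an SSE data line, or None for
    non-JSON lines such as "data: [DONE]" instead of raising."""
    payload = line[len("data:"):].strip() if line.startswith("data:") else line.strip()
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None
```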

Confidence Score: 3/5

  • PR is broadly safe but two P1 issues need resolution before merge: a real-object test in the mock-only folder and a potential callback double-registration on hot-reload paths.
  • The CORS feature is well-tested and its env-var precedence logic is correct. The key-rotation distributed lock is clean with proper finally-block release. The _check_for_os_environ_vars list-item fix is a silent-bug correction that is a behavior change for existing deployments with os.environ/ refs inside YAML lists — this needs release-note callout. The callback eager-instantiation leaves the raw string in litellm.callbacks alongside the live instance, which is fragile if any downstream hot-reload path re-processes the string. The test test_initialize_callbacks_on_proxy_instantiates_otel creates a real OpenTelemetry object in the designated mock-only tests/test_litellm/ directory, which can cause flaky CI failures.
  • litellm/proxy/common_utils/callback_utils.py (potential double-registration), tests/test_litellm/proxy/common_utils/test_callback_utils.py (real object in mock-only folder), litellm/proxy/proxy_server.py (_check_for_os_environ_vars behaviour change)

Important Files Changed

  • litellm/proxy/proxy_cli.py: Adds early CORS configuration loading from config.yaml before proxy_server is imported; includes _normalize_cors_value, _apply_cors_settings_from_general_settings, _process_config_includes, and _resolve_os_environ_refs helpers. Include-file merge precedence (scalar overwrite) is undocumented. asyncio.run() inside _load_general_settings_for_early_cors for GCS path noted in prior threads.
  • litellm/proxy/proxy_server.py: Replaces hardcoded origins=["*"] / allow_credentials=True with env-var-driven CORS configuration; fixes a silent bug in _check_for_os_environ_vars where list items were never actually updated (reassigned loop variable was discarded). The _check_for_os_environ_vars fix is a behaviour change for existing deployments using os.environ/ refs inside YAML lists.
  • litellm/proxy/common_utils/callback_utils.py: Eagerly instantiates known custom-logger callbacks (e.g. "otel") at startup via _add_custom_logger_callback_to_specific_event, then still appends the string to imported_list/litellm.callbacks. The dual-registration pattern (instance in success/failure lists + string in callbacks) risks double-registration if any code path re-processes the string on hot-reload or async model load.
  • litellm/proxy/common_utils/key_rotation_manager.py: Adds optional distributed-lock support to KeyRotationManager.process_rotations() via PodLockManager; lock is acquired before rotation and released in a finally block. Lock TTL caveat is documented. New tests cover acquire, skip-when-held, release-on-success, release-on-failure, and no-lock-manager paths.
  • litellm/litellm_core_utils/prompt_templates/factory.py: Fixes map_system_message_pt to handle list-typed message content (Anthropic pass-through format) and None content (tool-call messages) by introducing _get_content_as_str; avoids in-place mutation of the caller's message dicts by shallow-copying before modification. Remaining formatting-only changes are cosmetic (black/ruff reformatting).
  • litellm/integrations/anthropic_cache_control_hook.py: Adds a new Case 1 for combined role+index targeting (e.g. "last assistant message") with correct IndexError handling; existing index-only and role-only cases are preserved as Cases 2 and 3 respectively.
  • litellm/integrations/arize/_utils.py: Extends _set_usage_outputs to read completion_tokens_details/output_tokens_details and prompt_tokens_details/input_tokens_details for reasoning and cache-read token counts. Uses or short-circuit for field lookup; already flagged in prior thread, no new change needed here.
  • litellm/llms/anthropic/chat/handler.py: Wraps SSE chunk JSON parsing in a try/except to gracefully ignore non-JSON data lines (e.g. data: [DONE] from DeepSeek-compatible endpoints) instead of crashing. Safe and well-scoped fix.
  • tests/test_litellm/proxy/common_utils/test_callback_utils.py: Adds tests for early callback instantiation; test_initialize_callbacks_on_proxy_instantiates_otel creates a real OpenTelemetry instance without mocking in the mock-only tests/test_litellm/ folder, potentially violating the CI/CD mock-only rule (Rule 07fb23f7).
  • tests/proxy_unit_tests/test_cors_config.py: Comprehensive unit tests for all CORS helper functions using mocks, temp files, and monkeypatching; no real network calls. Good coverage of edge cases (empty list, env var precedence, invalid types).

Sequence Diagram

sequenceDiagram
    participant CLI as proxy_cli.py (run_server)
    participant EarlyLoad as _load_general_settings_for_early_cors
    participant Env as os.environ
    participant Server as proxy_server.py (module import)
    participant CORS as CORSMiddleware

    CLI->>EarlyLoad: config file path / bucket name
    EarlyLoad->>EarlyLoad: yaml.safe_load + _process_config_includes
    EarlyLoad->>EarlyLoad: _resolve_os_environ_refs (os.environ/ refs)
    EarlyLoad-->>CLI: general_settings dict

    CLI->>CLI: _apply_cors_settings_from_general_settings(general_settings)
    CLI->>CLI: _normalize_cors_value(cors_allow_origins)
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_ORIGINS) [only if unset]
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_CREDENTIALS)
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_METHODS)
    CLI->>Env: _set_env_var_if_unset(LITELLM_CORS_ALLOW_HEADERS)

    Note over CLI,Server: First import of proxy_server reads env vars at module level
    CLI->>Server: import app (triggers module-level CORS init)
    Server->>Env: os.getenv(LITELLM_CORS_ALLOW_ORIGINS)
    Server->>Server: _get_cors_allow_list() → cors_allow_origins
    Server->>Server: _get_cors_allow_credentials() → cors_allow_credentials
    Server->>Server: wildcard + credentials guard check
    Server->>CORS: app.add_middleware(CORSMiddleware, allow_origins=..., allow_credentials=...)

Reviews (2): Last reviewed commit: "Merge pull request #23782 from voidborne..."

Comment on lines +1148 to +1152
if raw_value.strip() == "":
    return False
return raw_value.strip().lower() in {"1", "true", "yes"}



P1 Credentials default flips for users with explicit wildcard origins

_get_cors_allow_credentials returns not origins_were_configured, meaning that as soon as any value is set for LITELLM_CORS_ALLOW_ORIGINS (including "*"), allow_credentials silently flips from True to False unless the user also sets LITELLM_CORS_ALLOW_CREDENTIALS=true.

For a user who migrates to the new config and writes:

general_settings:
  cors_allow_origins: "*"

they will see allow_credentials=False (CORS credentialed requests blocked) even though their intent was "same as default". While wildcard + credentials is rejected by browsers and was technically a broken setting, the silent default-flip may surprise users who were relying on the permissive default and are now pinning the origin to * explicitly. The PR description and docs should call this out clearly, and users who set explicit origins via the new env vars should be encouraged to also set cors_allow_credentials explicitly.
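The default flip described in this comment can be illustrated with a simplified stand-in for the env-var logic (this is a sketch of the behavior as described, not the actual proxy_server.py code):

```python
def get_cors_allow_credentials(env: dict) -> bool:
    """Simplified model of the described precedence: an explicit
    credentials env var wins; otherwise credentials default to True
    only when no origins were configured."""
    raw = env.get("LITELLM_CORS_ALLOW_CREDENTIALS")
    if raw is not None and raw.strip():
        return raw.strip().lower() in {"1", "true", "yes"}
    origins_were_configured = env.get("LITELLM_CORS_ALLOW_ORIGINS") is not None
    return not origins_were_configured

assert get_cors_allow_credentials({}) is True
# Setting origins, even to the wildcard, flips credentials off...
assert get_cors_allow_credentials({"LITELLM_CORS_ALLOW_ORIGINS": "*"}) is False
# ...unless credentials are also set explicitly:
assert get_cors_allow_credentials(
    {"LITELLM_CORS_ALLOW_ORIGINS": "*", "LITELLM_CORS_ALLOW_CREDENTIALS": "true"}
) is True
```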

Rule Used: What: avoid backwards-incompatible changes without... (source)

Comment on lines +70 to +80
if value is None:
    return None
if isinstance(value, str):
    return ",".join([item.strip() for item in value.split(",") if item.strip()])
if isinstance(value, list):
    if not all(isinstance(item, str) for item in value):
        raise ValueError(
            f"Invalid CORS setting for '{setting_name}': expected a list of strings."
        )
    return ",".join([item.strip() for item in value if item.strip()])
raise ValueError(

P2 Empty list normalizes to "" — sets env var, may surprise users

When a user passes cors_allow_origins: [], _normalize_cors_value returns "" (an empty string), and _set_env_var_if_unset writes that to LITELLM_CORS_ALLOW_ORIGINS. _get_cors_allow_list then interprets the empty string as [], which blocks all cross-origin traffic.

This is intentional (as confirmed by test_normalize_cors_value_empty_list), but the asymmetry with None (which leaves the env var unset and falls back to the wildcard default) is easy to misread. Consider adding an explicit note in the ConfigGeneralSettings field description explaining that an empty list explicitly disables all CORS (rather than falling back to the wildcard default).
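The None-vs-empty-list asymmetry is easier to see side by side; a simplified stand-in for the normalization (illustrative, not the actual `_normalize_cors_value`):

```python
from typing import Optional

def normalize_cors_value(value) -> Optional[str]:
    """None means "leave env var unset" (wildcard default applies later);
    an empty list becomes "", which later parses back to [] and blocks
    all cross-origin traffic."""
    if value is None:
        return None
    if isinstance(value, list):
        return ",".join(item.strip() for item in value if item.strip())
    return str(value).strip()

assert normalize_cors_value(None) is None                       # falls back to "*"
assert normalize_cors_value([]) == ""                           # explicitly blocks all
assert normalize_cors_value(["a.com", "b.com"]) == "a.com,b.com"
```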

Comment on lines +239 to +243
completion_tokens_details = usage.get("completion_tokens_details") or usage.get(
    "output_tokens_details"
)
if completion_tokens_details is not None:
    reasoning_tokens = getattr(completion_tokens_details, "reasoning_tokens", None)

P2 or short-circuits on falsy objects, may skip valid completion_tokens_details

completion_tokens_details = usage.get("completion_tokens_details") or usage.get(
    "output_tokens_details"
)

If completion_tokens_details is present but evaluates as falsy (e.g. a CompletionTokensDetailsWrapper() instance with all-zero fields where __bool__ returns False, or an empty dict {}), Python's or will silently fall through to output_tokens_details. This is unlikely in practice today, but a safer pattern would be an explicit None check:

Suggested change:

# before
completion_tokens_details = usage.get("completion_tokens_details") or usage.get(
    "output_tokens_details"
)

# after
completion_tokens_details = usage.get("completion_tokens_details")
if completion_tokens_details is None:
    completion_tokens_details = usage.get("output_tokens_details")

The same applies to prompt_tokens_details / input_tokens_details a few lines below.
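The falsy fall-through warned about above can be reproduced with plain dicts:

```python
# The `or` chain falls through whenever the first lookup is falsy,
# not only when the key is missing:
usage = {
    "completion_tokens_details": {},                      # present but empty
    "output_tokens_details": {"reasoning_tokens": 5},
}
picked = usage.get("completion_tokens_details") or usage.get("output_tokens_details")
assert picked == {"reasoning_tokens": 5}   # empty dict was silently skipped

# An explicit None check keeps the present-but-empty value:
picked = usage.get("completion_tokens_details")
if picked is None:
    picked = usage.get("output_tokens_details")
assert picked == {}
```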

Comment on lines 447 to +450
self.db_spend_update_writer.pod_lock_manager.redis_cache = redis_cache
from litellm.proxy.proxy_server import key_rotation_pod_lock_manager

if key_rotation_pod_lock_manager is not None:

P2 Circular import risk: importing from proxy_server inside a function in utils.py

from litellm.proxy.proxy_server import key_rotation_pod_lock_manager

proxy_server.py already imports heavily from proxy/utils.py. Adding a reverse import — even inside a function body — creates a potential circular dependency that can silently resolve to a stale module reference at the time the function executes. Consider storing key_rotation_pod_lock_manager in a shared lightweight module (or passing it as a parameter to ProxyLogging) rather than pulling it back from proxy_server at call time.
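The "shared lightweight module" alternative suggested above can be sketched as follows; all names here (`shared_state`, `FakeLockManager`, the two functions) are hypothetical stand-ins, not LiteLLM code:

```python
import types

# Stand-in for a tiny module with no heavy imports of its own:
shared_state = types.ModuleType("shared_state")
shared_state.key_rotation_pod_lock_manager = None

def startup(lock_manager):
    # proxy_server.py would assign the real PodLockManager here at startup
    shared_state.key_rotation_pod_lock_manager = lock_manager

def wire_redis(redis_cache):
    # utils.py reads from shared_state instead of importing proxy_server,
    # avoiding the reverse import and stale-module risk
    mgr = shared_state.key_rotation_pod_lock_manager
    if mgr is not None:
        mgr.redis_cache = redis_cache

class FakeLockManager:
    redis_cache = None

startup(FakeLockManager())
wire_redis("redis://cache")
assert shared_state.key_rotation_pod_lock_manager.redis_cache == "redis://cache"
```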

Comment on lines +191 to +202
    import. This early pass supports literal values, os.environ/ references,
    and local file-path include files. Secret-manager-backed values and include
    directives inside S3/GCS configs are only resolved later in startup and
    therefore cannot affect the initial middleware config.
    """
    config: Optional[dict] = None

    if os.environ.get("LITELLM_CONFIG_BUCKET_NAME") is not None:
        import asyncio

        from litellm.proxy.common_utils.load_config_utils import (
            get_config_file_contents_from_gcs,

P1 asyncio.run() fails inside a running event loop

The GCS bucket path calls asyncio.run(get_config_file_contents_from_gcs(...)). This raises RuntimeError if an event loop is already running (e.g. in pytest-asyncio or if startup is ever refactored to be async). The S3 path is synchronous and unaffected.

Consider using nest_asyncio or restructuring this call to use a sync GCS wrapper already available in the codebase, so the function behaves consistently across all config sources.
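One common guard pattern for this failure mode is sketched below (a generic illustration, assuming a hypothetical `fetch_config` coroutine; not the PR's code or a recommendation of a specific LiteLLM API):

```python
import asyncio
import concurrent.futures

async def fetch_config():
    # hypothetical async config fetch standing in for the GCS call
    return {"general_settings": {}}

def load_config_sync():
    """Run the coroutine safely whether or not a loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop running: asyncio.run() is safe here
        return asyncio.run(fetch_config())
    # Inside a running loop asyncio.run() would raise RuntimeError;
    # run the coroutine on a fresh loop in a worker thread instead.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
        return ex.submit(asyncio.run, fetch_config()).result()

# Works with no loop running:
assert load_config_sync() == {"general_settings": {}}

# And from inside a running event loop:
async def caller():
    return load_config_sync()

assert asyncio.run(caller()) == {"general_settings": {}}
```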

Comment on lines +93 to +150

    Regression test for: when store_model_in_db=true, OTEL callback was
    never instantiated because initialize_callbacks_on_proxy() only added
    the string "otel" to litellm.callbacks without creating the instance.
    """
    import litellm
    from litellm.proxy.common_utils.callback_utils import (
        initialize_callbacks_on_proxy,
    )
    from litellm.integrations.opentelemetry import OpenTelemetry
    from litellm.litellm_core_utils import litellm_logging
    from litellm.proxy import proxy_server

    # Save original state
    original_callbacks = litellm.callbacks[:]
    original_success = litellm.success_callback[:]
    original_async_success = litellm._async_success_callback[:]
    original_failure = litellm.failure_callback[:]
    original_async_failure = litellm._async_failure_callback[:]
    original_otel_logger = getattr(proxy_server, "open_telemetry_logger", None)
    original_in_memory_loggers = litellm_logging._in_memory_loggers[:]

    try:
        # Clear callbacks
        litellm.callbacks = []
        litellm.success_callback = []
        litellm._async_success_callback = []
        litellm.failure_callback = []
        litellm._async_failure_callback = []

        initialize_callbacks_on_proxy(
            value=["otel"],
            premium_user=False,
            config_file_path="",
            litellm_settings={},
        )

        # Verify an OpenTelemetry instance exists in success callbacks
        otel_in_success = any(
            isinstance(cb, OpenTelemetry)
            for cb in litellm.success_callback + litellm._async_success_callback
        )
        assert otel_in_success, (
            "OpenTelemetry instance not found in success callbacks. "
            f"success_callback={litellm.success_callback}, "
            f"_async_success_callback={litellm._async_success_callback}"
        )

        # Verify an OpenTelemetry instance exists in failure callbacks
        otel_in_failure = any(
            isinstance(cb, OpenTelemetry)
            for cb in litellm.failure_callback + litellm._async_failure_callback
        )
        assert otel_in_failure, "OpenTelemetry instance not found in failure callbacks."

        # Verify open_telemetry_logger is set on proxy_server
        assert proxy_server.open_telemetry_logger is not None, (
            "proxy_server.open_telemetry_logger should be set after "

P1 Real OpenTelemetry instance created in mock-only test folder

test_initialize_callbacks_on_proxy_instantiates_otel calls initialize_callbacks_on_proxy(value=["otel"], ...) without mocking _add_custom_logger_callback_to_specific_event or the OpenTelemetry class itself. This causes a real OpenTelemetry instance to be constructed inside tests/test_litellm/, which is supposed to contain only mock-based tests to ensure reliable CI/CD execution.

While the OpenTelemetry.__init__ in LiteLLM doesn't typically make network connections immediately, it does set up real span processors and exporters that may have initialization-time side effects (e.g., spawning background threads, attempting OTLP endpoint connections if OTEL_EXPORTER_OTLP_ENDPOINT is set in the environment). This can cause flaky failures in CI if any OTEL env vars are set.

The complementary test test_initialize_callbacks_on_proxy_calls_instantiation_for_known_callbacks (lines 154–176) correctly mocks this path. Consider aligning the first test to the same pattern:

@patch("litellm.utils._add_custom_logger_callback_to_specific_event")
def test_initialize_callbacks_on_proxy_instantiates_otel(mock_add_callback):
    from litellm.proxy.common_utils.callback_utils import initialize_callbacks_on_proxy
    initialize_callbacks_on_proxy(
        value=["otel"],
        premium_user=False,
        config_file_path="",
        litellm_settings={},
    )
    # Verify early-instantiation path is taken
    assert mock_add_callback.call_count == 2
    mock_add_callback.assert_any_call("otel", "success")
    mock_add_callback.assert_any_call("otel", "failure")

Rule Used: What: prevent any tests from being added here that... (source)

Comment on lines 46 to 78
    isinstance(callback, str)
    and callback in litellm._known_custom_logger_compatible_callbacks
):
    # Eagerly instantiate the callback class (e.g. OpenTelemetry)
    # so that proxy-level globals like open_telemetry_logger are
    # set at startup. Without this, store_model_in_db=true causes
    # the async model-loading path to skip callback instantiation
    # entirely, leaving the logger as None.
    #
    # Each event is registered independently so a failure in one
    # does not leave the callback half-registered.
    from litellm.utils import (
        _add_custom_logger_callback_to_specific_event,
    )

    for event in ("success", "failure"):
        try:
            _add_custom_logger_callback_to_specific_event(callback, event)
        except Exception as e:
            verbose_proxy_logger.error(
                f"Failed to initialize callback '{callback}' "
                f"for {event} event at startup: {e}. "
                "Check that the required environment variables "
                "are set."
            )
    # Always add to imported_list so litellm.callbacks stays in
    # sync — it is read by health endpoints, hot-reload, and
    # spend tracking. On success the instance is already in the
    # success/failure lists; the string here keeps the canonical
    # config list complete.
    imported_list.append(callback)
elif isinstance(callback, str) and callback == "presidio":
    from litellm.proxy.guardrails.guardrail_hooks.presidio import (

P1 Callback string in litellm.callbacks may cause double-registration on hot-reload

The new code eagerly calls _add_custom_logger_callback_to_specific_event(callback, event) to add the live instance to success_callback/failure_callback, and then also appends the raw string (e.g. "otel") to imported_list. imported_list later becomes litellm.callbacks.

If any downstream code path (e.g., hot-reload, store_model_in_db async load, or spend-tracking startup) re-processes litellm.callbacks and calls _add_custom_logger_callback_to_specific_event for each string entry it finds, the callback instance will be registered a second time — leading to duplicate log events or double-counting in spend tracking.

The comment explains the intent ("the string keeps the canonical config list complete"), but without an explicit guard that prevents the string from being re-instantiated on subsequent passes, this is fragile. Consider either:

  1. Replacing the string in imported_list with the already-created instance (so litellm.callbacks contains the object, not the string), or
  2. Adding a dedup check in _add_custom_logger_callback_to_specific_event that skips registration when an instance of the same class already exists in the target list.
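Option 2 above amounts to a class-based dedup check; a minimal sketch of that idea (names here are illustrative, not LiteLLM's actual API):

```python
def add_callback_once(callback_instance, target_list) -> bool:
    """Skip registration when an instance of the same class is already
    present in the target list; return True only if appended."""
    if any(type(cb) is type(callback_instance) for cb in target_list):
        return False
    target_list.append(callback_instance)
    return True

class OtelLogger:  # stand-in for an OpenTelemetry callback instance
    pass

success_callbacks = []
assert add_callback_once(OtelLogger(), success_callbacks) is True
assert add_callback_once(OtelLogger(), success_callbacks) is False  # deduped
assert len(success_callbacks) == 1
```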

Comment on lines 2556 to 2572
        config=value, depth=depth + 1, max_depth=max_depth
    )
elif isinstance(value, list):
    resolved_list = []
    for item in value:
        if isinstance(item, dict):
            resolved_list.append(
                self._check_for_os_environ_vars(
                    config=item, depth=depth + 1, max_depth=max_depth
                )
            )
        elif isinstance(item, str) and item.startswith("os.environ/"):
            resolved_list.append(get_secret(item))
        else:
            resolved_list.append(item)
    config[key] = resolved_list
# if the value is a string and starts with "os.environ/" - then it's an environment variable

P1 Silent bug fix in _check_for_os_environ_vars — original list items were never mutated

The original code reassigned the local loop variable item without writing it back to the list:

# BEFORE (broken): item reassignment is lost
for item in value:
    if isinstance(item, dict):
        item = self._check_for_os_environ_vars(config=item, ...)
# config[key] is never updated — all resolved dict items are dropped

The new code correctly accumulates results into resolved_list and reassigns config[key]. Additionally, os.environ/ references inside list string elements are now resolved (they were silently skipped before).

This is a meaningful correctness fix, but it represents a behaviour change for existing deployments that have lists of strings or dicts with os.environ/ references in their config. Any YAML such as:

model_list:
  - os.environ/MY_MODEL_ALIAS

will now be resolved where it was previously a no-op. Make sure this change is called out clearly in release notes so users can verify their configs still behave as expected.
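The underlying Python pitfall (rebinding a loop variable never mutates the list) is worth seeing on its own:

```python
# Reassigning the loop variable only rebinds the local name:
items = [{"k": "os.environ/X"}]
for item in items:
    if isinstance(item, dict):
        item = {"k": "resolved"}   # lost immediately; list untouched
assert items == [{"k": "os.environ/X"}]   # the silent bug

# Accumulating into a new list and writing it back works:
items = [
    {"k": "resolved"} if isinstance(item, dict) else item for item in items
]
assert items == [{"k": "resolved"}]
```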

Comment on lines +155 to +171

    del config["include"]
    return config


def _resolve_os_environ_refs(config: dict) -> dict:
    for key, value in config.items():
        if isinstance(value, dict):
            config[key] = _resolve_os_environ_refs(value)
        elif isinstance(value, list):
            resolved_list = []
            for item in value:
                if isinstance(item, dict):
                    resolved_list.append(_resolve_os_environ_refs(item))
                elif isinstance(item, str) and item.startswith("os.environ/"):
                    from litellm import get_secret_str


P2 _process_config_includes does not resolve os.environ/ refs in included files before merging

_process_config_includes merges the raw YAML from included files into the main config dict, and then _resolve_os_environ_refs is called on the merged result. This ordering is fine for the current scope.

However, included files are loaded with yaml.safe_load and their values are merged directly. If an included file contains a top-level key that already exists in the main config as a non-list (e.g., a string), the included value silently overwrites the main config value (line 169: config[key] = value). There is no warning to the operator. This mirrors the behaviour of ProxyConfig._load_and_merge_include_configs but is worth documenting via an inline comment so future maintainers understand the intentional merge precedence (included file wins for scalar values, lists are extended).
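The merge precedence described here (included file wins for scalars, lists are extended) can be sketched in a few lines; this is an illustration of the described behavior, not the actual `_process_config_includes` code:

```python
def merge_include(config: dict, included: dict) -> dict:
    """Merge an included file's values into the main config:
    lists are extended, everything else is overwritten by the include."""
    for key, value in included.items():
        if key in config and isinstance(config[key], list) and isinstance(value, list):
            config[key].extend(value)
        else:
            config[key] = value   # included scalar silently overwrites
    return config

cfg = merge_include({"a": "main", "models": [1]}, {"a": "included", "models": [2]})
assert cfg == {"a": "included", "models": [1, 2]}
```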

