---
title: "v1.77.7-rc - Performance Optimizations & Claude Sonnet 4.5"
slug: "v1-77-7"
date: 2025-10-04T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaff
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
  - name: Alexsander Hamir
    title: Backend Performance Engineer
    url: https://www.linkedin.com/in/alexsander-baptista/
    image_url: https://media.licdn.com/dms/image/v2/D5603AQGXnziu4kqNCQ/profile-displayphoto-crop_800_800/B56ZkxEcuOKEAI-/0/1757464874550?e=1762387200&v=beta&t=9SNXLsWhx8OnYPAMQ9fqAr02oevDYEAL2vMYg2f9ieg
  - name: Achintya Srivastava
    title: Fullstack Engineer
    url: https://www.linkedin.com/in/achintya-rajan/
    image_url: https://media.licdn.com/dms/image/v2/D5603AQGdkEeyJTdljw/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1716271140869?e=1762387200&v=beta&t=9gOoLPeqR2E5z3KSX61EUj3HVZXmgo87vhVuSHeffjc

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```shell showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.77.7.rc.1
```

</TabItem>

<TabItem value="pip" label="Pip">

```shell showLineNumbers title="pip install litellm"
pip install litellm==1.77.7.rc.1
```

</TabItem>
</Tabs>
---
## Key Highlights

- **Major Performance Improvements** - Router optimization reducing P99 latency by 62.5%, cache eviction improved from O(n*log(n)) to O(log(n))
- **Claude Sonnet 4.5** - Support for Anthropic's new Claude Sonnet 4.5 model family, with 200K context and tiered pricing
- **MCP Gateway Enhancements** - Fine-grained tool control, server permissions, and forwardable headers
- **AMD Lemonade & Nvidia NIM** - New provider support for AMD Lemonade and Nvidia NIM Rerank
- **GitLab Prompt Management** - GitLab-based prompt management integration

## New Models / Updated Models

#### New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
| -------- | ----- | -------------- | ------------------- | -------------------- | -------- |
| Anthropic | `claude-sonnet-4-5` | 200K | $3.00 | $15.00 | Chat, reasoning, vision, function calling, prompt caching |
| Anthropic | `claude-sonnet-4-5-20250929` | 200K | $3.00 | $15.00 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | `eu.anthropic.claude-sonnet-4-5-20250929-v1:0` | 200K | $3.00 | $15.00 | Chat, reasoning, vision, function calling, prompt caching |
| Azure AI | `azure_ai/grok-4` | 131K | $5.50 | $27.50 | Chat, reasoning, function calling, web search |
| Azure AI | `azure_ai/grok-4-fast-reasoning` | 131K | $5.80 | $2,900.00 | Chat, reasoning, function calling, web search |
| Azure AI | `azure_ai/grok-4-fast-non-reasoning` | 131K | $5.00 | $2,500.00 | Chat, function calling, web search |
| Azure AI | `azure_ai/grok-code-fast-1` | 131K | $3.50 | $17.50 | Chat, function calling, web search |
| Groq | `groq/moonshotai/kimi-k2-instruct-0905` | Varies | Varies | Varies | Chat, function calling |
| Ollama | Ollama Cloud models | Varies | Free | Free | Self-hosted models via Ollama Cloud |
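As a quick sanity check on the Claude Sonnet 4.5 rows above, base-tier cost is plain per-token arithmetic. This is a sketch derived from the table's rates, not LiteLLM's internal cost-tracking code, and it does not model the tiered pricing above 200K tokens noted in the Anthropic section below:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_per_million: float = 3.00,
                  output_per_million: float = 15.00) -> float:
    """USD cost for one request at base-tier claude-sonnet-4-5 rates."""
    return (input_tokens * input_per_million
            + output_tokens * output_per_million) / 1_000_000

# A 10K-input / 2K-output request:
print(round(estimate_cost(10_000, 2_000), 4))  # → 0.06
```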

#### Features

- **[Anthropic](../../docs/providers/anthropic)**
    - Add new claude-sonnet-4-5 model family with tiered pricing above 200K tokens - [PR #15041](https://github.com/BerriAI/litellm/pull/15041)
    - Add anthropic/claude-sonnet-4-5 to model price json with prompt caching support - [PR #15049](https://github.com/BerriAI/litellm/pull/15049)
    - Add 200K prices for Sonnet 4.5 - [PR #15140](https://github.com/BerriAI/litellm/pull/15140)
    - Add cost tracking for /v1/messages in streaming response - [PR #15102](https://github.com/BerriAI/litellm/pull/15102)
    - Add /v1/messages/count_tokens to Anthropic routes for non-admin user access - [PR #15034](https://github.com/BerriAI/litellm/pull/15034)
- **[Gemini](../../docs/providers/gemini)**
    - Add full support for native Gemini API translation - [PR #15029](https://github.com/BerriAI/litellm/pull/15029)
    - Add Gemini generateContent passthrough cost tracking - [PR #15014](https://github.com/BerriAI/litellm/pull/15014)
    - Add streamGenerateContent cost tracking in passthrough - [PR #15199](https://github.com/BerriAI/litellm/pull/15199)
    - Ignore `type` param for Gemini tools - [PR #15022](https://github.com/BerriAI/litellm/pull/15022)
- **[Vertex AI](../../docs/providers/vertex)**
    - Add LiteLLM Overhead metric for Vertex AI - [PR #15040](https://github.com/BerriAI/litellm/pull/15040)
    - Add cost tracking for Vertex AI Passthrough `/predict` endpoint - [PR #15019](https://github.com/BerriAI/litellm/pull/15019)
    - Add cost tracking for Vertex AI Live API WebSocket Passthrough - [PR #14956](https://github.com/BerriAI/litellm/pull/14956)
    - Support Google Maps grounding in Vertex AI - [PR #15179](https://github.com/BerriAI/litellm/pull/15179)
- **[Azure](../../docs/providers/azure)**
    - Add azure_ai grok-4 model family - [PR #15137](https://github.com/BerriAI/litellm/pull/15137)
    - Use the `extra_query` parameter for GET requests in Azure Batch - [PR #14997](https://github.com/BerriAI/litellm/pull/14997)
    - Use `extra_query` for downloading results (Batch API) - [PR #15025](https://github.com/BerriAI/litellm/pull/15025)
    - Add support for Azure AD token-based authorization - [PR #14813](https://github.com/BerriAI/litellm/pull/14813)
- **[Ollama](../../docs/providers/ollama)**
    - Add Ollama Cloud models - [PR #15008](https://github.com/BerriAI/litellm/pull/15008)
- **[Groq](../../docs/providers/groq)**
    - Add groq/moonshotai/kimi-k2-instruct-0905 - [PR #15079](https://github.com/BerriAI/litellm/pull/15079)
- **[OpenAI](../../docs/providers/openai)**
    - Add support for GPT-5 Codex models - [PR #14841](https://github.com/BerriAI/litellm/pull/14841)
- **[DeepInfra](../../docs/providers/deepinfra)**
    - Refresh DeepInfra model data with latest pricing - [PR #14939](https://github.com/BerriAI/litellm/pull/14939)
- **[Bedrock](../../docs/providers/bedrock)**
    - Add JP Cross-Region Inference - [PR #15188](https://github.com/BerriAI/litellm/pull/15188)
    - Add `eu.anthropic.claude-sonnet-4-5-20250929-v1:0` - [PR #15181](https://github.com/BerriAI/litellm/pull/15181)
    - Add TwelveLabs Bedrock Async Invoke support - [PR #14871](https://github.com/BerriAI/litellm/pull/14871)
- **[Nvidia NIM](../../docs/providers/nvidia_nim)**
    - Add Nvidia NIM Rerank support - [PR #15152](https://github.com/BerriAI/litellm/pull/15152)
#### Bug Fixes

- **[VLLM](../../docs/providers/vllm)**
    - Fix response_format bug in hosted vLLM audio transcription - [PR #15010](https://github.com/BerriAI/litellm/pull/15010)
    - Fix passthrough of atranscription kwargs to the upstream provider - [PR #15005](https://github.com/BerriAI/litellm/pull/15005)
- **[OCI](../../docs/providers/oci)**
    - Fix OCI Generative AI integration when using the proxy - [PR #15072](https://github.com/BerriAI/litellm/pull/15072)
- **General**
    - Fix Authorization header to use correct "Bearer" capitalization - [PR #14764](https://github.com/BerriAI/litellm/pull/14764)
    - Fix incorrect max_input_tokens value for gpt-5-chat-latest - [PR #15116](https://github.com/BerriAI/litellm/pull/15116)
    - Fix missing HTTPException import - [PR #15111](https://github.com/BerriAI/litellm/pull/15111)
    - Fix model_group not always present in litellm_params and metadata - [PR #15108](https://github.com/BerriAI/litellm/pull/15108)
    - Update request handling for original exceptions - [PR #15013](https://github.com/BerriAI/litellm/pull/15013)
    - Remove invalid Vertex `-latest` models - [PR #15043](https://github.com/BerriAI/litellm/pull/15043)

#### New Provider Support

- **[AMD Lemonade](../../docs/providers/lemonade)**
    - Add AMD Lemonade provider support - [PR #14840](https://github.com/BerriAI/litellm/pull/14840)
---

## LLM API Endpoints

#### Features

- **[Responses API](../../docs/response_api)**
    - Return cost for Responses API streaming requests - [PR #15053](https://github.com/BerriAI/litellm/pull/15053)
- **General**
    - Preserve whitespace characters in model response streams - [PR #15160](https://github.com/BerriAI/litellm/pull/15160)
    - Add provider name to payload specification - [PR #15130](https://github.com/BerriAI/litellm/pull/15130)
---

## Management Endpoints / UI

#### Features

- **Virtual Keys**
    - Fix session token cookie infinite logout loop - [PR #15146](https://github.com/BerriAI/litellm/pull/15146)
    - Ensure LLM API keys can access passthrough routes - [PR #15115](https://github.com/BerriAI/litellm/pull/15115)
- **Models + Endpoints**
    - Ensure OCI secret fields are not shared on /models and /v1/models endpoints - [PR #15085](https://github.com/BerriAI/litellm/pull/15085)
    - Add Snowflake on UI - [PR #15083](https://github.com/BerriAI/litellm/pull/15083)
    - Make UI theme settings publicly accessible for custom branding - [PR #15074](https://github.com/BerriAI/litellm/pull/15074)
- **Admin Settings**
    - Ensure OTEL settings are saved in DB after being set on UI - [PR #15118](https://github.com/BerriAI/litellm/pull/15118)
    - Top API key tags - [PR #15151](https://github.com/BerriAI/litellm/pull/15151), [PR #15156](https://github.com/BerriAI/litellm/pull/15156)

#### Bugs

- **Dashboard** - Fix LiteLLM model name fallback in dashboard overview - [PR #14998](https://github.com/BerriAI/litellm/pull/14998)
- **Passthrough API** - Ensure query params are forwarded from origin URL to downstream request - [PR #15087](https://github.com/BerriAI/litellm/pull/15087)
---

## Logging / Guardrail / Prompt Management Integrations

#### Features

- **[OpenTelemetry](../../docs/observability/otel)**
    - Use generation_name for span naming in logging method - [PR #14799](https://github.com/BerriAI/litellm/pull/14799)
- **[Langfuse](../../docs/proxy/logging#langfuse)**
    - Handle non-serializable objects in Langfuse logging - [PR #15148](https://github.com/BerriAI/litellm/pull/15148)
    - Set usage_details.total in Langfuse integration - [PR #15015](https://github.com/BerriAI/litellm/pull/15015)

#### Guardrails

- **[Javelin](../../docs/proxy/guardrails)**
    - Add Javelin standalone guardrails integration for LiteLLM Proxy - [PR #14983](https://github.com/BerriAI/litellm/pull/14983)
    - Add logging for important status fields in guardrails - [PR #15090](https://github.com/BerriAI/litellm/pull/15090)
    - Don't run post_call guardrail if no text is returned from Bedrock - [PR #15106](https://github.com/BerriAI/litellm/pull/15106)

#### Prompt Management

- **[GitLab](../../docs/proxy/prompt_management)**
    - GitLab-based prompt manager - [PR #14988](https://github.com/BerriAI/litellm/pull/14988)
---

## Spend Tracking, Budgets and Rate Limiting

- **Cost Tracking**
    - Proxy: end-user cost tracking in the Responses API - [PR #15124](https://github.com/BerriAI/litellm/pull/15124)
- **Parallel Request Limiter v3**
    - Use the well-known Redis Cluster hashing algorithm - [PR #15052](https://github.com/BerriAI/litellm/pull/15052)
    - Fixes to dynamic rate limiter v3 - add saturation detection - [PR #15119](https://github.com/BerriAI/litellm/pull/15119)
    - Dynamic rate limiter v3 - fixes for detecting saturation and for post-saturation behavior - [PR #15192](https://github.com/BerriAI/litellm/pull/15192)
- **Teams**
    - Add model-specific tpm/rpm limits to teams on LiteLLM - [PR #15044](https://github.com/BerriAI/litellm/pull/15044)
- **Configuration**
    - Add max requests env var - [PR #15007](https://github.com/BerriAI/litellm/pull/15007)
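For context on the rate-limiter change: the well-known Redis Cluster key-slot scheme hashes a key (or just its `{hash tag}`, so related keys land on the same node) with CRC16-XMODEM, modulo 16384. The sketch below is the standard algorithm as documented by Redis, not LiteLLM's actual implementation:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant), the checksum Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of Redis Cluster's 16384 slots.
    If the key contains a non-empty {tag}, only the tag is hashed,
    so e.g. {user1}:requests and {user1}:tokens share a slot."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384
```

Keeping a user's counters on one hash slot is what lets a limiter update them atomically on a Redis Cluster.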
---

## MCP Gateway

- **Server Configuration**
    - Specify forwardable headers and allowed/disallowed tools for MCP servers - [PR #15002](https://github.com/BerriAI/litellm/pull/15002)
    - Enforce server permissions on tool calls - [PR #15044](https://github.com/BerriAI/litellm/pull/15044)
    - MCP Gateway fine-grained tools addition - [PR #15153](https://github.com/BerriAI/litellm/pull/15153)
- **Bug Fixes**
    - Remove server-name prefix from MCP tools tests - [PR #14986](https://github.com/BerriAI/litellm/pull/14986)
    - Resolve regression with duplicate Mcp-Protocol-Version header - [PR #15050](https://github.com/BerriAI/litellm/pull/15050)
    - Fix test_mcp_server.py - [PR #15183](https://github.com/BerriAI/litellm/pull/15183)
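To illustrate the shape of per-server tool control: the exact config schema isn't shown in this changelog, so the keys below (`allowed_tools`, `forwardable_headers`, the server name) are hypothetical placeholders — check the MCP Gateway docs for the real field names:

```yaml
# Illustrative sketch only — key names are assumptions, not the confirmed schema.
mcp_servers:
  github_mcp:
    url: "https://example.com/mcp"
    allowed_tools:            # fine-grained allowlist per server
      - search_issues
      - get_pull_request
    forwardable_headers:      # request headers passed through to this server
      - x-request-id
```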
---

## Performance / Loadbalancing / Reliability improvements

- **Router Optimizations**
    - **+62.5% P99 latency improvement** - Remove router inefficiencies (from O(M*N) to O(1)) - [PR #15046](https://github.com/BerriAI/litellm/pull/15046)
    - Remove hasattr checks in Router - [PR #15082](https://github.com/BerriAI/litellm/pull/15082)
    - Remove double lookups - [PR #15084](https://github.com/BerriAI/litellm/pull/15084)
    - Optimize _filter_cooldown_deployments from O(n×m + k×n) to O(n) - [PR #15091](https://github.com/BerriAI/litellm/pull/15091)
    - Optimize unhealthy deployment filtering in the retry path (O(n*m) → O(n+m)) - [PR #15110](https://github.com/BerriAI/litellm/pull/15110)
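The O(M*N) → O(1) idea is the classic scan-versus-index trade. A minimal sketch (not the actual Router code; deployment fields here are illustrative): build a dict keyed by model group once, then each request is a hash lookup instead of a scan over every deployment:

```python
from collections import defaultdict

deployments = [
    {"model_group": "gpt-4", "id": "azure-1"},
    {"model_group": "gpt-4", "id": "openai-1"},
    {"model_group": "claude-sonnet-4-5", "id": "anthropic-1"},
]

# Before: scan every deployment on every request (cost grows with list size)
def lookup_scan(model_group):
    return [d for d in deployments if d["model_group"] == model_group]

# After: build the index once at router init; per-request lookup is O(1)
index = defaultdict(list)
for d in deployments:
    index[d["model_group"]].append(d)

def lookup_index(model_group):
    return index.get(model_group, [])

print([d["id"] for d in lookup_index("gpt-4")])  # → ['azure-1', 'openai-1']
```

The index must be rebuilt (or updated incrementally) when deployments are added or removed, which is the usual cost of this trade.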
- **Cache Optimizations**
    - Reduce complexity of InMemoryCache.evict_cache from O(n*log(n)) to O(log(n)) - [PR #15000](https://github.com/BerriAI/litellm/pull/15000)
    - Avoid expensive operations when the cache isn't available - [PR #15182](https://github.com/BerriAI/litellm/pull/15182)
- **Metrics & Monitoring**
    - Add support for tracking LiteLLM overhead on cache hits - [PR #15045](https://github.com/BerriAI/litellm/pull/15045)
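The O(log n) eviction typically comes from keeping an expiry-ordered min-heap alongside the store, so the next-to-expire entry is a heap pop instead of a full sort of all keys. A generic sketch of that approach (not LiteLLM's actual `InMemoryCache`):

```python
import heapq
import time

class ExpiringCache:
    """Sketch: dict store plus an (expiry, key) min-heap, so eviction
    pops the soonest-expiring entry in O(log n)."""

    def __init__(self):
        self._store = {}   # key -> (value, expiry)
        self._heap = []    # (expiry, key), may hold stale entries

    def set(self, key, value, ttl):
        expiry = time.monotonic() + ttl
        self._store[key] = (value, expiry)
        heapq.heappush(self._heap, (expiry, key))

    def evict_expired(self):
        now = time.monotonic()
        while self._heap and self._heap[0][0] <= now:
            _, key = heapq.heappop(self._heap)
            # Skip stale heap entries: the key may have been re-set
            # with a later expiry since this entry was pushed.
            if key in self._store and self._store[key][1] <= now:
                del self._store[key]

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expiry = item
        return None if expiry <= time.monotonic() else value
```

The stale-entry check is the standard price of lazy heap deletion: re-setting a key leaves the old heap entry behind, and eviction simply skips it.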
---

## Documentation Updates

- **Provider Documentation**
    - Update LiteLLM docs from latest release - [PR #15004](https://github.com/BerriAI/litellm/pull/15004)
    - Add missing api_key parameter - [PR #15058](https://github.com/BerriAI/litellm/pull/15058)
- **General Documentation**
    - Use `docker compose` instead of `docker-compose` - [PR #15024](https://github.com/BerriAI/litellm/pull/15024)
    - Add Railtracks to projects that are using LiteLLM - [PR #15144](https://github.com/BerriAI/litellm/pull/15144)
    - Perf: last week's improvements - [PR #15193](https://github.com/BerriAI/litellm/pull/15193)
    - Sync models GitHub documentation with Loom video and cross-reference - [PR #15191](https://github.com/BerriAI/litellm/pull/15191)
---

## Security Fixes

- **JWT Token Security** - Don't log JWT SSO token on `.info()` log - [PR #15145](https://github.com/BerriAI/litellm/pull/15145)
---

## New Contributors

* @herve-ves made their first contribution in [PR #14998](https://github.com/BerriAI/litellm/pull/14998)
* @wenxi-onyx made their first contribution in [PR #15008](https://github.com/BerriAI/litellm/pull/15008)
* @jpetrucciani made their first contribution in [PR #15005](https://github.com/BerriAI/litellm/pull/15005)
* @abhijitjavelin made their first contribution in [PR #14983](https://github.com/BerriAI/litellm/pull/14983)
* @ZeroClover made their first contribution in [PR #15039](https://github.com/BerriAI/litellm/pull/15039)
* @cedarm made their first contribution in [PR #15043](https://github.com/BerriAI/litellm/pull/15043)
* @Isydmr made their first contribution in [PR #15025](https://github.com/BerriAI/litellm/pull/15025)
* @serializer made their first contribution in [PR #15013](https://github.com/BerriAI/litellm/pull/15013)
* @eddierichter-amd made their first contribution in [PR #14840](https://github.com/BerriAI/litellm/pull/14840)
* @malags made their first contribution in [PR #15000](https://github.com/BerriAI/litellm/pull/15000)
* @henryhwang made their first contribution in [PR #15029](https://github.com/BerriAI/litellm/pull/15029)
* @plafleur made their first contribution in [PR #15111](https://github.com/BerriAI/litellm/pull/15111)
* @tyler-liner made their first contribution in [PR #14799](https://github.com/BerriAI/litellm/pull/14799)
* @Amir-R25 made their first contribution in [PR #15144](https://github.com/BerriAI/litellm/pull/15144)
* @georg-wolflein made their first contribution in [PR #15124](https://github.com/BerriAI/litellm/pull/15124)
* @niharm made their first contribution in [PR #15140](https://github.com/BerriAI/litellm/pull/15140)
* @anthony-liner made their first contribution in [PR #15015](https://github.com/BerriAI/litellm/pull/15015)
* @rishiganesh2002 made their first contribution in [PR #15153](https://github.com/BerriAI/litellm/pull/15153)
* @danielaskdd made their first contribution in [PR #15160](https://github.com/BerriAI/litellm/pull/15160)
* @JVenberg made their first contribution in [PR #15146](https://github.com/BerriAI/litellm/pull/15146)
* @speglich made their first contribution in [PR #15072](https://github.com/BerriAI/litellm/pull/15072)
* @daily-kim made their first contribution in [PR #14764](https://github.com/BerriAI/litellm/pull/14764)
---

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.77.5.rc.4...v1.77.7.rc.1)**
