Commit 553801c

docs
1 parent b7ca138 commit 553801c

2 files changed: +33 -25 lines changed

docs/my-website/release_notes/v1.75.5-stable/index.md

Lines changed: 0 additions & 24 deletions
@@ -50,30 +50,6 @@ pip install litellm==1.75.5.post2
 - **Oracle Cloud Infrastructure** - New LLM provider for calling models on Oracle Cloud Infrastructure.
 - **Digital Ocean's Gradient AI** - New LLM provider for calling models on Digital Ocean's Gradient AI platform.
 
-
-### 54% RPS Improvement
-
-Throughput increased by 54% (1,040 → 1,602 RPS, aggregated) per instance while maintaining a 40 ms median overhead. The improvement comes from fixing major O(n²) inefficiencies in the router, primarily caused by repeated use of in statements inside loops over large arrays. Tests were run with a database-only setup (no cache hits). As a result, p95 latency improved by 30% (2,700 → 1,900 ms), enhancing overall stability and scalability under heavy load.
-
----
-
-### Test Setup
-
-All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up of 500. The environment was configured to stress the routing layer and eliminate caching as a variable.
-
-**System Specs**
-
-- **CPU:** 8 vCPUs
-- **Memory:** 32 GB RAM
-
-**Configuration (config.yaml)**
-
-View the complete configuration: [gist.github.com/AlexsanderHamir/config.yaml](https://gist.github.com/AlexsanderHamir/53f7d554a5d2afcf2c4edb5b6be68ff4)
-
-**Load Script (no_cache_hits.py)**
-
-View the complete load testing script: [gist.github.com/AlexsanderHamir/no_cache_hits.py](https://gist.github.com/AlexsanderHamir/42c33d7a4dc7a57f56a78b560dee3a42)
-
 ---
 
 ### Risk of Upgrade

docs/my-website/release_notes/v1.77.5-stable/index.md

Lines changed: 33 additions & 1 deletion
@@ -11,6 +11,10 @@ authors:
     title: CTO, LiteLLM
     url: https://www.linkedin.com/in/reffajnaahsi/
     image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
+  - name: Alexsander Hamir
+    title: Backend Performance Engineer
+    url: https://www.linkedin.com/in/alexsander-baptista/
+    image_url: https://media.licdn.com/dms/image/v2/D5603AQGXnziu4kqNCQ/profile-displayphoto-crop_800_800/B56ZkxEcuOKEAI-/0/1757464874550?e=1762387200&v=beta&t=9SNXLsWhx8OnYPAMQ9fqAr02oevDYEAL2vMYg2f9ieg
 
 hide_table_of_contents: false
 ---
@@ -49,7 +53,35 @@ pip install litellm==1.77.5
 - **MCP OAuth 2.0 Support** - Enhanced authentication for Model Context Protocol integrations
 - **Scheduled Key Rotations** - Automated key rotation capabilities for enhanced security
 - **New Gemini 2.5 Flash & Flash-lite Models** - Latest September 2025 preview models with improved pricing and features
-- **Performance Improvements** - Critical InMemoryCache unbounded growth resolution
+- **Performance Improvements** - 54% RPS improvement
+
+---
+
+### Performance Improvements - 54% RPS Improvement
+
+Throughput increased by 54% (1,040 → 1,602 RPS, aggregated) per instance while maintaining a 40 ms median overhead. The improvement comes from fixing major O(n²) inefficiencies in the router, primarily caused by repeated `in` membership checks inside loops over large arrays. Tests were run with a database-only setup (no cache hits). p95 latency also improved by 30% (2,700 → 1,900 ms), which improves stability and scalability under heavy load.
+
+---
+
+### Test Setup
+
+All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up of 500. The environment was configured to stress the routing layer and eliminate caching as a variable.
+
+**System Specs**
+
+- **CPU:** 8 vCPUs
+- **Memory:** 32 GB RAM
+
+**Configuration (config.yaml)**
+
+View the complete configuration: [gist.github.com/AlexsanderHamir/config.yaml](https://gist.github.com/AlexsanderHamir/53f7d554a5d2afcf2c4edb5b6be68ff4)
+
+**Load Script (no_cache_hits.py)**
+
+View the complete load testing script: [gist.github.com/AlexsanderHamir/no_cache_hits.py](https://gist.github.com/AlexsanderHamir/42c33d7a4dc7a57f56a78b560dee3a42)
+
+---
+
 
 ## New Models / Updated Models
 
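The release note in this commit attributes the throughput gain to removing O(n²) work caused by `in` checks inside loops over large arrays. The sketch below illustrates that general pattern and one common fix. It is not LiteLLM's router code: the function names, the `deployments` structure, and the `healthy_ids` argument are all hypothetical and chosen only for illustration.

```python
import timeit


def filter_healthy_quadratic(deployments, healthy_ids):
    """Hypothetical slow path: `healthy_ids` is a list, so every
    `in` check scans it linearly, giving O(n * m) work overall."""
    return [d for d in deployments if d["id"] in healthy_ids]


def filter_healthy_linear(deployments, healthy_ids):
    """Hypothetical fast path: build a set once (O(m)), after which
    each `in` check is O(1) on average, giving O(n + m) overall."""
    healthy = set(healthy_ids)
    return [d for d in deployments if d["id"] in healthy]


if __name__ == "__main__":
    # Illustrative sizes only; these are not the benchmark from the release note.
    deployments = [{"id": f"deployment-{i}"} for i in range(5_000)]
    healthy_ids = [f"deployment-{i}" for i in range(0, 5_000, 2)]

    slow = timeit.timeit(
        lambda: filter_healthy_quadratic(deployments, healthy_ids), number=3
    )
    fast = timeit.timeit(
        lambda: filter_healthy_linear(deployments, healthy_ids), number=3
    )
    print(f"list membership: {slow:.3f}s, set membership: {fast:.3f}s")
```

Both functions return the same result in the same order; only the cost of each membership check changes. The actual speedup depends on input sizes and on where the checks sit in the request path, so the timings printed here say nothing about the 54% figure reported above.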