
Commit 23cbb91: Update README.md
1 parent 37e9f33

File tree: 1 file changed, +21 −75 lines

optillm/plugins/proxy/README.md

Lines changed: 21 additions & 75 deletions
````diff
@@ -48,82 +48,36 @@ routing:
 ### 2. Start OptiLLM Server
 
 ```bash
-# Option A: Use proxy as default for ALL requests (recommended)
-optillm --approach proxy
-
-# Option B: Start server normally (requires model prefix or extra_body)
+# Start server normally
 optillm
 
 # With custom port
-optillm --approach proxy --port 8000
+optillm --port 8000
 ```
 
+> **Note**: The `--approach proxy` flag is not currently supported. Use the model prefix method below.
+
 ### 3. Usage Examples
 
-#### When using `--approach proxy` (Recommended)
+#### Using Model Prefix (Currently the only working method)
 ```bash
-# No need for "proxy-" prefix! The proxy handles all requests automatically
+# Use "proxy-" prefix to activate the proxy plugin
 curl -X POST http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
-    "model": "gpt-4",
+    "model": "proxy-gpt-4",
     "messages": [{"role": "user", "content": "Hello"}]
   }'
 
 # The proxy will:
 # 1. Route to one of your configured providers
-# 2. Apply model mapping if configured
+# 2. Apply model mapping if configured
 # 3. Handle failover automatically
 ```
 
-#### Without `--approach proxy` flag
-```bash
-# Method 1: Use model prefix
-curl -X POST http://localhost:8000/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "proxy-gpt-4",
-    "messages": [{"role": "user", "content": "Hello"}]
-  }'
-
-# Method 2: Use extra_body
-curl -X POST http://localhost:8000/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4",
-    "messages": [{"role": "user", "content": "Hello"}],
-    "extra_body": {
-      "optillm_approach": "proxy"
-    }
-  }'
-```
-
-#### Proxy with Approach/Plugin
-```bash
-# Use MOA approach with proxy load balancing
-curl -X POST http://localhost:8000/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4",
-    "messages": [{"role": "user", "content": "Solve this problem"}],
-    "extra_body": {
-      "optillm_approach": "proxy",
-      "proxy_wrap": "moa"
-    }
-  }'
-
-# Use memory plugin with proxy
-curl -X POST http://localhost:8000/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4",
-    "messages": [{"role": "user", "content": "Remember this"}],
-    "extra_body": {
-      "optillm_approach": "proxy",
-      "proxy_wrap": "memory"
-    }
-  }'
-```
+> **Known Issues**:
+> - `--approach proxy` flag: Not supported in command-line interface
+> - `extra_body` method: Currently broken due to parsing bug in server code
 
 #### Combined Approaches
 ```bash
````
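The `proxy-` prefix convention introduced in the hunk above can be sketched in Python. This is an illustrative helper only; the function name and logic are assumptions about how a server might peel the approach prefix off the model name before routing, not the actual optillm plugin code:

```python
# Illustrative sketch, not optillm source: split an approach prefix such as
# "proxy-" off the requested model name before routing the request.
def split_approach_prefix(model: str, known_approaches=("proxy",)):
    """Return (approach, underlying_model); approach is None if no prefix matches."""
    for approach in known_approaches:
        prefix = approach + "-"
        if model.startswith(prefix):
            return approach, model[len(prefix):]
    return None, model

# "proxy-gpt-4" activates the proxy plugin and routes the bare "gpt-4" upstream.
print(split_approach_prefix("proxy-gpt-4"))  # ('proxy', 'gpt-4')
print(split_approach_prefix("gpt-4"))        # (None, 'gpt-4')
```

With this convention, the unprefixed model name is what the configured providers ultimately see, which is why the later model-mapping examples key on `gpt-4` rather than `proxy-gpt-4`.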
````diff
@@ -136,6 +90,8 @@ curl -X POST http://localhost:8000/v1/chat/completions \
 }'
 ```
 
+> **Note**: The proxy wrapping functionality (`proxy_wrap`) is currently not accessible via the working model prefix method. This would require the `extra_body` approach, which is currently broken.
+
 ## Configuration Reference
 
 ### Provider Configuration
````
````diff
@@ -203,7 +159,7 @@ providers:
 
 ### Model-Specific Routing
 
-When using `--approach proxy`, the proxy automatically maps model names to provider-specific deployments:
+The proxy automatically maps model names to provider-specific deployments:
 
 ```yaml
 providers:
````
````diff
@@ -222,9 +178,9 @@ providers:
 # No model_map needed - uses model names as-is
 ```
 
-With this configuration and `optillm --approach proxy`:
-- Request for "gpt-4" → Azure uses "gpt-4-deployment-001", OpenAI uses "gpt-4"
-- Request for "gpt-3.5-turbo" → Azure uses "gpt-35-turbo-deployment", OpenAI uses "gpt-3.5-turbo"
+With this configuration and `proxy-gpt-4` model requests:
+- Request for "proxy-gpt-4" → Azure uses "gpt-4-deployment-001", OpenAI uses "gpt-4"
+- Request for "proxy-gpt-3.5-turbo" → Azure uses "gpt-35-turbo-deployment", OpenAI uses "gpt-3.5-turbo"
 
 ### Failover Configuration
 
````
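The mapping and failover semantics described in the hunks above can be sketched as follows. The provider table and the `resolve`/`route` helpers are hypothetical illustrations of the YAML behavior (map the logical model name per provider, fall through to the next healthy provider), not the real optillm proxy implementation:

```python
# Hypothetical sketch of per-provider model mapping with failover; mirrors the
# README's Azure/OpenAI example but is not the actual optillm proxy code.
PROVIDERS = [
    {"name": "azure",
     "model_map": {"gpt-4": "gpt-4-deployment-001",
                   "gpt-3.5-turbo": "gpt-35-turbo-deployment"}},
    {"name": "openai", "model_map": {}},  # no model_map: use model names as-is
]

def resolve(provider, model):
    """Map a logical model name to this provider's deployment name."""
    return provider["model_map"].get(model, model)

def route(model, healthy):
    """Pick the first healthy provider; later entries act as failover targets."""
    for p in PROVIDERS:
        if healthy(p["name"]):
            return p["name"], resolve(p, model)
    raise RuntimeError("no healthy provider available")

# Azure healthy: "gpt-4" becomes the Azure deployment name.
print(route("gpt-4", lambda name: name == "azure"))
# Azure down, OpenAI healthy: failover, model name passes through unchanged.
print(route("gpt-4", lambda name: name == "openai"))
```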
````diff
@@ -358,31 +314,21 @@ client = OpenAI(
     api_key="dummy"  # Can be any string when using proxy
 )
 
-# If server started with --approach proxy:
+# Use proxy with model prefix (currently the only working method)
 response = client.chat.completions.create(
-    model="gpt-4",  # No "proxy-" prefix needed!
+    model="proxy-gpt-4",  # Use "proxy-" prefix
     messages=[{"role": "user", "content": "Hello"}]
 )
-
-# Or explicitly use proxy with another approach:
-response = client.chat.completions.create(
-    model="gpt-4",
-    messages=[{"role": "user", "content": "Hello"}],
-    extra_body={
-        "optillm_approach": "proxy",
-        "proxy_wrap": "moa"  # Proxy will route MOA's requests
-    }
-)
 ```
 
 ### With LangChain
 ```python
 from langchain.llms import OpenAI
 
-# If server started with --approach proxy:
+# Use proxy with model prefix
 llm = OpenAI(
     openai_api_base="http://localhost:8000/v1",
-    model_name="gpt-4"  # Proxy handles routing automatically
+    model_name="proxy-gpt-4"  # Use "proxy-" prefix
 )
 
 response = llm("What is the meaning of life?")
````