Commit 82d6c4d

Merge branch 'main' into litellm_dev_08_31_2025_p1
2 parents f1f9f2a + 6d36219 commit 82d6c4d

63 files changed: +1834 -438 lines changed

.circleci/config.yml

Lines changed: 2 additions & 0 deletions
@@ -1477,6 +1477,7 @@ jobs:
 docker run -d \
   -p 4000:4000 \
   -e DATABASE_URL=$PROXY_DATABASE_URL \
+  -e DEFAULT_NUM_WORKERS_LITELLM_PROXY=1 \
   -e DISABLE_SCHEMA_UPDATE="True" \
   -v $(pwd)/litellm/proxy/example_config_yaml/bad_schema.prisma:/app/schema.prisma \
   -v $(pwd)/litellm/proxy/example_config_yaml/bad_schema.prisma:/app/litellm/proxy/schema.prisma \
@@ -2962,6 +2963,7 @@ jobs:
 command: |
   docker run --name my-app \
     -p 4000:4000 \
+    -e DEFAULT_NUM_WORKERS_LITELLM_PROXY=1 \
     -e DATABASE_URL="postgresql://wrong:wrong@wrong:5432/wrong" \
     myapp:latest \
     --port 4000 > docker_output.log 2>&1 || true

deploy/charts/litellm-helm/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.4.5
+version: 0.4.6
 
 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application. Versions are not expected to

deploy/charts/litellm-helm/README.md

Lines changed: 5 additions & 0 deletions
@@ -41,6 +41,11 @@ If `db.useStackgresOperator` is used (not yet implemented):
 | `proxyConfigMap.key` | Key in the ConfigMap that contains the proxy config file. | `"config.yaml"` |
 | `proxy_config.*` | See [values.yaml](./values.yaml) for default settings. Rendered into the ConfigMap’s `config.yaml` only when `proxyConfigMap.create=true`. See [example_config_yaml](../../../litellm/proxy/example_config_yaml/) for configuration examples. | `N/A` |
 | `extraContainers[]` | An array of additional containers to be deployed as sidecars alongside the LiteLLM Proxy.
+| `pdb.enabled` | Enable a PodDisruptionBudget for the LiteLLM proxy Deployment | `false` |
+| `pdb.minAvailable` | Minimum number/percentage of pods that must be available during **voluntary** disruptions (choose **one** of minAvailable/maxUnavailable) | `null` |
+| `pdb.maxUnavailable` | Maximum number/percentage of pods that can be unavailable during **voluntary** disruptions (choose **one** of minAvailable/maxUnavailable) | `null` |
+| `pdb.annotations` | Extra metadata annotations to add to the PDB | `{}` |
+| `pdb.labels` | Extra metadata labels to add to the PDB | `{}` |
 
 #### Example `proxy_config` ConfigMap from values (default):
 
deploy/charts/litellm-helm/templates/NOTES.txt

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@
   echo "Visit http://127.0.0.1:8080 to use your application"
   kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
 {{- end }}
+PDB: {{ if .Values.pdb.enabled }}enabled{{ else }}disabled{{ end }}. Configure via .Values.pdb.*
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+{{- /*
+  PodDisruptionBudget for LiteLLM proxy
+  Controlled via .Values.pdb.enabled and .Values.pdb.{minAvailable|maxUnavailable}
+  Only one of minAvailable / maxUnavailable should be set. If both are set, minAvailable wins.
+*/ -}}
+{{- if .Values.pdb.enabled }}
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: {{ include "litellm.fullname" . }}
+  labels:
+    {{- include "litellm.labels" . | nindent 4 }}
+    {{- with .Values.pdb.labels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
+  {{- with .Values.pdb.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+spec:
+  selector:
+    matchLabels:
+      {{- /* Match the Deployment selector to target the same pod set */ -}}
+      {{- include "litellm.selectorLabels" . | nindent 6 }}
+  {{- if .Values.pdb.minAvailable }}
+  minAvailable: {{ .Values.pdb.minAvailable }}
+  {{- else if .Values.pdb.maxUnavailable }}
+  maxUnavailable: {{ .Values.pdb.maxUnavailable }}
+  {{- else }}
+  # Safe default if enabled but not configured
+  maxUnavailable: 1
+  {{- end }}
+{{- end }}
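The branching in the PodDisruptionBudget template (minAvailable wins over maxUnavailable, with `maxUnavailable: 1` as the fallback when neither is set) can be modelled in a few lines of Python. This is only an illustrative sketch and not part of the chart: `pdb_spec` is a hypothetical helper, and it assumes Python truthiness is a close enough stand-in for Helm's `if`, which treats `0`, `""`, and `null` as unset.

```python
from typing import Optional, Union

# A PDB bound is either an absolute pod count or a percentage string like "50%".
IntOrPercent = Union[int, str]


def pdb_spec(
    enabled: bool,
    min_available: Optional[IntOrPercent] = None,
    max_unavailable: Optional[IntOrPercent] = None,
) -> Optional[dict]:
    """Hypothetical model of the template's branching.

    Returns None when no PDB would be rendered, otherwise the single
    disruption field that would appear under `spec`.
    """
    if not enabled:
        return None  # template renders nothing when pdb.enabled is false
    if min_available:  # falsy values (None, 0, "") fall through, as in Helm
        return {"minAvailable": min_available}
    if max_unavailable:
        return {"maxUnavailable": max_unavailable}
    return {"maxUnavailable": 1}  # fallback when enabled but not configured


print(pdb_spec(True, "50%", 1))
print(pdb_spec(True))
```

One consequence of the truthiness check worth noting: under this model an explicit `minAvailable: 0` is treated the same as leaving the value unset, so the next branch is taken.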
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+suite: "pdb enabled"
+templates:
+  - poddisruptionbudget.yaml
+tests:
+  - it: "renders a PDB with maxUnavailable=1"
+    set:
+      pdb.enabled: true
+      pdb.maxUnavailable: 1
+    asserts:
+      - hasDocuments: { count: 1 }
+      - isKind: { of: PodDisruptionBudget }
+      - equal: { path: apiVersion, value: policy/v1 }
+      - equal: { path: spec.maxUnavailable, value: 1 }
+      - equal:
+          path: spec.selector.matchLabels
+          value:
+            app.kubernetes.io/name: litellm
+            app.kubernetes.io/instance: RELEASE-NAME
+
+---
+suite: "pdb disabled"
+templates:
+  - poddisruptionbudget.yaml
+tests:
+  - it: "does not render when disabled"
+    set:
+      pdb.enabled: false
+    asserts:
+      - hasDocuments: { count: 0 }
+
+---
+suite: "pdb minAvailable precedence"
+templates:
+  - poddisruptionbudget.yaml
+tests:
+  - it: "uses minAvailable when both are set"
+    set:
+      pdb.enabled: true
+      pdb.minAvailable: "50%"
+      pdb.maxUnavailable: 1
+    asserts:
+      - isKind: { of: PodDisruptionBudget }
+      - equal: { path: apiVersion, value: policy/v1 }
+      - equal: { path: spec.minAvailable, value: "50%" }
+      - isNull: { path: spec.maxUnavailable }

deploy/charts/litellm-helm/values.yaml

Lines changed: 8 additions & 1 deletion
@@ -240,4 +240,11 @@ extraEnvVars: {
   # value: EXTRA_ENV_VAR_VALUE
 }
 
-
+# Pod Disruption Budget
+pdb:
+  enabled: false
+  # Set exactly one of the following. If both are set, minAvailable takes precedence.
+  minAvailable: null # e.g. "50%" or 1
+  maxUnavailable: null # e.g. 1 or "20%"
+  annotations: {}
+  labels: {}

docs/my-website/docs/exception_mapping.md

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ All exceptions can be imported from `litellm` - e.g. `from litellm import BadReq
 | 400 | UnsupportedParamsError | litellm.BadRequestError | Raised when unsupported params are passed |
 | 400 | ContextWindowExceededError| litellm.BadRequestError | Special error type for context window exceeded error messages - enables context window fallbacks |
 | 400 | ContentPolicyViolationError| litellm.BadRequestError | Special error type for content policy violation error messages - enables content policy fallbacks |
+| 400 | ImageFetchError | litellm.BadRequestError | Raised when there are errors fetching or processing images |
 | 400 | InvalidRequestError | openai.BadRequestError | Deprecated error, use BadRequestError instead |
 | 401 | AuthenticationError | openai.AuthenticationError |
 | 403 | PermissionDeniedError | openai.PermissionDeniedError |

docs/my-website/docs/proxy/config_settings.md

Lines changed: 1 addition & 0 deletions
@@ -431,6 +431,7 @@ router_settings:
 | DEFAULT_MOCK_RESPONSE_COMPLETION_TOKEN_COUNT | Default token count for mock response completions. Default is 20
 | DEFAULT_MOCK_RESPONSE_PROMPT_TOKEN_COUNT | Default token count for mock response prompts. Default is 10
 | DEFAULT_MODEL_CREATED_AT_TIME | Default creation timestamp for models. Default is 1677610602
+| DEFAULT_NUM_WORKERS_LITELLM_PROXY | Default number of workers for LiteLLM proxy. Default is 4. **We strongly recommend setting NUM Workers to Number of vCPUs available**
 | DEFAULT_PROMPT_INJECTION_SIMILARITY_THRESHOLD | Default threshold for prompt injection similarity. Default is 0.7
 | DEFAULT_POLLING_INTERVAL | Default polling interval for schedulers in seconds. Default is 0.03
 | DEFAULT_REASONING_EFFORT_DISABLE_THINKING_BUDGET | Default reasoning effort disable thinking budget. Default is 0
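The `DEFAULT_NUM_WORKERS_LITELLM_PROXY` row documents a default of 4 and recommends matching the worker count to the available vCPUs. A minimal sketch of that resolution follows. It is an assumption about how such a setting could be read, not LiteLLM's actual implementation: `resolve_num_workers` and `recommended_num_workers` are hypothetical helpers, and `os.cpu_count()` reports the host's logical CPUs, which may exceed a container's CPU limit.

```python
import os
from typing import Mapping, Optional

# Documented default for DEFAULT_NUM_WORKERS_LITELLM_PROXY.
DOCUMENTED_DEFAULT_WORKERS = 4


def resolve_num_workers(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical helper: read the worker count from the environment,
    falling back to the documented default when the variable is unset."""
    env = os.environ if env is None else env
    raw = env.get("DEFAULT_NUM_WORKERS_LITELLM_PROXY")
    return DOCUMENTED_DEFAULT_WORKERS if raw is None else int(raw)


def recommended_num_workers() -> int:
    """Hypothetical helper: one worker per logical CPU visible to Python.
    os.cpu_count() can return None, so fall back to a single worker."""
    return os.cpu_count() or 1


# The CI jobs in this commit pin the value to 1 via `-e DEFAULT_NUM_WORKERS_LITELLM_PROXY=1`.
print(resolve_num_workers({"DEFAULT_NUM_WORKERS_LITELLM_PROXY": "1"}))
print(resolve_num_workers({}))
```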

docs/my-website/docs/proxy/deploy.md

Lines changed: 1 addition & 4 deletions
@@ -12,10 +12,7 @@ To start using Litellm, run the following commands in a shell:
 
 ```bash
 # Get the code
-git clone https://github.com/BerriAI/litellm
-
-# Go to folder
-cd litellm
+curl -O https://raw.githubusercontent.com/BerriAI/litellm/main/docker-compose.yml
 
 # Add the master key - you can change this after setup
 echo 'LITELLM_MASTER_KEY="sk-1234"' > .env
