Commit 8c49b64

image added
1 parent cde0a9c commit 8c49b64

File tree

2 files changed: +2 −0 lines changed

cloud-infrastructure/ai-infra-gpu/ai-infrastructure/litellm/README.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
 
 In this tutorial we explain how to use a LiteLLM Proxy Server to call multiple LLM inference endpoints from a single interface. LiteLLM interacts with 100+ LLMs such as OpenAI, Cohere, NVIDIA Triton and NIM, etc. Here we will use two vLLM inference servers.
 
+![Hybrid shards](assets/images/litellm.avif "LiteLLM")
+
 ## Introduction
 
 LiteLLM provides a proxy server to manage auth, load balancing, and spend tracking across 100+ LLMs, all in the OpenAI format.
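A minimal sketch of what the proxy config for the two-vLLM-server setup described above could look like. The model alias, model IDs, hostnames, and ports here are illustrative assumptions, not values taken from this repo:

```yaml
model_list:
  - model_name: vllm-llama            # hypothetical alias that clients request
    litellm_params:
      model: hosted_vllm/meta-llama/Llama-3.1-8B-Instruct
      api_base: http://vllm-server-1:8000/v1
  - model_name: vllm-llama            # same alias -> proxy load-balances across both
    litellm_params:
      model: hosted_vllm/meta-llama/Llama-3.1-8B-Instruct
      api_base: http://vllm-server-2:8000/v1
```

Giving both entries the same `model_name` is what lets the proxy spread requests across the two backing servers.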
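Because the proxy speaks the OpenAI format, clients send a standard chat-completions request body regardless of which backend serves it. A sketch of that payload (the model alias and proxy URL are assumptions for illustration):

```python
import json

# Hypothetical alias registered in the proxy's config; the proxy
# routes it to one of the backing vLLM servers.
payload = {
    "model": "vllm-llama",
    "messages": [{"role": "user", "content": "Hello!"}],
}

# The proxy exposes an OpenAI-compatible endpoint, so this body
# would be POSTed to e.g. http://localhost:4000/v1/chat/completions.
body = json.dumps(payload)
print(body)
```

The same body works against any of the 100+ providers the proxy fronts; only the `model` alias changes.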
