This package contains the LangChain integration with LiteLLM. LiteLLM is a library that simplifies calling Anthropic, Azure, Hugging Face, Replicate, and other model providers through a single interface.
For conceptual guides, tutorials, and examples on using these classes, see the LangChain Docs.
## Embeddings

Supported in v0.6.1+
Use `LiteLLMEmbeddings` to embed text across 100+ providers with a single, consistent interface. All configuration is explicit -- no environment variables required.
```python
from langchain_litellm import LiteLLMEmbeddings

embeddings = LiteLLMEmbeddings(
    model="openai/text-embedding-3-small",
    api_key="sk-...",
)

vectors = embeddings.embed_documents(["hello", "world"])
query_vector = embeddings.embed_query("hello")
```

Switch providers by changing `model` -- the interface stays the same:
```python
# Cohere
embeddings = LiteLLMEmbeddings(
    model="cohere/embed-english-v3.0",
    api_key="...",
    document_input_type="search_document",
    query_input_type="search_query",
)

# Azure OpenAI
embeddings = LiteLLMEmbeddings(
    model="azure/my-embedding-deployment",
    api_key="...",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
)

# Bedrock
embeddings = LiteLLMEmbeddings(
    model="bedrock/amazon.titan-embed-text-v1",
)
```

For load-balancing across multiple deployments of the same model, use `LiteLLMEmbeddingsRouter`:
```python
from litellm import Router

from langchain_litellm import LiteLLMEmbeddingsRouter

router = Router(model_list=[
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {
            "model": "openai/text-embedding-3-small",
            "api_key": "sk-key1",
        },
    },
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {
            "model": "openai/text-embedding-3-small",
            "api_key": "sk-key2",
        },
    },
])

embeddings = LiteLLMEmbeddingsRouter(router=router)
```

## Vertex AI Grounding (Google Search)
Supported in v0.3.5+
You can use Google Search grounding with Vertex AI models (e.g., `gemini-2.5-flash`). Citations and metadata are returned in `response_metadata` (batch) or `additional_kwargs` (streaming).
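To show where the grounding data lands in batch mode, here is a stand-in for `response_metadata` with the extraction pattern applied to it. The inner `groundingMetadata`/`webSearchQueries` fields are illustrative, not the full Vertex schema:

```python
# Stand-in for response.response_metadata in batch mode; the inner
# grounding fields shown are illustrative, not the full Vertex schema.
response_metadata = {
    "provider_specific_fields": [
        {"groundingMetadata": {"webSearchQueries": ["google stock price"]}}
    ]
}

provider_fields = response_metadata.get("provider_specific_fields")
# Vertex returns a list; the first item carries the grounding info
grounding = provider_fields[0] if provider_fields else None
```

The `.get(...)` plus `if provider_fields` guard matters because the key is only present when the model actually grounded its answer.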
Setup
import os
from langchain_litellm import ChatLiteLLM
os.environ["VERTEX_PROJECT"] = "your-project-id"
os.environ["VERTEX_LOCATION"] = "us-central1"
llm = ChatLiteLLM(model="vertex_ai/gemini-2.5-flash", temperature=0)Batch Usage
```python
# Invoke with the Google Search tool enabled
response = llm.invoke(
    "What is the current stock price of Google?",
    tools=[{"googleSearch": {}}],
)

# Access citations & metadata
provider_fields = response.response_metadata.get("provider_specific_fields")
if provider_fields:
    # Vertex returns a list; the first item contains the grounding info
    print(provider_fields[0])
```

### Streaming Usage
```python
stream = llm.stream(
    "What is the current stock price of Google?",
    tools=[{"googleSearch": {}}],
)

for chunk in stream:
    print(chunk.content, end="", flush=True)
    # Metadata is injected into the chunk where it arrives
    if "provider_specific_fields" in chunk.additional_kwargs:
        print("\n[Metadata Found]:", chunk.additional_kwargs["provider_specific_fields"])
```

See our Releases and Versioning policies.
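The streaming loop above can also be factored into a small accumulator that joins the text and collects any metadata as it arrives. A minimal sketch using stand-in chunk objects (`Chunk` here is a hypothetical stand-in exposing only the two attributes the loop uses; the real objects are `AIMessageChunk` instances):

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for AIMessageChunk: only the two attributes used.
@dataclass
class Chunk:
    content: str
    additional_kwargs: dict = field(default_factory=dict)

def accumulate(chunks):
    """Join streamed text and collect any provider-specific metadata."""
    parts, metadata = [], []
    for chunk in chunks:
        parts.append(chunk.content)
        if "provider_specific_fields" in chunk.additional_kwargs:
            metadata.append(chunk.additional_kwargs["provider_specific_fields"])
    return "".join(parts), metadata

text, meta = accumulate([
    Chunk("The answer is "),
    Chunk("42.", {"provider_specific_fields": [{"groundingMetadata": {}}]}),
])
```

Checking membership in `additional_kwargs` on every chunk matters because the grounding metadata arrives on whichever chunk the provider attaches it to, not necessarily the last one.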
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether that means a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see the Contributing Guide.