diff --git a/integrations/libraries/langchain-js.mdx b/integrations/libraries/langchain-js.mdx index 8c70323a..175975cc 100644 --- a/integrations/libraries/langchain-js.mdx +++ b/integrations/libraries/langchain-js.mdx @@ -1,268 +1,464 @@ --- title: "Langchain (JS/TS)" -description: "Portkey adds core production capabilities to any Langchain app." +description: "Add Portkey's enterprise features to any Langchain app—observability, reliability, caching, and cost control." --- -This guide covers the integration for the **Javascript / Typescript** flavour of Langchain. Docs for the Python Langchain integration are [here](/integrations/libraries/langchain-python). +This guide covers Langchain **JavaScript/TypeScript**. For Python, see [Langchain Python](/integrations/libraries/langchain-python). -**LangChain** is a framework for developing applications powered by language models. It enables applications that: +Langchain provides a unified interface for building LLM applications. Add Portkey to get production-grade features: full observability, automatic fallbacks, semantic caching, and cost controls—all without changing your Langchain code. -* **Are context-aware**: connect a language model to sources of context (prompt instructions, few shot examples, content to ground its response in, etc.) -* **Reason**: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.) +## Quick Start -You can find more information about it [here](https://python.langchain.com/docs/tutorials/). +Add Portkey to any Langchain app with 3 parameters: -When using Langchain, **Portkey can help take it to production** by adding a fast AI gateway, observability, prompt management and more to your Langchain app. +```javascript +import { ChatOpenAI } from "@langchain/openai"; -## Quick Start Integration +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", // Provider slug from Model Catalog + configuration: { + baseURL: "https://api.portkey.ai/v1" + }, + apiKey: "PORTKEY_API_KEY" // Your Portkey API key +}); -Install the Portkey and Langchain SDKs to get started. +const response = await model.invoke("Tell me a joke"); +console.log(response.content); +``` -```sh -npm install langchain portkey-ai @langchain/openai + + + + +That's it! You now get: +- ✅ Full observability (costs, latency, logs) +- ✅ Dynamic model selection per request +- ✅ Automatic fallbacks and retries (via configs) +- ✅ Budget controls per team/project + +## Why Add Portkey to Langchain? + +Langchain handles application orchestration. Portkey adds production features: + + + + Every request logged with costs, latency, tokens. Team-level analytics and debugging. + + + Switch models per request. Route simple queries to cheap models, complex to advanced—automatically tracked. + + + Automatic fallbacks, smart retries, load balancing—configured once, works everywhere. + + + Budget limits per team/project. Rate limiting. Centralized credential management. + + + +## Setup + +### 1. Install Packages + +```bash +npm install @langchain/openai portkey-ai ``` - -Since Portkey is fully compatible with the OpenAI signature, you can connect to the Portkey Ai Gateway through the `ChatOpenAI` interface. +### 2. Add Provider in Model Catalog -* Set the `baseURL` as `PORTKEY_GATEWAY_URL` -* Add `defaultHeaders` to consume the headers needed by Portkey using the `createHeaders` helper method. - +1. Go to [**Model Catalog → Add Provider**](https://app.portkey.ai/model-catalog/providers) +2. 
Select your provider (OpenAI, Anthropic, Google, etc.) +3. Choose existing credentials or create new by entering your API keys +4. Name your provider (e.g., `openai-prod`) -We can now initialise the model and update the model to use Portkey's AI gateway +Your provider slug will be **`@openai-prod`** (or whatever you named it). -```js -import { ChatOpenAI } from "@langchain/openai"; -import { createHeaders, PORTKEY_GATEWAY_URL} from "portkey-ai" + + Set up budgets, rate limits, and manage credentials + -const PORTKEY_API_KEY = "..." -const PROVIDER_API_KEY = "..." // Add the API key of the AI provider being used +### 3. Get Portkey API Key -const portkeyConf = { - baseURL: PORTKEY_GATEWAY_URL, - defaultHeaders: createHeaders({apiKey: PORTKEY_API_KEY, provider: "openai"}) -} +Create your Portkey API key at [app.portkey.ai/api-keys](https://app.portkey.ai/api-keys) -const chatModel = new ChatOpenAI({ - apiKey: PROVIDER_API_KEY, - configuration: portkeyConf +### 4. Use in Your Code + +Replace your existing `ChatOpenAI` initialization: + +```javascript +// Before (direct to OpenAI) +const model = new ChatOpenAI({ + model: "gpt-4o", + apiKey: "OPENAI_API_KEY" }); -const response = await chatModel.invoke("What is the meaning of life, universe and everything?"); -console.log("Response:", response); +// After (via Portkey) +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { + baseURL: "https://api.portkey.ai/v1" + }, + apiKey: "PORTKEY_API_KEY" +}); ``` -Response - - - -```js -AIMessage { - lc_serializable: true, - lc_kwargs: { - content: `The phrase "the meaning of life, universe, and everything" is a reference to Douglas Adams' science fiction series, "The Hitchhiker's Guide to the Galaxy." In the series, a supercomputer called Deep Thought was asked to calculate the Answer to the Ultimate Question of Life, the Universe, and Everything. After much deliberation, Deep Thought revealed that the answer was simply the number 42.\n` + - '\n' + - 'In the context of the series, the number 42 is meant to highlight the absurdity and unknowability of the ultimate meaning of life and the universe. It is a humorous and satirical take on the deep philosophical questions that have puzzled humanity for centuries.\n' + - '\n' + - 'Ultimately, the meaning of life, universe, and everything is a complex and deeply personal question that each individual must grapple with and find their own answer to. It may be different for each person and can encompass a wide range of beliefs, values, and experiences.', - tool_calls: [], - invalid_tool_calls: [], - additional_kwargs: { function_call: undefined, tool_calls: undefined }, - response_metadata: {} - }, - lc_namespace: [ 'langchain_core', 'messages' ], - content: `The phrase "the meaning of life, universe, and everything" is a reference to Douglas Adams' science fiction series, "The Hitchhiker's Guide to the Galaxy." In the series, a supercomputer called Deep Thought was asked to calculate the Answer to the Ultimate Question of Life, the Universe, and Everything. After much deliberation, Deep Thought revealed that the answer was simply the number 42.\n` + - '\n' + - 'In the context of the series, the number 42 is meant to highlight the absurdity and unknowability of the ultimate meaning of life and the universe. 
It is a humorous and satirical take on the deep philosophical questions that have puzzled humanity for centuries.\n' + - '\n' + - 'Ultimately, the meaning of life, universe, and everything is a complex and deeply personal question that each individual must grapple with and find their own answer to. It may be different for each person and can encompass a wide range of beliefs, values, and experiences.', - name: undefined, - additional_kwargs: { function_call: undefined, tool_calls: undefined }, - response_metadata: { - tokenUsage: { completionTokens: 186, promptTokens: 18, totalTokens: 204 }, - finish_reason: 'stop' - }, - tool_calls: [], - invalid_tool_calls: [] -} -``` +**That's the only change needed!** All your existing Langchain code (agents, chains, LCEL, etc.) works exactly the same. -The call and the corresponding prompt will also be visible on the Portkey logs tab. +## Switching Between Providers - - - +Just change the model string—everything else stays the same: -## Using Virtual Keys for Multiple Models +```javascript +// OpenAI +const openaiModel = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY" +}); -Portkey supports [Virtual Keys](/product/ai-gateway/virtual-keys) which are an easy way to store and manage API keys in a secure vault. Lets try using a Virtual Key to make LLM calls. +// Anthropic +const anthropicModel = new ChatOpenAI({ + model: "@anthropic-prod/claude-sonnet-4", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY" +}); -#### 1\. Create a Virtual Key in your Portkey account and the id +// Google Gemini +const geminiModel = new ChatOpenAI({ + model: "@google-prod/gemini-2.0-flash", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY" +}); +``` -Let's try creating a new Virtual Key for Mistral like this + +Portkey implements OpenAI-compatible APIs for all providers, so you always use `ChatOpenAI` regardless of which model you're calling. + - - - +## Using with Langchain Agents -#### 2\. Use Virtual Keys in the Portkey Headers +Langchain agents are the primary use case. Portkey works seamlessly with agent workflows: -The `virtualKey` parameter sets the authentication and provider for the AI provider being used. In our case we're using the Mistral Virtual key. +```javascript +import { ChatOpenAI } from "@langchain/openai"; +import { tool } from "@langchain/core/tools"; +import { createReactAgent } from "@langchain/langgraph/prebuilt"; +import { z } from "zod"; + +// Define tools +const searchTool = tool( + async ({ query }) => `Results for: ${query}`, + { + name: "search", + description: "Search for information", + schema: z.object({ query: z.string() }) + } +); + +const weatherTool = tool( + async ({ location }) => `Weather in ${location}: Sunny, 72°F`, + { + name: "get_weather", + description: "Get weather for a location", + schema: z.object({ location: z.string() }) +} +); - -Notice that the `apiKey` can be left blank as that authentication won't be used. - +// Create model with Portkey +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY" +}); +// Create agent +const agent = createReactAgent({ + llm: model, + tools: [searchTool, weatherTool] +}); +// Run agent +const result = await agent.invoke({ + messages: [{ role: "user", content: "What's the weather in NYC?" 
}] +}); +``` -```js -import { ChatOpenAI } from "@langchain/openai"; -import { createHeaders, PORTKEY_GATEWAY_URL} from "portkey-ai" +Every agent step is logged in Portkey: +- Model calls with prompts and responses +- Tool executions with inputs and outputs +- Full trace of the agent's reasoning +- Costs and latency for each step -const PORTKEY_API_KEY = "..." -const MISTRAL_VK = "..." // Add the virtual key for mistral that we just created + + + -const portkeyConf = { - baseURL: PORTKEY_GATEWAY_URL, - defaultHeaders: createHeaders({apiKey: PORTKEY_API_KEY, virtualKey: MISTRAL_VK}) -} +## Works With All Langchain Features + +✅ **Agents** - Full compatibility with LangGraph agents +✅ **LCEL** - LangChain Expression Language +✅ **Chains** - All chain types supported +✅ **Streaming** - Token-by-token streaming +✅ **Tool Calling** - Function/tool calling +✅ **LangGraph** - Complex workflows + +### Streaming -const chatModel = new ChatOpenAI({ - apiKey: "X", - configuration: portkeyConf, - model: "mistral-large-latest" +```javascript +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY", + streaming: true }); -const response = await chatModel.invoke("What is the meaning of life, universe and everything?"); -console.log("Response:", response); +const stream = await model.stream("Write a short story"); +for await (const chunk of stream) { + process.stdout.write(chunk.content); +} ``` -The Portkey AI gateway will authenticate the API request to Mistral and get the response back in the OpenAI format for you to consume. - -The AI gateway extends Langchain's `ChatOpenAI` class making it a single interface to call any provider and any model. +### Chains & Prompts -## Embeddings +```javascript +import { ChatOpenAI } from "@langchain/openai"; +import { ChatPromptTemplate } from "@langchain/core/prompts"; -Embeddings in Langchain through Portkey work the same way as the Chat Models using the `OpenAIEmbeddings` class. Let's try to create an embedding using OpenAI's embedding model +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY" +}); -```js -import { OpenAIEmbeddings } from "@langchain/openai"; -import { createHeaders, PORTKEY_GATEWAY_URL} from "portkey-ai" +const prompt = ChatPromptTemplate.fromMessages([ + ["human", "Tell me a short joke about {topic}"] +]); -const PORTKEY_API_KEY = "..."; -const OPENAI_VK = "..." // Add OpenAI's Virtual Key created in Portkey +const chain = prompt.pipe(model); +const response = await chain.invoke({ topic: "ice cream" }); +console.log(response.content); +``` -const portkeyConf = { - baseURL: PORTKEY_GATEWAY_URL, - defaultHeaders: createHeaders({apiKey: PORTKEY_API_KEY, virtualKey: OPENAI_VK}) -} +### Tool Calling -/* Create instance */ -const embeddings = new OpenAIEmbeddings({ - apiKey: "X", - configuration: portkeyConf, +```javascript +import { ChatOpenAI } from "@langchain/openai"; +import { z } from "zod"; +import { tool } from "@langchain/core/tools"; + +const getWeather = tool( + async ({ location }) => `Weather in ${location}: Sunny, 72°F`, + { + name: "get_weather", + description: "Get current weather in a location", + schema: z.object({ + location: z.string().describe("City and state, e.g. 
San Francisco, CA") + }) + } +); + +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { baseURL: "https://api.portkey.ai/v1" }, + apiKey: "PORTKEY_API_KEY" }); -/* Embed queries */ -const res = await embeddings.embedQuery("Hello world"); -console.log("Response:", res); +const modelWithTools = model.bindTools([getWeather]); +const response = await modelWithTools.invoke("What's the weather in NYC?"); +console.log(response.tool_calls); ``` -## Chains & Prompts +## Dynamic Model Selection -[Chains](https://python.langchain.com/docs/modules/chains/) enable the integration of various Langchain concepts for simultaneous execution while Langchain supports [Prompt Templates](https://python.langchain.com/docs/modules/model%5Fio/prompts/) to construct inputs for language models. Lets see how this would work with Portkey +For dynamic model routing based on query complexity or task type, use **Portkey Configs** with conditional routing: -```js +```javascript import { ChatOpenAI } from "@langchain/openai"; -import { ChatPromptTemplate } from "@langchain/core/prompts"; -import { createHeaders, PORTKEY_GATEWAY_URL} from "portkey-ai" +import { createHeaders } from "portkey-ai"; -const PORTKEY_API_KEY = "..."; -const OPENAI_VK = "..." // Add OpenAI's Virtual Key created in Portkey - -const portkeyConf = { - baseURL: PORTKEY_GATEWAY_URL, - defaultHeaders: createHeaders({apiKey: PORTKEY_API_KEY, virtualKey: OPENAI_VK}) -} +// Define routing config (created in Portkey dashboard) +const config = { + strategy: { + mode: "conditional", + conditions: [ + { + query: { "metadata.complexity": { "$eq": "simple" } }, + then: "cheap-model" + }, + { + query: { "metadata.complexity": { "$eq": "complex" } }, + then: "advanced-model" + } + ], + default: "cheap-model" + }, + targets: [ + { + name: "cheap-model", + override_params: { model: "@openai-prod/gpt-4o-mini" } + }, + { + name: "advanced-model", + override_params: { model: "@openai-prod/o1" } + } + ] +}; + +const model = new ChatOpenAI({ + model: "gpt-4o", + configuration: { + baseURL: "https://api.portkey.ai/v1", + defaultHeaders: createHeaders({ config }) + }, + apiKey: "PORTKEY_API_KEY" +}); -// Initialise the chat model -const chatModel = new ChatOpenAI({ - apiKey: "X", - configuration: portkeyConf +// Route to cheap model +const response1 = await model.invoke("What is 2+2?", { + metadata: { complexity: "simple" } }); -// Define the chat prompt template -const prompt = ChatPromptTemplate.fromMessages([ - ["human", "Tell me a short joke about {topic}"], -]); +// Route to advanced model +const response2 = await model.invoke("Solve this differential equation...", { + metadata: { complexity: "complex" } +}); +``` -// Invoke the chain with the prompt and chat model -const chain = prompt.pipe(chatModel); -const res = await chain.invoke({ topic: "ice cream" }); +All routing decisions are tracked in Portkey with full observability—see which models were used, costs per model, and performance comparisons. -console.log(res) -``` + + Learn more about conditional routing and advanced patterns + -We'd be able to view the exact prompt that was used to make the call to OpenAI in the Portkey logs dashboards. +## Advanced Features via Configs -## Using Advanced Routing +For production features like fallbacks, caching, and load balancing, use Portkey Configs: -The Portkey AI Gateway brings capabilities like load-balancing, fallbacks, experimentation and canary testing to Langchain through a configuration-first approach. 
+```javascript +import { ChatOpenAI } from "@langchain/openai"; +import { createHeaders } from "portkey-ai"; + +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", + configuration: { + baseURL: "https://api.portkey.ai/v1", + defaultHeaders: createHeaders({ + config: "pc_your_config_id" // Created in Portkey dashboard + }) + }, + apiKey: "PORTKEY_API_KEY" +}); +``` -Let's take an **example** where we might want to split traffic between gpt-4 and claude-opus 50:50 to test the two large models. The gateway configuration for this would look like the following: +### Example: Load Balancing -```js +```javascript const config = { - "strategy": { - "mode": "loadbalance" + strategy: { mode: "loadbalance" }, + targets: [ + { + override_params: { model: "@openai-prod/gpt-4o" }, + weight: 0.5 + }, + { + override_params: { model: "@anthropic-prod/claude-sonnet-4" }, + weight: 0.5 + } + ] +}; + +const model = new ChatOpenAI({ + model: "gpt-4o", + configuration: { + baseURL: "https://api.portkey.ai/v1", + defaultHeaders: createHeaders({ config }) }, - "targets": [{ - "virtual_key": OPENAI_VK, // OpenAI's virtual key - "override_params": {"model": "gpt4"}, - "weight": 0.5 - }, { - "virtual_key": ANTHROPIC_VK, // Anthropic's virtual key - "override_params": {"model": "claude-3-opus-20240229"}, - "weight": 0.5 - }] -} + apiKey: "PORTKEY_API_KEY" +}); + +// Requests are distributed 50/50 between OpenAI and Anthropic +const response = await model.invoke("Hello!"); ``` -We can then use this `config` in our requests being made from langchain. + + Set up fallbacks, retries, caching, load balancing, and more + +## Embeddings +Create embeddings via Portkey: -```js -const portkeyConf = { - baseURL: PORTKEY_GATEWAY_URL, - defaultHeaders: createHeaders({apiKey: PORTKEY_API_KEY, config: config}) -} +```javascript +import { OpenAIEmbeddings } from "@langchain/openai"; -const chatModel = new ChatOpenAI({ - apiKey: "X", - configuration: portkeyConf, - maxTokens: 100 +const embeddings = new OpenAIEmbeddings({ + model: "text-embedding-3-small", + configuration: { + baseURL: "https://api.portkey.ai/v1", + defaultHeaders: { "x-portkey-provider": "@openai-prod" } + }, + apiKey: "PORTKEY_API_KEY" }); -const res = await chatModel.invoke("What is the meaning of life, universe and everything?") +const vectors = await embeddings.embedDocuments(["Hello world", "Goodbye world"]); +console.log(vectors); ``` -When the LLM is invoked, Portkey will distribute the requests to `gpt-4` and `claude-3-opus-20240229` in the ratio of the defined weights. - -You can find more config examples [here](/api-reference/config-object#examples). + +Portkey supports OpenAI embeddings via `OpenAIEmbeddings`. For other providers (Cohere, Voyage), use the **Portkey SDK directly** ([docs](/api-reference/inference-api/embeddings)). + -## Agents & Tracing +## Migration from Direct OpenAI -A powerful capability of Langchain is creating Agents. The challenge with agentic workflows is that prompts are often abstracted out and it's hard to get a visibility into what the agent is doing. This also makes debugging harder. +Already using Langchain with OpenAI? Just update 3 parameters: -Connect the Portkey configuration to the `ChatOpenAI` model and we'd be able to use all the benefits of the AI gateway as shown above. +```javascript +// Before +import { ChatOpenAI } from "@langchain/openai"; -Also, Portkey would capture the logs from the agent API calls giving us full visibility. 
+const model = new ChatOpenAI({ + model: "gpt-4o", + apiKey: process.env.OPENAI_API_KEY, + temperature: 0.7 +}); - - - +// After (add configuration, change model and apiKey) +const model = new ChatOpenAI({ + model: "@openai-prod/gpt-4o", // Add provider slug + configuration: { + baseURL: "https://api.portkey.ai/v1" // Add this + }, + apiKey: "PORTKEY_API_KEY", // Change to Portkey key + temperature: 0.7 // Keep existing params +}); +``` -This is extremely powerful since we gain control and visibility over the agent flows so we can identify problems and make updates as needed. +**Benefits:** +- Zero code changes to your existing Langchain logic +- Instant observability for all requests +- Production-grade reliability features +- Cost controls and budgets + +## Next Steps + + + + Set up providers, budgets, and access control + + + Configure fallbacks, caching, and routing + + + Track costs, performance, and usage + + + Add PII detection and content filtering + + + +For complete SDK documentation: + + + Complete Portkey SDK documentation + diff --git a/integrations/libraries/langchain-python.mdx b/integrations/libraries/langchain-python.mdx index 11960a71..a394e792 100644 --- a/integrations/libraries/langchain-python.mdx +++ b/integrations/libraries/langchain-python.mdx @@ -1,417 +1,513 @@ --- title: "Langchain (Python)" -description: "Supercharge Langchain apps with Portkey: Multi-LLM, observability, caching, reliability, and prompt management." +description: "Add Portkey's enterprise features to any Langchain app—observability, reliability, caching, and cost control." --- This guide covers Langchain **Python**. For JS, see [Langchain JS](/integrations/libraries/langchain-js). -Portkey extends Langchain's `ChatOpenAI` to effortlessly work with **1600+ LLMs** (Anthropic, Gemini, Mistral, etc.) without needing different SDKs. Portkey enhances your Langchain apps with interoperability, reliability, speed, cost-efficiency, and deep observability. +Langchain provides a unified interface for building LLM applications. Add Portkey to get production-grade features: full observability, automatic fallbacks, semantic caching, and cost controls—all without changing your Langchain code. -## Getting Started +## Quick Start -Integrate Portkey into Langchain easily. - -### 1. Install Packages - -```sh -pip install -U langchain-openai portkey-ai -``` - - -`langchain-openai` includes `langchain-core`. Install `langchain` or other specific packages if you need more components. - - -### 2. Basic Setup: `ChatOpenAI` with Portkey - -Configure `ChatOpenAI` to route requests via Portkey using your Portkey API Key and `createHeaders` method. 
+Add Portkey to any Langchain app with 3 parameters: ```python from langchain_openai import ChatOpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -import os - -# Use environment variables for API keys -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") -PROVIDER_API_KEY = os.environ.get("OPENAI_API_KEY") # Example: OpenAI API Key - -# Configure Portkey headers -portkey_headers = createHeaders( - api_key=PORTKEY_API_KEY, - provider="openai" # Specify target LLM provider -) -llm = ChatOpenAI( - api_key=PROVIDER_API_KEY, # Provider's API key - base_url=PORTKEY_GATEWAY_URL, # Route via Portkey - default_headers=portkey_headers, # Portkey specific headers - model="gpt-4o" # Specify provider model +model = ChatOpenAI( + model="@openai-prod/gpt-4o", # Provider slug from Model Catalog + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" # Your Portkey API key ) -# response = llm.invoke("Tell me a joke about AI.") -# print(response.content) +response = model.invoke("Tell me a joke") +print(response.content) ``` -**Key `ChatOpenAI` Parameters:** - -* `api_key`: The underlying provider's API key. -* `base_url`: Set to `PORTKEY_GATEWAY_URL` to route via Portkey. -* `default_headers`: Uses `createHeaders` for your `PORTKEY_API_KEY`. Can also include a `virtual_key` (for provider credentials) or a `config` ID (for advanced routing). - -All LLM calls via this `llm` instance now use Portkey, starting with observability. - - + -This setup enables Portkey's advanced features for Langchain. +That's it! You now get: +- ✅ Full observability (costs, latency, logs) +- ✅ Dynamic model selection per request +- ✅ Automatic fallbacks and retries (via configs) +- ✅ Budget controls per team/project -## Key Portkey Features for Langchain +## Why Add Portkey to Langchain? -Routing Langchain requests via Portkey's `ChatOpenAI` interface unlocks powerful capabilities: +Langchain handles application orchestration. Portkey adds production features: - -

Use `ChatOpenAI` for OpenAI, Anthropic, Gemini, Mistral, and more. Switch providers easily with Virtual Keys or Configs.

+ + Every request logged with costs, latency, tokens. Team-level analytics and debugging. - -

Reduce latency and costs with Portkey's Simple, Semantic, or Hybrid caching, enabled via Configs.

+ + Switch models per request. Route simple queries to cheap models, complex to advanced—automatically tracked. - -

Build robust apps with retries, timeouts, fallbacks, and load balancing, configured in Portkey.

+ + Automatic fallbacks, smart retries, load balancing—configured once, works everywhere. - -

Get deep insights: LLM usage, costs, latency, and errors are automatically logged in Portkey.

-
- -

Manage, version, and use prompts from Portkey's Prompt Library within Langchain.

-
- -

Securely manage LLM provider API keys using Portkey Virtual Keys in your Langchain setup.

+ + Budget limits per team/project. Rate limiting. Centralized credential management.
---- +## Setup -## 1. Multi-LLM Integration +### 1. Install Packages -Portkey simplifies using different LLM providers. `ChatOpenAI` becomes your universal client for numerous models. +```bash +pip install langchain-openai portkey-ai +``` -**Mechanism: Portkey Headers with Virtual Keys** +### 2. Add Provider in Model Catalog -Switch LLMs by changing the `virtual_key` in `createHeaders` and the `model` in `ChatOpenAI`. Portkey manages provider specifics. +1. Go to [**Model Catalog → Add Provider**](https://app.portkey.ai/model-catalog/providers) +2. Select your provider (OpenAI, Anthropic, Google, etc.) +3. Choose existing credentials or create new by entering your API keys +4. Name your provider (e.g., `openai-prod`) -### Example: Anthropic (Claude) +Your provider slug will be **`@openai-prod`** (or whatever you named it). -1. **Create Anthropic Virtual Key:** In Portkey, add your Anthropic API key and get the Virtual Key ID. + + Set up budgets, rate limits, and manage credentials + -2. **Update Langchain code:** +### 3. Get Portkey API Key -```python -from langchain_openai import ChatOpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -import os +Create your Portkey API key at [app.portkey.ai/api-keys](https://app.portkey.ai/api-keys) -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") -ANTHROPIC_PROVIDER = os.environ.get("ANTHROPIC_PROVIDER") +### 4. Use in Your Code -portkey_anthropic_headers = createHeaders( - api_key=PORTKEY_API_KEY, - provider="@anthropic" -) +Replace your existing `ChatOpenAI` initialization: -llm_claude = ChatOpenAI( - api_key="placeholder_anthropic_key", # Placeholder; Portkey uses Virtual Key - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_anthropic_headers, - model="claude-3-5-sonnet-latest" # Anthropic model +```python +# Before (direct to OpenAI) +model = ChatOpenAI( + model="gpt-4o", + api_key="OPENAI_API_KEY" ) -# response_claude = llm_claude.invoke("Haiku principles?") -# print(response_claude.content) +# After (via Portkey) +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) ``` -### Example: Vertex (Gemini) +**That's the only change needed!** All your existing Langchain code (agents, chains, LCEL, etc.) works exactly the same. -1. **Create Google Virtual Key:** In Portkey, add your [Vertex AI credentials](/integrations/llms/vertex-ai#how-to-find-your-google-vertex-project-details) and get the Virtual Key ID. +## Switching Between Providers -2. 
**Update Langchain code:** +Just change the model string—everything else stays the same: ```python -from langchain_openai import ChatOpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -import os - -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") -GOOGLE_PROVIDER = os.environ.get("GOOGLE_PROVIDER") - -portkey_gemini_headers = createHeaders( - api_key=PORTKEY_API_KEY, - provider="@google" +# OpenAI +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) -llm_gemini = ChatOpenAI( - api_key="placeholder_google_key", # Placeholder; Portkey uses Virtual Key - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_gemini_headers, - model="gemini-2.5-pro-preview" # Gemini model +# Anthropic +model = ChatOpenAI( + model="@anthropic-prod/claude-sonnet-4", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) -# response_gemini = llm_gemini.invoke("Zero-knowledge proofs?") -# print(response_gemini.content) +# Google Gemini +model = ChatOpenAI( + model="@google-prod/gemini-2.0-flash", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) ``` -Core `ChatOpenAI` structure remains. Only `virtual_key` and `model` change. Portkey maps responses to the OpenAI format Langchain expects. This extends to Mistral, Cohere, Azure, Bedrock via their Virtual Keys. + +Portkey implements OpenAI-compatible APIs for all providers, so you always use `ChatOpenAI` regardless of which model you're calling. + ---- +## Using with Langchain Agents -## 2. Advanced Caching +Langchain agents are the primary use case. Portkey works seamlessly with `create_agent`: -Portkey's caching reduces latency and LLM costs. Enable it via a Portkey [Config object](/api-reference/config-object) or a saved Config ID in `createHeaders`. +```python +from langchain.agents import create_agent +from langchain.tools import tool +from langchain_openai import ChatOpenAI -A Portkey Config can specify `mode` (`simple`, `semantic`, `hybrid`) and `max_age` (cache duration). +@tool +def search(query: str) -> str: + """Search for information.""" + return f"Results for: {query}" -### Example: Semantic Caching +@tool +def get_weather(location: str) -> str: + """Get weather for a location.""" + return f"Weather in {location}: Sunny, 72°F" -1. **Define/Save Portkey Config:** Create a Config in Portkey (e.g., `langchain-semantic-cache`) specifying caching strategy. +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) - ```json - // Example Portkey Config JSON for semantic cache - { - "cache": { "mode": "semantic", "max_age": "2h" }, - "provider":"@your_openai_virtual_key_id", - "override_params": { "model": "gpt-4o" } - } - ``` - Assume saved Config ID is `cfg_semantic_cache_123`. +agent = create_agent(model, tools=[search, get_weather]) -2. 
**Use Config ID in `createHeaders`:** +result = agent.invoke({ + "messages": [{"role": "user", "content": "What's the weather in NYC and search for AI news"}] +}) +``` -```python -from langchain_openai import ChatOpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -import os +Every agent step is logged in Portkey: +- Model calls with prompts and responses +- Tool executions with inputs and outputs +- Full trace of the agent's reasoning +- Costs and latency for each step -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") -PORTKEY_CONFIG_ID = "cfg_semantic_cache_123" + + + -portkey_cached_headers = createHeaders( - api_key=PORTKEY_API_KEY, - provider="@openai" -) +## Works With All Langchain Features -llm_cached = ChatOpenAI( - api_key="placeholder_key", # Config can handle auth via virtual_key - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_cached_headers -) +✅ **Agents** - Full compatibility with `create_agent` +✅ **LCEL** - LangChain Expression Language +✅ **Chains** - All chain types supported +✅ **Streaming** - Token-by-token streaming +✅ **Tool Calling** - Function/tool calling +✅ **LangGraph** - Complex workflows -# response1 = llm_cached.invoke("Capital of France?") -# response2 = llm_cached.invoke("Tell me France's capital.") -``` +### Streaming -Similar requests can now hit the cache. Monitor cache performance in your Portkey dashboard. +```python +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY", + streaming=True +) ---- +for chunk in model.stream("Write a short story"): + print(chunk.content, end="", flush=True) +``` -## 3. Enhanced Reliability +### Tool Calling -Portkey improves Langchain app resilience via Configs: +```python +from pydantic import BaseModel, Field -* **Retries:** Auto-retry failed LLM requests. -* **Fallbacks:** Define backup LLMs if a primary fails. -* **Load Balancing:** Distribute requests across keys or models. -* **Timeouts:** Set max request durations. +class GetWeather(BaseModel): + '''Get current weather in a location''' + location: str = Field(..., description="City and state, e.g. San Francisco, CA") -### Example: Fallbacks with Retries +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) -1. **Define/Save Portkey Config:** Create a Config for retries and fallbacks (e.g., `gpt-4o` then `claude-3-sonnet`). +model_with_tools = model.bind_tools([GetWeather]) +response = model_with_tools.invoke("What's the weather in NYC?") +print(response.tool_calls) +``` - ```json - // Example Portkey Config for reliability - { - "strategy": { "mode": "fallback" }, - "targets": [ - { "override_params": {"model": "gpt-4o"}, "provider":"@vk_openai", "retry": {"count": 2} }, - { "override_params": {"model": "claude-3-sonnet-20240229"}, "provider":"@vk_anthropic" } - ] - } - ``` - Assume saved Config ID is `cfg_reliable_123`. +## Dynamic Model Selection -2. 
**Use Config ID in `createHeaders`:** +For dynamic model routing based on query complexity or task type, use **Portkey Configs** with conditional routing: ```python from langchain_openai import ChatOpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -import os - -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") -PORTKEY_RELIABLE_CONFIG_ID = "cfg_reliable_123" - -portkey_reliable_headers = createHeaders( - api_key=PORTKEY_API_KEY, - provider="@openai" +from portkey_ai import createHeaders + +# Define routing config (created in Portkey dashboard) +config = { + "strategy": { + "mode": "conditional", + "conditions": [ + { + "query": {"metadata.complexity": {"$eq": "simple"}}, + "then": "cheap-model" + }, + { + "query": {"metadata.complexity": {"$eq": "complex"}}, + "then": "advanced-model" + } + ], + "default": "cheap-model" + }, + "targets": [ + { + "name": "cheap-model", + "override_params": {"model": "@openai-prod/gpt-4o-mini"} + }, + { + "name": "advanced-model", + "override_params": {"model": "@openai-prod/o1"} + } + ] +} + +model = ChatOpenAI( + model="gpt-4o", # Default model + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY", + default_headers=createHeaders(config=config) ) -llm_reliable = ChatOpenAI( - api_key="placeholder_key", - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_reliable_headers +# Route to cheap model +response1 = model.invoke( + "What is 2+2?", + config={"metadata": {"complexity": "simple"}} ) -# response = llm_reliable.invoke("Poem on resilient AI.") +# Route to advanced model +response2 = model.invoke( + "Solve this differential equation...", + config={"metadata": {"complexity": "complex"}} +) ``` -Offload complex logic to Portkey Configs, keeping Langchain code clean. +### Use Cases ---- +**1. Cost Optimization** -## 4. Full Observability +Route by query complexity automatically: -Routing Langchain `ChatOpenAI` via Portkey provides instant, comprehensive observability: +```python +def smart_invoke(prompt, complexity="simple"): + return model.invoke( + prompt, + config={"metadata": {"complexity": complexity}} + ) + +# Automatic routing +answer1 = smart_invoke("What is 2+2?", complexity="simple") +answer2 = smart_invoke("Explain quantum mechanics", complexity="complex") +``` -* **Logged Requests:** Detailed logs of requests, responses, latencies, costs. -* **Tracing:** Understand call lifecycles. -* **Performance Analytics:** Monitor metrics, track usage. -* **Debugging:** Pinpoint errors quickly. +**2. Model Specialization by Task** -This is crucial for monitoring and optimizing production Langchain apps. +Route different task types to specialized models: - - - +```python +config = { + "strategy": { + "mode": "conditional", + "conditions": [ + {"query": {"metadata.task": {"$eq": "code"}}, "then": "coding-model"}, + {"query": {"metadata.task": {"$eq": "creative"}}, "then": "creative-model"} + ], + "default": "coding-model" + }, + "targets": [ + { + "name": "coding-model", + "override_params": {"model": "@openai-prod/gpt-4o"} + }, + { + "name": "creative-model", + "override_params": {"model": "@anthropic-prod/claude-sonnet-4"} + } + ] +} ---- +def route_by_task(prompt, task_type): + return model.invoke( + prompt, + config={"metadata": {"task": task_type}} + ) -## 5. Prompt Management +code = route_by_task("Write a sorting algorithm", task_type="code") +story = route_by_task("Write a sci-fi story", task_type="creative") +``` -Portkey's Prompt Library helps manage prompts effectively: +**3. 
Dynamic Agent Model Selection** -* **Version Control:** Store and track prompt changes. -* **Parameterized Prompts:** Use variables with [mustache templating](/product/prompt-library/prompt-templates#templating-engine). -* **Sandbox:** Test prompts with different LLMs in Portkey. +Use different models for different agent steps: -### Using Portkey Prompts in Langchain +```python +from langchain.agents import create_agent +from langchain.tools import tool + +@tool +def complex_calculation(query: str) -> str: + """Perform complex calculations.""" + return "42" + +model = ChatOpenAI( + model="gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY", + default_headers=createHeaders(config=config) +) -1. Create prompt in Portkey, get `Prompt ID`. -2. Use Portkey SDK to render prompt with variables. -3. Transform rendered prompt to Langchain message format. -4. Pass messages to Portkey-configured `ChatOpenAI`. +agent = create_agent(model, tools=[complex_calculation]) -```python -import os -from langchain_openai import ChatOpenAI -from langchain_core.messages import SystemMessage, HumanMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders, Portkey +# Agent routes based on task complexity +result = agent.invoke({ + "messages": [{"role": "user", "content": "Calculate quantum probabilities"}], + "metadata": {"complexity": "complex"} +}) +``` -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") -client = Portkey(api_key=PORTKEY_API_KEY) +All routing decisions are tracked in Portkey with full observability—see which models were used, costs per model, and performance comparisons. -PROMPT_ID = "pp-story-generator" # Your Portkey Prompt ID + + Learn more about conditional routing and advanced patterns + -rendered_prompt = client.prompts.render( - prompt_id=PROMPT_ID, - variables={"character": "brave knight", "object": "magic sword"} -).data +### When to Use Dynamic Routing -langchain_messages = [] -if rendered_prompt and rendered_prompt.prompt: - for msg in rendered_prompt.prompt: - if msg.get("role") == "user": langchain_messages.append(HumanMessage(content=msg.get("content"))) - elif msg.get("role") == "system": langchain_messages.append(SystemMessage(content=msg.get("content"))) +**Use conditional routing** when you need: +- ✅ Cost optimization based on query complexity +- ✅ Model specialization by task type +- ✅ Automatic failover and fallbacks +- ✅ A/B testing with traffic distribution -portkey_headers = createHeaders(api_key=PORTKEY_API_KEY, provider="@openai") +**Use fixed models** when you need: +- ✅ Simple, predictable behavior +- ✅ Consistent model across all requests +- ✅ Easier debugging -llm_portkey_prompt = ChatOpenAI( - api_key="placeholder_key", - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_headers, - model=rendered_prompt.model if rendered_prompt and rendered_prompt.model else "gpt-4o" -) +## Advanced Features via Configs + +For production features like fallbacks, caching, and load balancing, use Portkey Configs: -# if langchain_messages: response = llm_portkey_prompt.invoke(langchain_messages) +```python +from portkey_ai import createHeaders + +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY", + default_headers=createHeaders( + config="pc_your_config_id" # Created in Portkey dashboard + ) +) ``` -Manage prompts centrally in Portkey for versioning and collaboration. + + Set up fallbacks, retries, caching, load balancing, and more + ---- +## Langchain Embeddings -## 6. 
Secure Virtual Keys +Create embeddings via Portkey: -Portkey's [Virtual Keys](/product/ai-gateway/virtual-keys) are vital for secure, flexible LLM ops with Langchain. +```python +from langchain_openai import OpenAIEmbeddings -**Benefits:** -* **Secure Credentials:** Store provider API keys in Portkey's vault. Code uses Virtual Key IDs. -* **Easy Configuration:** Switch providers/keys by changing `virtual_key` in `createHeaders`. -* **Access Control:** Manage Virtual Key permissions in Portkey. -* **Auditability:** Track usage via Portkey logs. +embeddings = OpenAIEmbeddings( + model="text-embedding-3-small", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY", + default_headers={"x-portkey-provider": "@openai-prod"} +) -Using Virtual Keys boosts security and simplifies config management. +vectors = embeddings.embed_documents(["Hello world", "Goodbye world"]) +``` ---- + +Portkey supports OpenAI embeddings via `OpenAIEmbeddings`. For other providers (Cohere, Voyage), use the **Portkey SDK directly** ([docs](/api-reference/inference-api/embeddings)). + -## Langchain Embeddings +## Prompt Management -Create embeddings with `OpenAIEmbeddings` via Portkey. +Use prompts from Portkey's Prompt Library: ```python -from langchain_openai import OpenAIEmbeddings -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders import os +from langchain_openai import ChatOpenAI +from langchain_core.messages import SystemMessage, HumanMessage +from portkey_ai import Portkey PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") +client = Portkey(api_key=PORTKEY_API_KEY) -portkey_headers = createHeaders(api_key=PORTKEY_API_KEY, provider="@openai") +# Render prompt from Portkey +rendered_prompt = client.prompts.render( + prompt_id="pp-story-generator", + variables={"character": "brave knight", "object": "magic sword"} +).data -embeddings_model = OpenAIEmbeddings( - api_key="placeholder_key", - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_headers, - model="text-embedding-3-small" +# Convert to Langchain messages +langchain_messages = [] +if rendered_prompt and rendered_prompt.prompt: + for msg in rendered_prompt.prompt: + if msg.get("role") == "user": + langchain_messages.append(HumanMessage(content=msg.get("content"))) + elif msg.get("role") == "system": + langchain_messages.append(SystemMessage(content=msg.get("content"))) + +# Use with Langchain model +model = ChatOpenAI( + model="@openai-prod/gpt-4o", + base_url="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) -# embeddings = embeddings_model.embed_documents(["Hello world!", "Test."]) +response = model.invoke(langchain_messages) ``` - -Portkey supports OpenAI embeddings via Langchain's `OpenAIEmbeddings`. For other providers (Cohere, Gemini), use the **Portkey SDK directly** ([docs](/api-reference/inference-api/embeddings)). - - ---- + + Manage, version, and test prompts in Portkey + -## Langchain Chains & Prompts +## Migration from Direct OpenAI -Standard Langchain `Chains` and `PromptTemplates` work seamlessly with Portkey-configured `ChatOpenAI` instances. Portkey features (logging, caching) apply automatically. +Already using Langchain with OpenAI? 
Just update 3 parameters: ```python +# Before from langchain_openai import ChatOpenAI -from langchain_core.prompts import ChatPromptTemplate -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders import os -PORTKEY_API_KEY = os.environ.get("PORTKEY_API_KEY") - -portkey_headers = createHeaders(api_key=PORTKEY_API_KEY, provider="@openai") +model = ChatOpenAI( + model="gpt-4o", + api_key=os.getenv("OPENAI_API_KEY"), + temperature=0.7 +) -chat_llm = ChatOpenAI( - api_key="placeholder_key", - base_url=PORTKEY_GATEWAY_URL, - default_headers=portkey_headers, - model="gpt-4o" +# After (add 2 parameters, change 1) +model = ChatOpenAI( + model="@openai-prod/gpt-4o", # Add provider slug + base_url="https://api.portkey.ai/v1", # Add this + api_key="PORTKEY_API_KEY", # Change to Portkey key + temperature=0.7 # Keep existing params ) +``` -prompt = ChatPromptTemplate.from_messages([ - ("system", "You are a world-class technical writer."), - ("user", "{input}") -]) +**Benefits:** +- Zero code changes to your existing Langchain logic +- Instant observability for all requests +- Production-grade reliability features +- Cost controls and budgets -chain = prompt | chat_llm +## Next Steps -# response = chain.invoke({"input": "Explain API gateways simply."}) -``` + + + Set up providers, budgets, and access control + + + Configure fallbacks, caching, and routing + + + Track costs, performance, and usage + + + Add PII detection and content filtering + + -All chain requests via `chat_llm` are processed by Portkey. +For complete SDK documentation: -This concludes the main features. Redundant examples have been removed for clarity. + + Complete Portkey SDK documentation + diff --git a/integrations/libraries/llama-index-python.mdx b/integrations/libraries/llama-index-python.mdx index a62a9e64..ef407ce7 100644 --- a/integrations/libraries/llama-index-python.mdx +++ b/integrations/libraries/llama-index-python.mdx @@ -1,895 +1,442 @@ --- title: "LlamaIndex (Python)" -description: The **Portkey x LlamaIndex** integration brings advanced **AI gateway** capabilities, full-stack **observability**, and **prompt management** to apps built on LlamaIndex. +description: "Add Portkey's enterprise features to any LlamaIndex app—observability, reliability, caching, and cost control." --- +LlamaIndex provides a framework for building LLM applications with your data. Add Portkey to get production-grade features: full observability, automatic fallbacks, semantic caching, and cost controls—all without changing your LlamaIndex code. +## Quick Start -In a nutshell, Portkey extends the familiar OpenAI schema to make Llamaindex work with **1600+ LLMs** without the need for importing different classes for each provider or having to configure your code separately. Portkey makes your Llamaindex apps _reliable_, _fast_, and _cost-efficient_. - -## Getting Started - -### 1\. Install the Portkey SDK - -```sh -pip install -U portkey-ai -``` - -### 2\. Import the necessary classes and functions - -Import the `OpenAI` class in Llamaindex as you normally would, along with Portkey's helper functions `createHeaders` and `PORTKEY_GATEWAY_URL`. - -```py -from llama_index.llms.openai import OpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -``` - -### 3\. Configure model details - -Configure your model details using Portkey's [**Config object schema**](/api-reference/config-object). This is where you can define the provider and model name, model parameters, set up fallbacks, retries, and more. 
- -```py -config = { - "provider":"openai", - "api_key":"YOUR_OPENAI_API_KEY", - "override_params": { - "model":"gpt-4o", - "max_tokens":64 - } -} -``` - -### 4\. Pass Config details to OpenAI client with necessary headers - -```py -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) -) -``` - -## Example: OpenAI - -Here are basic integrations examples on using the `complete` and `chat` methods with `streaming` on & off. - - - +Add Portkey to any LlamaIndex app with 3 parameters: ```python from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders - -config = { - "provider":"openai", - "api_key":"YOUR_OPENAI_API_KEY", - "override_params": { - "model":"gpt-4o", - "max_tokens":64 - } -} - -#### You can also reference a saved Config #### -#### config = "pc-anthropic-xx" -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) +llm = OpenAI( + model="@openai-prod/gpt-4o", # Provider slug from Model Catalog + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" # Your Portkey API key ) -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] - -resp = portkey.chat(messages) -print(resp) - -##### Streaming Mode ##### - -resp = portkey.stream_chat(messages) - -for r in resp: - print(r.delta, end="") -``` -> assistant: Arrr, matey! They call me Captain Barnacle Bill, the most colorful pirate to ever sail the seven seas! With a parrot on me shoulder and a treasure map in me hand, I'm always ready for adventure! What be yer name, landlubber? - - - - -```python -from llama_index.llms.openai import OpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders - -config = { - "provider":"openai", - "api_key":"YOUR_OPENAI_API_KEY", - "override_params": { - "model":"gpt-4o", - "max_tokens":64 - } -} - -#### You can also reference a saved Config #### -#### config = "pc-anthropic-xx" - -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) -) - -resp=portkey.complete("Paul Graham is ") -print(resp) - -##### Streaming Mode ##### - -resp=portkey.stream_complete("Paul Graham is ") -for r in resp: - print(r.delta, end="") +response = llm.complete("Tell me a joke") +print(response.text) ``` -> a computer scientist, entrepreneur, and venture capitalist. He is best known for co-founding the startup accelerator Y Combinator and for his work on programming languages and web development. Graham is also a prolific writer and has published essays on a wide range of topics, including startups, technology, and education. 
- - -```python -import asyncio -from llama_index.llms.openai import OpenAI -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders - -config = { - "provider":"openai", - "api_key":"YOUR_OPENAI_API_KEY", - "override_params": { - "model":"gpt-4o", - "max_tokens":64 - } -} - -#### You can also reference a saved Config #### -#### config = "pc-anthropic-xx" - -async def main(): - portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="PORTKEY_API_KEY", - config=config - ) - ) - - resp = await portkey.acomplete("Paul Graham is ") - print(resp) - - ##### Streaming Mode ##### - - resp = await portkey.astream_complete("Paul Graham is ") - async for delta in resp: - print(delta.delta, end="") - -asyncio.run(main()) -``` - - + + + -## Enabling Portkey Features +That's it! You now get: +- ✅ Full observability (costs, latency, logs) +- ✅ Dynamic model selection per request +- ✅ Automatic fallbacks and retries (via configs) +- ✅ Budget controls per team/project -By routing your LlamaIndex requests through Portkey, you get access to the following production-grade features: +## Why Add Portkey to LlamaIndex? - - -

Call various LLMs like Anthropic, Gemini, Mistral, Azure OpenAI, Google Vertex AI, and AWS Bedrock with minimal code changes.

-
+LlamaIndex handles data indexing and querying. Portkey adds production features: - -

Speed up your requests and save money on LLM calls by storing past responses in the Portkey cache. Choose between Simple and Semantic cache modes.

+ + + Every request logged with costs, latency, tokens. Team-level analytics and debugging. - - -

Set up fallbacks between different LLMs or providers, load balance your requests across multiple instances or API keys, set automatic retries, and configure request timeouts.

+ + Switch models per request. Route simple queries to cheap models, complex to advanced—automatically tracked. - - -

Portkey automatically logs all the key details about your requests, including cost, tokens used, response time, request and response bodies, and more. Send custom metadata and trace IDs for better analytics and debugging.

+ + Automatic fallbacks, smart retries, load balancing—configured once, works everywhere. - - -

Use Portkey as a centralized hub to store, version, and experiment with prompts across multiple LLMs, and seamlessly retrieve them in your LlamaIndex app for easy integration.

+ + Budget limits per team/project. Rate limiting. Centralized credential management. +
- -

Improve your LlamaIndex app by capturing qualitative & quantitative user feedback on your requests.

-
+## Setup - -

Set budget limits on provider API keys and implement fine-grained user roles and permissions for both the app and the Portkey APIs.

-
-
+### 1. Install Packages +```bash +pip install llama-index-llms-openai portkey-ai +``` -Much of these features are driven by **Portkey's Config architecture**. On the Portkey app, we make it easy to help you _create_, _manage_, and _version_ your Configs so that you can reference them easily in Llamaindex. +### 2. Add Provider in Model Catalog -## Saving Configs in the Portkey App +1. Go to [**Model Catalog → Add Provider**](https://app.portkey.ai/model-catalog/providers) +2. Select your provider (OpenAI, Anthropic, Google, etc.) +3. Choose existing credentials or create new by entering your API keys +4. Name your provider (e.g., `openai-prod`) -Head over to the Configs tab in Portkey app where you can save various provider Configs along with the reliability and caching features. Each Config has an associated slug that you can reference in your Llamaindex code. +Your provider slug will be **`@openai-prod`** (or whatever you named it). - - - -## Overriding a Saved Config + + Set up budgets, rate limits, and manage credentials + -If you want to use a saved Config from the Portkey app in your LlamaIndex code but need to modify certain parts of it before making a request, you can easily achieve this using Portkey's Configs API. This approach allows you to leverage the convenience of saved Configs while still having the flexibility to adapt them to your specific needs. +### 3. Get Portkey API Key -#### Here's an example of how you can fetch a saved Config using the Configs API and override the `model` parameter: +Create your Portkey API key at [app.portkey.ai/api-keys](https://app.portkey.ai/api-keys) -```py Overriding Model in a Saved Config -from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -import requests -import os +### 4. Use in Your Code -def create_config(config_slug,model): - url = f'https://api.portkey.ai/v1/configs/{config_slug}' - headers = { - 'x-portkey-api-key': os.environ.get("PORTKEY_API_KEY"), - 'content-type': 'application/json' - } - response = requests.get(url, headers=headers).json() - config = json.loads(response['config']) - config['override_params']['model']=model - return config +Replace your existing LLM initialization: -config=create_config("pc-llamaindex-xx","gpt-4-turbo") +```python +# Before (direct to OpenAI) +from llama_index.llms.openai import OpenAI -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key=os.environ.get("PORTKEY_API_KEY"), - config=config - ) +llm = OpenAI( + model="gpt-4o", + api_key="OPENAI_API_KEY" ) -messages = [ChatMessage(role="user", content="1729")] - -resp = portkey.chat(messages) -print(resp) +# After (via Portkey) +llm = OpenAI( + model="@openai-prod/gpt-4o", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) ``` -In this example: +**That's the only change needed!** All your existing LlamaIndex code (indexes, query engines, agents) works exactly the same. -1. We define a helper function `get_customized_config` that takes a `config_slug` and a `model` as parameters. -2. Inside the function, we make a GET request to the Portkey Configs API endpoint to fetch the saved Config using the provided `config_slug`. -3. We extract the `config` object from the API response. -4. We update the `model` parameter in the `override_params` section of the Config with the provided `custom_model`. -5. Finally, we return the customized Config. 
+## Switching Between Providers -We can then use this customized Config when initializing the OpenAI client from LlamaIndex, ensuring that our specific `model` override is applied to the saved Config. - -For more details on working with Configs in Portkey, refer to the [**Config documentation**.](/product/ai-gateway/configs) - ---- - -## 1\. Interoperability - Calling Anthropic, Gemini, Mistral, and more - -Now that we have the OpenAI code up and running, let's see how you can use Portkey to send the request across multiple LLMs - we'll show **Anthropic**, **Gemini**, and **Mistral**. For the full list of providers & LLMs supported, check out [**this doc**](/guides/integrations). - -Switching providers just requires **changing 3 lines of code:** - -1. Change the `provider name` -2. Change the `API key`, and -3. Change the `model name` - - - - +Just change the model string—everything else stays the same: ```python from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -config = { - "provider":"anthropic", - "api_key":"YOUR_ANTHROPIC_API_KEY", - "override_params": { - "model":"claude-3-opus-20240229", - "max_tokens":64 - } -} +# OpenAI +llm = OpenAI( + model="@openai-prod/gpt-4o", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) -#### You can also reference a saved Config #### -#### config = "pc-anthropic-xx" +# Anthropic +llm = OpenAI( + model="@anthropic-prod/claude-sonnet-4", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" +) -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) +# Google Gemini +llm = OpenAI( + model="@google-prod/gemini-2.0-flash", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) +``` -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] + +Portkey implements OpenAI-compatible APIs for all providers, so you always use `llama_index.llms.openai.OpenAI` regardless of which model you're calling. 
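
Because only the model string changes, you can wrap the setup in a small helper and switch providers with a single argument. A minimal sketch, assuming the `@openai-prod` and `@anthropic-prod` slugs from your own Model Catalog; the model names are illustrative.

```python
import os
from llama_index.llms.openai import OpenAI

def portkey_llm(model: str, **kwargs) -> OpenAI:
    """Build a Portkey-backed LlamaIndex LLM for any provider slug."""
    return OpenAI(
        model=model,  # e.g. "@openai-prod/gpt-4o"
        api_base="https://api.portkey.ai/v1",
        api_key=os.environ["PORTKEY_API_KEY"],
        **kwargs,
    )

# Route simple queries to a cheaper model, harder ones to a stronger model
fast = portkey_llm("@openai-prod/gpt-4o-mini", temperature=0)
smart = portkey_llm("@anthropic-prod/claude-sonnet-4")

print(fast.complete("Summarize RAG in one sentence").text)
print(smart.complete("Summarize RAG in one sentence").text)
```
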
+ -resp = portkey.chat(messages) -print(resp) -``` +## Using with LlamaIndex Chat - - +LlamaIndex's chat interface works seamlessly: ```python from llama_index.llms.openai import OpenAI from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders - -config = { - "provider":"google", - "api_key":"YOUR_GOOGLE_GEMINI_API_KEY", - "override_params": { - "model":"gemini-1.5-flash-latest", - "max_tokens":64 - } -} - -#### You can also reference a saved Config instead #### -#### config = "pc-gemini-xx" -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) +llm = OpenAI( + model="@openai-prod/gpt-4o", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), + ChatMessage(role="system", content="You are a helpful assistant"), + ChatMessage(role="user", content="What is the capital of France?") ] -resp = portkey.chat(messages) -print(resp) +response = llm.chat(messages) +print(response.message.content) ``` - - -```python -from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders +## Works With All LlamaIndex Features -config = { - "provider":"mistral-ai", - "api_key":"YOUR_MISTRAL_AI_API_KEY", - "override_params": { - "model":"codestral-latest", - "max_tokens":64 - } -} +✅ **Query Engines** - All query types supported +✅ **Chat Engines** - Conversational interfaces +✅ **Agents** - Full agent compatibility +✅ **Streaming** - Token-by-token streaming +✅ **RAG Pipelines** - Retrieval-augmented generation +✅ **Workflows** - Complex LLM workflows -#### You can also reference a saved Config instead #### -#### config = "pc-mistral-xx" +### Streaming -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) +```python +from llama_index.llms.openai import OpenAI + +llm = OpenAI( + model="@openai-prod/gpt-4o", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] +# Stream completions +for chunk in llm.stream_complete("Write a short story"): + print(chunk.delta, end="", flush=True) -resp = portkey.chat(messages) -print(resp) +# Stream chat +messages = [ChatMessage(role="user", content="Tell me a joke")] +for chunk in llm.stream_chat(messages): + print(chunk.delta, end="", flush=True) ``` - - - -### Calling Azure, Google Vertex, AWS Bedrock - -We recommend saving your cloud details to [**Portkey vault**](/product/ai-gateway/virtual-keys) and getting a corresponding Virtual Key. 
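
Once those cloud credentials are saved in Portkey, calling the deployment from LlamaIndex looks the same as any other provider. A minimal sketch, assuming hypothetical `azure-prod` and `bedrock-prod` providers created in the Model Catalog; the model portion of each slug depends on what you enable for that provider.

```python
from llama_index.llms.openai import OpenAI

# Azure OpenAI deployment, addressed through its Portkey provider slug
azure_llm = OpenAI(
    model="@azure-prod/gpt-4o",
    api_base="https://api.portkey.ai/v1",
    api_key="PORTKEY_API_KEY",
)

# AWS Bedrock model, same pattern (model slug depends on your catalog setup)
bedrock_llm = OpenAI(
    model="@bedrock-prod/claude-sonnet-4",
    api_base="https://api.portkey.ai/v1",
    api_key="PORTKEY_API_KEY",
)

print(azure_llm.complete("Hello from Azure!").text)
```
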
- -[**Explore the Virtual Key documentation here**](/product/ai-gateway/virtual-keys)**.** - - - - - - +### Async Support ```python +import asyncio from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -config = { - "provider:"@AZURE_OPENAI_PORTKEY_PROVIDER" -} - -#### You can also reference a saved Config instead #### -#### config = "pc-azure-xx" - -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config +async def main(): + llm = OpenAI( + model="@openai-prod/gpt-4o", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) -) + + # Async completion + response = await llm.acomplete("What is 2+2?") + print(response.text) -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] + # Async streaming + async for chunk in await llm.astream_complete("Write a haiku"): + print(chunk.delta, end="", flush=True) -resp = portkey.chat(messages) -print(resp) +asyncio.run(main()) ``` - - +### RAG with Query Engine ```python +from llama_index.core import VectorStoreIndex, SimpleDirectoryReader from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders - -config = { - "provider:"@AWS_BEDROCK_PORTKEY_PROVIDER" -} - -#### You can also reference a saved Config instead #### -#### config = "pc-bedrock-xx" -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) +# Set up LLM with Portkey +llm = OpenAI( + model="@openai-prod/gpt-4o", + api_base="https://api.portkey.ai/v1", + api_key="PORTKEY_API_KEY" ) -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] +# Load and index documents +documents = SimpleDirectoryReader("data").load_data() +index = VectorStoreIndex.from_documents(documents) -resp = portkey.chat(messages) -print(resp) +# Query with Portkey-enabled LLM +query_engine = index.as_query_engine(llm=llm) +response = query_engine.query("What is the main topic?") +print(response) ``` - - - - Vertex AI uses OAuth2 to authenticate its requests, so you need to send the **access token** additionally along with the request - you can do this while by sending it as the `api_key` in the OpenAI client. Run `gcloud auth print-access-token` in your terminal to get your Vertex AI access token. 
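
A minimal sketch of that flow, assuming a Vertex AI provider saved in Portkey; the `@VERTEX_AI_PORTKEY_PROVIDER` slug and the Gemini model below are placeholders.

```python
from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

config = {
    "provider": "@VERTEX_AI_PORTKEY_PROVIDER",        # placeholder provider slug
    "override_params": {"model": "gemini-2.0-flash"}  # example model
}

llm = OpenAI(
    api_key="YOUR_VERTEX_AI_ACCESS_TOKEN",  # output of `gcloud auth print-access-token`
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        config=config,
    ),
)

messages = [
    ChatMessage(role="system", content="You are a helpful assistant"),
    ChatMessage(role="user", content="What is your name?"),
]
print(llm.chat(messages))
```
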
- +## Advanced Features via Configs + +For production features like fallbacks, caching, and load balancing, use Portkey Configs: ```python from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -config = { - "provider:"@VERTEX_AI_PORTKEY_PROVIDER" -} - -#### You can also reference a saved Config instead #### -#### config = "pc-vertex-xx" - -portkey = OpenAI( - api_key="YOUR_VERTEX_AI_ACCESS_TOKEN", # Get by running gcloud auth print-access-token in terminal +llm = OpenAI( + model="gpt-4o", # Default model api_base=PORTKEY_GATEWAY_URL, + api_key="PORTKEY_API_KEY", default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config + config="pc_your_config_id" # Created in Portkey dashboard ) ) - -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] - -resp = portkey.chat(messages) -print(resp) ``` - - - -### Calling Local or Privately Hosted Models like Ollama -Check out [**Portkey docs for Ollama**](/integrations/llms/ollama) and [**other privately hosted models**](/integrations/llms/byollm). +### Example: Fallbacks -```py Ollama +```python from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders config = { - "provider":"ollama", - "custom_host":"https://7cc4-3-235-157-146.ngrok-free.app", # Your Ollama ngrok URL - "override_params": { - "model":"llama3" - } + "strategy": {"mode": "fallback"}, + "targets": [ + {"override_params": {"model": "@openai-prod/gpt-4o"}}, + {"override_params": {"model": "@anthropic-prod/claude-sonnet-4"}} + ] } -#### You can also reference a saved Config instead #### -#### config = "pc-azure-xx" - -portkey = OpenAI( +llm = OpenAI( + model="gpt-4o", api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config - ) + api_key="PORTKEY_API_KEY", + default_headers=createHeaders(config=config) ) -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] - -resp = portkey.chat(messages) -print(resp) +# Automatically falls back to Anthropic if OpenAI fails +response = llm.complete("Hello!") ``` -[**Explore full list of the providers supported on Portkey here**](/guides/integrations). - ---- - -## 2\. Caching - -You can speed up your requests and save money on your LLM requests by storing past responses in the Portkey cache. There are 2 cache modes: - -* **Simple:** Matches requests verbatim. Perfect for repeated, identical prompts. Works on **all models** including image generation models. -* **Semantic:** Matches responses for requests that are semantically similar. Ideal for denoising requests with extra prepositions, pronouns, etc. - -To enable Portkey cache, just add the `cache` params to your [config object](https://portkey.ai/docs/api-reference/config-object#cache-object-details). 
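
For example, a simple-mode cache attached to a LlamaIndex client might look like the sketch below; the provider slug is the one from the setup above, and the `max_age` value is illustrative, so check the cache docs for supported ranges.

```python
from llama_index.llms.openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

config = {
    "cache": {
        "mode": "simple",  # exact-match caching; use "semantic" for similar prompts
        "max_age": 3600    # example TTL; see the cache docs for limits
    },
    "override_params": {"model": "@openai-prod/gpt-4o"}
}

llm = OpenAI(
    model="gpt-4o",
    api_base=PORTKEY_GATEWAY_URL,
    api_key="PORTKEY_API_KEY",
    default_headers=createHeaders(config=config),
)

# Identical prompts are now served from the cache
print(llm.complete("What is semantic caching?").text)
print(llm.complete("What is semantic caching?").text)
```
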
- - - +### Example: Load Balancing ```python config = { - "provider":"mistral-ai", - "api_key":"YOUR_MISTRAL_AI_API_KEY", - "override_params": { - "model":"codestral-latest", - "max_tokens":64 - }, - "cache": { - "mode": "simple", - "max_age": 60000 - } + "strategy": {"mode": "loadbalance"}, + "targets": [ + {"override_params": {"model": "@openai-prod/gpt-4o"}, "weight": 0.5}, + {"override_params": {"model": "@anthropic-prod/claude-sonnet-4"}, "weight": 0.5} + ] } -``` - - +llm = OpenAI( + model="gpt-4o", + api_base=PORTKEY_GATEWAY_URL, + api_key="PORTKEY_API_KEY", + default_headers=createHeaders(config=config) +) -```python -config = { - "provider":"mistral-ai", - "api_key":"YOUR_MISTRAL_AI_API_KEY", - "override_params": { - "model":"codestral-latest", - "max_tokens":64 - }, - "cache": { - "mode": "semantic", - "max_age": 60000 - } -} +# Requests distributed 50/50 between OpenAI and Anthropic +response = llm.complete("Hello!") ``` - - -[**For more cache settings, check out the documentation here**](/product/ai-gateway/cache-simple-and-semantic)**.** ---- - -## 3\. Reliability - -Set up fallbacks between different LLMs or providers, load balance your requests across multiple instances or API keys, set automatic retries, or set request timeouts - all set through **Configs**. - - - - +### Example: Caching ```python config = { - "strategy": { - "mode": "fallback" - }, - "targets": [ - { - "provider":"@openai-virtual-key", - "override_params": { - "model": "gpt-4o" - } - }, - { - "provider":"@anthropic-virtual-key", - "override_params": { - "model": "claude-3-opus-20240229", - "max_tokens":64 - } - } - ] -} -``` - - - -```python -config = { - "strategy": { - "mode": "loadbalance" - }, - "targets": [ - { - "provider":"@openai-virtual-key-1", - "weight":1 + "cache": { + "mode": "semantic", # or "simple" for exact matches + "max_age": 3600 # Cache for 1 hour }, - { - "provider":"@openai-virtual-key-2", - "weight":1 - } - ] + "override_params": {"model": "@openai-prod/gpt-4o"} } -``` - - -```python -config = { - "retry": { - "attempts": 5 - }, - "provider":"@virtual-key-xxx" -} -``` - - +llm = OpenAI( + model="gpt-4o", + api_base=PORTKEY_GATEWAY_URL, + api_key="PORTKEY_API_KEY", + default_headers=createHeaders(config=config) +) -```python -config = { - "strategy": { "mode": "fallback" }, - "request_timeout": 10000, - "targets": [ - { "provider":"@open-ai-xxx" }, - { "provider":"@azure-open-ai-xxx" } - ] -} +# Responses cached for similar queries +response = llm.complete("What is machine learning?") ``` - - - - -Explore deeper documentation for each feature here - [**Fallbacks**](/product/ai-gateway/fallbacks), [**Loadbalancing**](/product/ai-gateway/load-balancing), [**Retries**](/product/ai-gateway/automatic-retries), [**Timeouts**](/product/ai-gateway/request-timeouts). - -## 4\. Observability + + Set up fallbacks, retries, caching, load balancing, and more + -Portkey automatically logs all the key details about your requests, including cost, tokens used, response time, request and response bodies, and more. +## Observability -Using Portkey, you can also send custom metadata with each of your requests to further segment your logs for better analytics. Similarly, you can also trace multiple requests to a single trace ID and filter or view them separately in Portkey logs. - -**Custom Metadata and Trace ID information is sent in** `default_headers` **.** - - - +Portkey automatically logs all requests. 
Add custom metadata for better analytics: ```python from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders -config = "pc-xxxx" - -portkey = OpenAI( +llm = OpenAI( + model="@openai-prod/gpt-4o", api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set + api_key="PORTKEY_API_KEY", default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config, metadata={ - "_user": "USER_ID", + "_user": "user_123", "environment": "production", - "session_id": "1729" - } + "feature": "rag_query" + }, + trace_id="unique_trace_id" ) ) - -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] - -resp = portkey.chat(messages) -print(resp) ``` - - - -```python -from llama_index.llms.openai import OpenAI -from llama_index.core.llms import ChatMessage -from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders - -config = "pc-xxxx" - -portkey = OpenAI( - api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key="YOUR_PORTKEY_API_KEY", - config=config, - trace_id="YOUR_TRACE_ID_HERE" - ) -) - -messages = [ - ChatMessage(role="system", content="You are a pirate with a colorful personality"), - ChatMessage(role="user", content="What is your name"), -] - -resp = portkey.chat(messages) -print(resp) -``` - - - -#### Portkey shows these details separately for each log: - - - - - -[**Check out Observability docs here.**](/product/observability) - +Filter and analyze logs by metadata in the Portkey dashboard. -## 5\. Prompt Management + + Track costs, performance, and debug issues + -Portkey features an advanced Prompts platform tailor-made for better prompt engineering. With Portkey, you can: +## Prompt Management -* **Store Prompts with Access Control and Version Control:** Keep all your prompts organized in a centralized location, easily track changes over time, and manage edit/view permissions for your team. -* **Parameterize Prompts**: Define variables and [mustache-approved tags](/product/prompt-library/prompt-templates#templating-engine) within your prompts, allowing for dynamic value insertion when calling LLMs. This enables greater flexibility and reusability of your prompts. -* **Experiment in a Sandbox Environment**: Quickly iterate on different LLMs and parameters to find the optimal combination for your use case, without modifying your LlamaIndex code. +Use prompts from Portkey's Prompt Library: -#### Here's how you can leverage Portkey's Prompt Management in your LlamaIndex application: - -1. Create your prompt template on the Portkey app, and save it to get an associated `Prompt ID` -2. Before making a Llamaindex request, render the prompt template using the Portkey SDK -3. Transform the retrieved prompt to be compatible with LlamaIndex and send the request! 
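
If you also want the template's saved model settings to drive the request (step 3 above), one option is to pass them through a config's `override_params`. A minimal sketch, assuming a rendered template that exposes `model` and `temperature`; the prompt ID and provider slug are placeholders.

```python
import os
from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders, Portkey

client = Portkey(api_key=os.environ["PORTKEY_API_KEY"])

# 1. Render the saved prompt template (placeholder prompt ID)
prompt_template = client.prompts.render(
    prompt_id="pp-your-prompt-id",
    variables={"topic": "AI"},
).data.dict()

# 2. Reuse the template's model settings for the actual request
config = {
    "provider": "@YOUR_PROVIDER_SLUG",  # placeholder provider slug
    "override_params": {
        "model": prompt_template["model"],
        "temperature": prompt_template["temperature"],
    },
}

llm = OpenAI(
    api_base=PORTKEY_GATEWAY_URL,
    api_key="xx",  # placeholder; the Portkey API key is sent via headers
    default_headers=createHeaders(
        api_key=os.environ["PORTKEY_API_KEY"],
        config=config,
    ),
)

# 3. Convert the rendered messages into LlamaIndex ChatMessages and send
messages = [
    ChatMessage(role=m["role"], content=m["content"])
    for m in prompt_template["messages"]
]
print(llm.chat(messages).message.content)
```
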
- -#### Example: Using a Portkey Prompt Template in LlamaIndex - -```py Portkey Prompts in LlamaIndex -import json -import os +```python from llama_index.llms.openai import OpenAI from llama_index.core.llms import ChatMessage from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders, Portkey -### Initialize Portkey client with API key - -client = Portkey(api_key=os.environ.get("PORTKEY_API_KEY")) - -### Render the prompt template with your prompt ID and variables - +# Render prompt from Portkey +client = Portkey(api_key="PORTKEY_API_KEY") prompt_template = client.prompts.render( - prompt_id="pp-prompt-id", - variables={ "movie":"Dune 2" } + prompt_id="pp-your-prompt-id", + variables={"topic": "AI"} ).data.dict() -config = { - "provider:"@GROQ_PROVIDER", # You need to send the virtual key separately - "override_params":{ - "model":prompt_template["model"], # Set the model name based on the value in the prompt template - "temperature":prompt_template["temperature"] # Similarly, you can also set other model params - } -} - -portkey = OpenAI( +# Use with LlamaIndex +llm = OpenAI( + model="@openai-prod/gpt-4o", api_base=PORTKEY_GATEWAY_URL, - api_key="xx" # Placeholder, no need to set - default_headers=createHeaders( - api_key=os.environ.get("PORTKEY_API_KEY"), - config=config - ) + api_key="PORTKEY_API_KEY" ) -### Transform the rendered prompt into LlamaIndex-compatible format - -messages = [ChatMessage(content=msg["content"], role=msg["role"]) for msg in prompt_template["messages"]] - -resp = portkey.chat(messages) +messages = [ + ChatMessage(content=msg["content"], role=msg["role"]) + for msg in prompt_template["messages"] +] -print(resp) +response = llm.chat(messages) +print(response.message.content) ``` -[**Explore Prompt Management docs here**](/product/prompt-library). - ---- + + Manage, version, and test prompts in Portkey + -## 6\. Continuous Improvement +## Migration from Direct OpenAI -Now that you know how to trace & log your Llamaindex requests to Portkey, you can also start capturing user feedback to improve your app! +Already using LlamaIndex with OpenAI? Just update 3 parameters: -You can append qualitative as well as quantitative feedback to any `trace ID` with the `portkey.feedback.create` method: - -```py Adding Feedback -from portkey_ai import Portkey +```python +# Before +from llama_index.llms.openai import OpenAI +import os -portkey = Portkey( - api_key="PORTKEY_API_KEY" +llm = OpenAI( + model="gpt-4o", + api_key=os.getenv("OPENAI_API_KEY"), + temperature=0.7 ) -feedback = portkey.feedback.create( - trace_id="YOUR_LLAMAINDEX_TRACE_ID", - value=5, # Integer between -10 and 10 - weight=1, # Optional - metadata={ - # Pass any additional context here like comments, _user and more - } +# After (add 2 parameters, change 1) +llm = OpenAI( + model="@openai-prod/gpt-4o", # Add provider slug + api_base="https://api.portkey.ai/v1", # Add this + api_key="PORTKEY_API_KEY", # Change to Portkey key + temperature=0.7 # Keep existing params ) - -print(feedback) ``` -[**Check out the Feedback documentation for a deeper dive**](/product/observability/feedback). - - -## 7\. Security & Compliance +**Benefits:** +- Zero code changes to your existing LlamaIndex logic +- Instant observability for all requests +- Production-grade reliability features +- Cost controls and budgets -When you onboard more team members to help out on your Llamaindex app - permissioning, budgeting, and access management can become a mess! 
Using Portkey, you can set **budget limits** on provide API keys and implement **fine-grained user roles** and **permissions** to: +## Next Steps -* **Control access**: Restrict team members' access to specific features, Configs, or API endpoints based on their roles and responsibilities. -* **Manage costs**: Set budget limits on API keys to prevent unexpected expenses and ensure that your LLM usage stays within your allocated budget. -* **Ensure compliance**: Implement strict security policies and audit trails to maintain compliance with industry regulations and protect sensitive data. -* **Simplify onboarding**: Streamline the onboarding process for new team members by assigning them appropriate roles and permissions, eliminating the need to share sensitive API keys or secrets. -* **Monitor usage**: Gain visibility into your team's LLM usage, track costs, and identify potential security risks or anomalies through comprehensive monitoring and reporting. - - - - - -[**Read more about Portkey's Security & Enterprise offerings here**](/product/enterprise-offering). - - -## Join Portkey Community - -Join the Portkey Discord to connect with other practitioners, discuss your LlamaIndex projects, and get help troubleshooting your queries. + + + Set up providers, budgets, and access control + + + Configure fallbacks, caching, and routing + + + Track costs, performance, and usage + + + Add PII detection and content filtering + + -[**Link to Discord**](https://portkey.ai/community) +For complete SDK documentation: -For more detailed information on each feature and how to use them, please refer to the [Portkey Documentation](https://portkey.ai/docs). + + Complete Portkey SDK documentation +