---
title: "April"
---

**April in review ◀️✨**

We kicked off April with an announcement: we make 95% of LLM costs vanish overnight. It was just bait, and some of you bit (≧ᗜ≦)

While we can’t make bills disappear with a snap, we’ve delivered some powerful upgrades this month that will help you build and ship robust, reliable GenAI apps, faster!

This month, we introduced updates to the platform and gateway around governance, security and guardrails, new integrations, and all the latest models! Along with this, we’re working on something bigger: [the missing piece](https://x.com/PortkeyAI/status/1912491547653701891) in the AI agents stack!

Here’s what we shipped last month:

## Summary

| Area | Key Updates |
| :-- | :-- |
| Platform | • Prompt CRUD APIs<br/>• Export logs to your internal stack<br/>• Budget limits and rate limits on workspaces<br/>• n8n integration<br/>• OpenAI Codex CLI integration<br/>• New retry setting to determine wait times<br/>• Milvus for semantic cache<br/>• Plugins moved to org-level Settings<br/>• Virtual Key exhaustion alerts now include the workspace<br/>• Workspace control setup option |
| Gateway & Providers | • OpenAI embeddings latency improvement (~200 ms)<br/>• Responses API for OpenAI & Azure OpenAI<br/>• Bedrock prompt caching via unified API<br/>• Virtual keys for self-hosted models<br/>• Tool calling support for Groq, OpenRouter, and Ollama<br/>• New providers: Dashscope, Recraft AI, Replicate, Azure AI Foundry<br/>• Enhanced parameter support: OpenRouter, Vertex AI, Perplexity, Bedrock<br/>• Claude’s `anthropic_beta` parameter for the Computer Use beta |
| Technical Improvements | • Unified caching/logging of thinking responses<br/>• Strict metadata logging: Workspace > API Key > Request<br/>• Prompt render endpoint available on the Gateway URL<br/>• API key default config now locked from overrides |
| New Models & Integrations | • GPT-4.1<br/>• Gemini 2.5 Pro and Flash<br/>• Llama 4 via Fireworks, Together, Groq<br/>• o1-pro<br/>• gpt-image-1<br/>• Qwen 3<br/>• Audio models via Groq |
| Guardrails | • Azure AI Content Safety integration<br/>• Exa Online Search as a guardrail |

---

## Platform

**Prompt CRUD APIs**

Prompt CRUD APIs give you the control to scale by enabling you to:

- Programmatically create, update, and delete prompts
- Manage prompts in bulk or version-control them
- Integrate prompt updates into your own tools and workflows
- Automate updates for A/B testing and rapid experimentation

Read more about this [here](https://portkey.ai/docs/api-reference/admin-api/control-plane/prompts/create-prompt).
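
Here’s a minimal sketch of creating a prompt programmatically over plain HTTP. The header name follows Portkey’s convention, while the payload fields shown (`name`, `collection_id`, `string`, `parameters`) are illustrative, so verify them against the API reference linked above:

```python
import requests

PORTKEY_API_KEY = "YOUR_PORTKEY_API_KEY"  # needs admin scope for prompt CRUD

# Create a prompt via the admin API (payload fields are illustrative;
# check the linked API reference for the exact schema)
resp = requests.post(
    "https://api.portkey.ai/v1/prompts",
    headers={"x-portkey-api-key": PORTKEY_API_KEY},
    json={
        "name": "ticket-triage",                # hypothetical prompt name
        "collection_id": "YOUR_COLLECTION_ID",  # collection the prompt lives in
        "string": "Categorize this ticket: {{ticket_text}}",
        "parameters": {"max_tokens": 256},
    },
)
resp.raise_for_status()
print(resp.json())
```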

**Export logs to your internal stack**

Enterprises can now push analytics logs to any OTEL-compliant store through Portkey to centralize monitoring, maintain compliance, and ensure efficient operations.
See how it's done [here](https://portkey.ai/docs/product/enterprise-offering/components#analytics-store).

**Budget limits and rate limits on workspaces**

Configure budget and rate limits at the workspace level to:

- Allocate specific budgets to different departments, teams, or projects
- Prevent individual workspaces from consuming disproportionate resources
- Ensure equitable API access and complete visibility

**n8n integration**

Add enterprise-grade controls to your n8n workflows with:

- **Unified AI Gateway**: Connect to 1600+ models with full API key management, not just OpenAI or Anthropic.
- **Centralized observability**: Track 40+ metrics and request logs in real time.
- **Governance**: Monitor spend, set budgets, and apply RBAC across workflows.
- **Security guardrails**: Enable PII detection, content filtering, and compliance controls.

Read more about the integration [here](https://portkey.ai/docs/integrations/libraries/n8n).

**OpenAI Codex CLI integration**

OpenAI Codex CLI gives developers a streamlined way to analyze, modify, and execute code directly from their terminal. Portkey's integration enhances this experience with:

- Access to 250+ additional models beyond OpenAI Codex CLI's standard offerings
- Content filtering and PII detection with guardrails
- Real-time analytics and logging
- Cost attribution, budget controls, RBAC, and more!

Read more about the integration [here](https://portkey.ai/docs/integrations/libraries/codex#openai-codex-cli).

**Other updates**

- Introduced a new retry setting, `use_retry_after_header`. When set to `true`, if the provider returns `retry-after` or `retry-after-ms` headers, the Gateway uses them to determine retry wait times instead of applying the default exponential backoff for 429 responses (see the sketch after this list).
- You can now store and retrieve vector embeddings for semantic cache using Milvus in Portkey. Read more about the semantic cache store [here](https://portkey.ai/docs/product/enterprise-offering/components#semantic-cache-store).
- Plugins have moved under Settings (org-level) in the Portkey app.
- Virtual Key exhaustion alert emails now include the workspace the exhausted key belongs to.
- Set up your workspace with Workspace Control in the Portkey app.
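
A minimal sketch of enabling the new retry behavior through a gateway config; the `retry` config shape follows Portkey's retry settings, and the virtual key slug is hypothetical:

```python
from portkey_ai import Portkey

# Honor the provider's retry-after / retry-after-ms headers on 429s
# instead of the default exponential backoff
client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # hypothetical virtual key slug
    config={
        "retry": {
            "attempts": 3,
            "use_retry_after_header": True,  # the new setting
        }
    },
)

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(reply.choices[0].message.content)
```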

## Gateway & Providers

**OpenAI embeddings latency**

We’ve optimized the Gateway’s handling of OpenAI embeddings requests, leading to a roughly 200 ms improvement in response latency.

**Responses API**

You can now use the Responses API to access OpenAI and Azure OpenAI models on Portkey, giving you a more flexible, easier way to create agentic experiences.

- Complete observability and usage tracking
- Caching support for streaming requests
- Access to advanced tools (web search, file search, and code execution) with per-tool cost tracking
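
A minimal sketch, assuming the Portkey SDK mirrors OpenAI's `responses.create` surface (the virtual key slug is hypothetical):

```python
from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # hypothetical virtual key slug
)

# The Responses API call mirrors OpenAI's shape through the gateway
response = client.responses.create(
    model="gpt-4.1",
    input="Give me two sentences on why gateways simplify agent stacks.",
)
print(response)
```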

**Bedrock prompt caching**

You can now implement Amazon Bedrock’s prompt caching through our OpenAI-compliant unified API and prompt templates.

- Cache specific portions of your requests for repeated use
- Reduce inference response latency and input token costs

Read more about the implementation [here](https://portkey.ai/docs/integrations/llms/bedrock/prompt-caching).
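
A minimal sketch of marking a reusable system prompt as cacheable. The `cache_control` content-block shape follows Portkey's unified API for Anthropic-style caching; the model ID and virtual key slug are illustrative, so verify the exact shape against the Bedrock docs linked above:

```python
from portkey_ai import Portkey

LONG_POLICY_DOCUMENT = "..."  # a large, stable prompt reused across requests

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="bedrock-virtual-key",  # hypothetical virtual key slug
)

completion = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": LONG_POLICY_DOCUMENT,
                    # cache this block so repeat requests skip re-processing it
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": "Does clause 4 allow refunds?"},
    ],
)
print(completion.choices[0].message.content)
```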

**Virtual keys for self-hosted models**

You can now create a virtual key for any self-hosted model, whether you're running Ollama, vLLM, or a custom/private model.

- No extra setup required
- Stay in control with logs, traces, and key metrics
- Manage all your LLM interactions through one interface
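
Once the virtual key is created in the Portkey app, requests look the same as for any hosted provider. A minimal sketch (the slug and model name are hypothetical):

```python
from portkey_ai import Portkey

# A virtual key pointing at a self-hosted Ollama endpoint behaves exactly
# like one for a hosted provider
client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="ollama-self-hosted",  # hypothetical virtual key slug
)

reply = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Ping?"}],
)
print(reply.choices[0].message.content)
```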

**Advanced capabilities**

- **OpenRouter**: Added mapping for new parameters: `modalities`, `reasoning`, `transforms`, `provider`, `models`, and `response_format`.
- **Vertex AI**: Added support for explicitly setting `mime_type` for URLs sent in the request. Gemini 2.5 thinking parameters are now available.
- **Perplexity**: Added support for the `response_format` and `search_recency_filter` request parameters.
- **Bedrock**: You can now pass the `anthropic_beta` parameter in Bedrock’s Anthropic API via Portkey to enable Claude's Computer Use beta (see the sketch below).
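
A minimal sketch of passing `anthropic_beta` through the gateway over plain HTTP; the header names follow Portkey's conventions, the beta tag is Anthropic's published Computer Use tag, and the model ID and virtual key slug are illustrative:

```python
import requests

resp = requests.post(
    "https://api.portkey.ai/v1/chat/completions",
    headers={
        "x-portkey-api-key": "YOUR_PORTKEY_API_KEY",
        "x-portkey-virtual-key": "bedrock-virtual-key",  # hypothetical slug
        "Content-Type": "application/json",
    },
    json={
        "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
        # forwarded to Bedrock's Anthropic API to enable the Computer Use beta
        "anthropic_beta": ["computer-use-2024-10-22"],
        "messages": [{"role": "user", "content": "Open the settings page."}],
    },
)
print(resp.json())
```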

**Tool calling**

Portkey now supports tool calling for Groq, OpenRouter, and Ollama.
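
Tool definitions use the standard OpenAI schema regardless of provider. A minimal sketch routed to Groq (the virtual key slug and tool are hypothetical):

```python
from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="groq-virtual-key",  # hypothetical virtual key slug
)

# Standard OpenAI-style tool schema, now supported for Groq, OpenRouter, and Ollama
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

reply = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(reply.choices[0].message.tool_calls)
```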

**New Providers**

<CardGroup cols={2}>
<Card title="Dashscope">
Integrate with Dashscope
</Card>
<Card title="Recraft AI">
Generate production-ready visuals with Recraft
</Card>
</CardGroup>
<CardGroup cols={2}>
<Card title="Replicate">
Run open-source models via simple APIs with Replicate
</Card>
<Card title="Azure AI Foundry">
Access over 1,800 models with Azure AI Foundry
</Card>
</CardGroup>

**Technical Improvements**

- **Caching and logging of unified thinking responses**: Unified thinking responses (`content_blocks`) are now logged and cached for streaming responses.
- **Strict metadata enforcement**: The metadata logging preference order is now `Workspace Default > API Key Default > Incoming Request`. This gives org admins better control and ensures the values they set are not overridden.
- **Prompt render endpoint**: Previously only available via the control plane, the prompt render endpoint is now supported directly on the Gateway URL (see the sketch after this list).
- The default config attached to an API key can no longer be overridden.
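
A minimal sketch of calling the render endpoint over plain HTTP; self-hosted gateways can substitute their own host, and the prompt ID and variable names are hypothetical:

```python
import requests

# Render a saved prompt template directly on the Gateway URL
resp = requests.post(
    "https://api.portkey.ai/v1/prompts/YOUR_PROMPT_ID/render",
    headers={"x-portkey-api-key": "YOUR_PORTKEY_API_KEY"},
    json={"variables": {"ticket_text": "My invoice total looks wrong."}},
)
print(resp.json())
```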

## New Models & Integrations

<CardGroup cols={2}>
<Card title="GPT-4.1">
OpenAI’s new model for faster and improved responses
</Card>
<Card title="Gemini 2.5 Pro">
Google's most advanced model
</Card>
<Card title="Gemini 2.5 Flash">
Google's fast, cost-efficient thinking model
</Card>
<Card title="Llama 4">
Meta's latest model via Fireworks, Together, and Groq
</Card>
<Card title="o1-pro">
OpenAI's model for better reasoning and consistent answers
</Card>
<Card title="gpt-image-1">
OpenAI's latest image generation capabilities
</Card>
<Card title="Qwen 3">
Alibaba's latest model with hybrid reasoning
</Card>
<Card title="Audio models">
Access audio models via Groq
</Card>
</CardGroup>

## Guardrails

- **Azure AI Content Safety**: Use Microsoft’s content filtering solution to moderate inputs and outputs across supported models.

- **Exa Online Search**: You can now configure Exa Online Search as a guardrail in Portkey to enable real-time, grounded web search before answering. This makes any LLM capable of handling current events or live queries without model retraining (see the sketch below).
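
A minimal sketch of attaching the guardrail to requests via a gateway config; the `input_guardrails` key follows Portkey's guardrail config convention, and the guardrail ID (created in the Portkey app) plus the virtual key slug are hypothetical:

```python
from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # hypothetical virtual key slug
    config={
        # run the Exa Online Search guardrail before the request hits the model
        "input_guardrails": ["exa-guardrail-id"],  # hypothetical guardrail ID
    },
)

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Who won yesterday's match?"}],
)
print(reply.choices[0].message.content)
```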

## Documentation

<Card icon="book" title="Administration Docs" href="https://portkey.ai/docs/product/administration/enforcing-request-metadata" horizontal />

We've made significant improvements to our documentation:

- **Virtual keys access**: Define who can view and manage virtual keys within workspaces. [Learn more](https://portkey.ai/docs/product/administration/configure-virtual-key-access-permissions)
- **API keys access**: Control how workspace managers and members interact with API keys within their workspaces. [Learn more](https://portkey.ai/docs/product/administration/configure-api-key-access-permissions)

## Community

Here's a tutorial on how to build a customer support agent using LangGraph and Portkey. Shoutout to [Nerding I/O](https://www.youtube.com/@nerding_io)!

<iframe width="560" height="315" src="https://www.youtube.com/embed/6MgPd3O3FXs?si=3FOPtlbTd9ayGN-1" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

**Customer love!**

| <img src="/images/changelog/testimonial.jpeg" width="100%"/> | <img src="/images/changelog/testimonial2.png" width="100%" /> |
| :-- | :-- |

**Partner blog**

See how Portkey and Pillar together can help you build secure GenAI apps for production [in this blog](https://portkey.ai/blog/securing-your-ai-via-ai-gateways/).

### Community Contributors

A special thanks to our community contributors this month:

- [unsync](https://github.com/unsync)
- [francescov1](https://github.com/francescov1)
- [Ajay Satish](https://github.com/Ajay-Satish-01)

## Coming this month!

We're changing how agents go to production, from first principles. [Watch out for this](https://x.com/PortkeyAI/status/1912491547653701891) 👀

## Support

<CardGroup cols={2}>
<Card title="Need Help?" icon="bug" href="https://github.com/Portkey-AI/gateway/issues">
Open an issue on GitHub
</Card>
<Card title="Join Us" icon="discord" href="https://portkey.wiki/community">
Get support in our Discord
</Card>
</CardGroup>