You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For more information about creating and managing application inference profiles, see the [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-create.html).
223
224
225
+
### Prompt Caching
226
+
227
+
This proxy now supports **Prompt Caching** for Claude and Nova models, which can reduce costs by up to 90% and latency by up to 85% for workloads with repeated prompts.
228
+
229
+
**Supported Models:**
230
+
- Claude 3+ models (Claude 3.5 Haiku, Claude 3.7 Sonnet, Claude 4, Claude 4.5, etc.)
231
+
- Nova models (Nova Micro, Nova Lite, Nova Pro, Nova Premier)
232
+
233
+
**Enabling Prompt Caching:**
234
+
235
+
You can enable prompt caching in two ways:
236
+
237
+
1.**Globally via Environment Variable** (set in ECS Task Definition or Lambda):
0 commit comments