Hey there,
I feel like I might be missing a critical piece of the puzzle when working with long contexts, multiple agents, and structured outputs, so I wanted to bring this up.
Here's the situation: assume I have a very large document (>30k tokens) that needs to be provided as context to multiple agents. Each agent performs a specialized task and uses Instructor to produce a structured output. Due to the nature of the tasks, all agents need access to the entire document, so typical RAG approaches aren’t an option.
Prompt caching feels like the perfect fit here—providing me with faster and cheaper agent interactions. But there's a problem: each agent uses a different tool, and as implemented by the major providers (OpenAI, Anthropic, Gemini), prompt caching wouldn’t work in this setup.
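For reference, Anthropic's caching requires an explicit cache breakpoint on a content block. Below is a sketch of the request body only (no API call is made; the model name is a placeholder), marking the shared document as cacheable so every agent's request could, in principle, reuse it:

```python
# Request-body sketch for Anthropic-style prompt caching (no API call).
# The "cache_control" breakpoint marks everything up to and including
# the document block as cacheable; per-agent content comes after it.
LARGE_DOC = "<the shared >30k-token document>"

def build_request(agent_task: str) -> dict:
    return {
        "model": "<model-name>",  # placeholder
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": LARGE_DOC,
                        "cache_control": {"type": "ephemeral"},
                    },
                    {"type": "text", "text": agent_task},
                ],
            }
        ],
    }

req = build_request("Summarize the document.")
```

The catch, as described next, is that tool definitions are serialized before the messages, so this breakpoint only helps if the tools (and system prompt) are identical across requests.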
The issue stems from how Instructor (and, in fact, most libraries and the providers themselves) enforces structured outputs: either via TOOLS mode (tool calling) or JSON mode (placing the schema in the system prompt). Prompt caching, as implemented by these providers, is fundamentally prefix-based, and since the tool definitions and system prompt sit at the very beginning of that prefix, any variation in either of them completely breaks caching.
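To make the prefix-breaking concrete, here's a small pure-Python sketch (no real API calls; the serialization order is a simplification of how providers lay out requests): two requests sharing the same large document still share almost no cacheable prefix once their tool definitions differ, because tools are serialized before the messages.

```python
import json

def serialize_request(tools, system, messages) -> str:
    # Simplified model: providers serialize tools first, then the
    # system prompt, then the messages; a cache hit requires an
    # exact match on a prefix of this serialized form.
    return json.dumps({"tools": tools, "system": system, "messages": messages})

def shared_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

big_doc = "<the full >30k-token document>"
common_messages = [{"role": "user", "content": big_doc}]

# Two agents, identical document, different tools (names are made up):
req_a = serialize_request([{"name": "extract_entities"}], "You are agent A.", common_messages)
req_b = serialize_request([{"name": "summarize"}], "You are agent B.", common_messages)

# The shared prefix ends inside the tool definitions -- long before
# the document -- so none of the document tokens are cache hits.
print(shared_prefix_len(req_a, req_b))
```

Even though the document is byte-identical in both requests, it sits *after* the point of divergence, so the cache never reaches it.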
This feels to me like a very common use case, yet I haven't found much discussion about it. The only straightforward solution I see is giving up on the stronger structural guarantees of tool calling and instead instructing the model via user messages to produce structured output (something like plain JSON or MD_JSON). However, correct me if I'm wrong, Instructor doesn't seem to support this out of the box for these providers: with Anthropic, for instance, ANTHROPIC_TOOLS uses tools, whereas ANTHROPIC_JSON places the schema in the system prompt.
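The workaround I have in mind would look roughly like the sketch below (assuming all agents also share an identical, or empty, system prompt and send no tool definitions): keep the shared document at the front of the user message and append the per-agent task and schema after it, where variation no longer invalidates the cached prefix. The schema is hand-written here for illustration; in practice it could come from a Pydantic model's `model_json_schema()`.

```python
import json

# Hand-written JSON Schema for illustration only.
ENTITIES_SCHEMA = {
    "type": "object",
    "properties": {
        "people": {"type": "array", "items": {"type": "string"}},
        "places": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["people", "places"],
}

def build_user_message(schema: dict, task: str, document: str) -> str:
    # Shared document first, so a prefix cache can cover it; the
    # per-agent task and schema go after it, where they can vary
    # between agents without breaking the cached prefix.
    return (
        f"{document}\n\n{task}\n"
        f"Respond only with JSON matching this schema:\n"
        f"{json.dumps(schema)}"
    )

msg = build_user_message(ENTITIES_SCHEMA, "Extract all entities.", "<full document>")

# Parsing a (simulated) model reply; a real implementation would
# also validate it against the schema and retry on failure.
reply = '{"people": ["Ada Lovelace"], "places": ["London"]}'
parsed = json.loads(reply)
```

This trades the provider-enforced guarantees of tool calling for plain prompt-level instruction, which is exactly the trade-off described above.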
Am I missing something here?