Commit 300b216

Add TOOLS_ONLY caching strategy to Anthropic integration
This commit introduces the TOOLS_ONLY caching strategy, enabling scenarios where large tool definitions should be cached while system prompts remain dynamic and uncached.

Key Changes:
- Add TOOLS_ONLY enum to AnthropicCacheStrategy with comprehensive javadocs
- Update CacheEligibilityResolver to support TOOLS_ONLY strategy
- Enhance all strategy javadocs with detailed use cases and token guidance
- Add comprehensive unit tests covering all caching scenarios
- Update documentation with TOOLS_ONLY examples and cascade invalidation

Use Cases:
- Multi-tenant SaaS applications with shared tools but per-tenant system prompts
- A/B testing scenarios with stable tools but variable system instructions
- Applications with large tool sets (5000+ tokens) and dynamic contexts

Technical Details:
- TOOLS_ONLY uses 1 cache breakpoint on the last tool definition
- System messages are NOT cached, processed fresh on each request
- Supports Anthropic's cache hierarchy (tools → system → messages)
- Compatible with cascade invalidation behavior

Documentation:
- Added strategy comparison table with breakpoint usage
- Added multi-tenant SaaS use case example
- Updated best practices with cascade invalidation explanation
- Clarified SYSTEM_AND_TOOLS independence behavior

Signed-off-by: Soby Chacko <[email protected]>
1 parent 56e2e0a commit 300b216
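
The cascade invalidation mentioned in the commit message can be illustrated with a small, self-contained model. This is only a sketch, not Anthropic's or Spring AI's implementation: it assumes the cache key at each breakpoint is a hash of everything that precedes it in the tools → system → messages order, and the names `CascadeInvalidationSketch`, `toolsKey`, `systemKey`, and `messagesKey` are hypothetical.

```java
import java.util.List;
import java.util.Objects;

// Simplified model of prefix-based cache keys. Illustration only;
// this is NOT how Anthropic or Spring AI compute cache entries.
final class CascadeInvalidationSketch {

    // Breakpoint on the last tool: the key depends on the tool definitions only.
    static int toolsKey(List<String> tools) {
        return Objects.hash(tools);
    }

    // Breakpoint on the system prompt: tools come first in the request,
    // so the key covers tools AND system.
    static int systemKey(List<String> tools, String system) {
        return Objects.hash(toolsKey(tools), system);
    }

    // Breakpoint in the messages: the key covers tools, system, and messages.
    static int messagesKey(List<String> tools, String system, List<String> messages) {
        return Objects.hash(systemKey(tools, system), messages);
    }

    public static void main(String[] args) {
        List<String> tools = List.of("get_weather", "search_orders");
        List<String> changedTools = List.of("get_weather", "search_orders_v2");

        // Two tenants with the same tools share the tools key.
        System.out.println("tools key shared across tenants: "
                + (toolsKey(tools) == toolsKey(List.of("get_weather", "search_orders"))));

        // Changing only the system prompt changes the system key, not the tools key.
        System.out.println("system key unchanged after system change: "
                + (systemKey(tools, "tenant A") == systemKey(tools, "tenant B")));

        // Changing a tool changes the tools key and every key downstream of it.
        System.out.println("system key unchanged after tool change: "
                + (systemKey(tools, "tenant A") == systemKey(changedTools, "tenant A")));
    }
}
```

Under this model, per-tenant system prompts never disturb the tools key, which is the situation TOOLS_ONLY targets, while any tool change propagates to the system and message keys.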

4 files changed: +422 -30 lines changed

models/spring-ai-anthropic/src/main/java/org/springframework/ai/anthropic/api/AnthropicCacheStrategy.java

Lines changed: 78 additions & 5 deletions
@@ -22,31 +22,104 @@
  * system → messages.
  *
  * @author Mark Pollack
+ * @author Soby Chacko
  * @since 1.1.0
  */
 public enum AnthropicCacheStrategy {
 
     /**
-     * No caching (default behavior).
+     * No caching (default behavior). All content is processed fresh on each request.
+     * <p>
+     * Use this when:
+     * <ul>
+     * <li>Requests are one-off or highly variable</li>
+     * <li>Content doesn't meet minimum token requirements (1024+ tokens)</li>
+     * <li>You want to avoid caching overhead</li>
+     * </ul>
      */
     NONE,
 
+    /**
+     * Cache tool definitions only. Places a cache breakpoint on the last tool, while
+     * system messages and conversation history remain uncached and are processed fresh on
+     * each request.
+     * <p>
+     * Use this when:
+     * <ul>
+     * <li>Tool definitions are large and stable (5000+ tokens)</li>
+     * <li>System prompts change frequently or are small (&lt;500 tokens)</li>
+     * <li>You want to share cached tools across different system contexts (e.g.,
+     * multi-tenant applications, A/B testing system prompts)</li>
+     * <li>Tool definitions rarely change</li>
+     * </ul>
+     * <p>
+     * <strong>Important:</strong> Changing any tool definition will invalidate this cache
+     * entry. Due to Anthropic's cascade invalidation, tool changes will also invalidate
+     * any downstream cache breakpoints (system, messages) if used in combination with
+     * other strategies.
+     */
+    TOOLS_ONLY,
+
     /**
      * Cache system instructions only. Places a cache breakpoint on the system message
-     * content.
+     * content. Tools are cached implicitly via Anthropic's automatic ~20-block lookback
+     * mechanism (content before the cache breakpoint is included in the cache).
+     * <p>
+     * Use this when:
+     * <ul>
+     * <li>System prompts are large and stable (1024+ tokens)</li>
+     * <li>Tool definitions are relatively small (&lt;20 tools)</li>
+     * <li>You want simple, single-breakpoint caching</li>
+     * </ul>
+     * <p>
+     * <strong>Note:</strong> Changing tools will invalidate the cache since tools are
+     * part of the cache prefix (they appear before system in the request hierarchy).
      */
     SYSTEM_ONLY,
 
     /**
      * Cache system instructions and tool definitions. Places cache breakpoints on the
-     * last tool and system message content.
+     * last tool (breakpoint 1) and system message content (breakpoint 2).
+     * <p>
+     * Use this when:
+     * <ul>
+     * <li>Both tools and system prompts are large and stable</li>
+     * <li>You have many tools (20+ tools, beyond the automatic lookback window)</li>
+     * <li>You want deterministic, explicit caching of both components</li>
+     * <li>System prompts may change independently of tools</li>
+     * </ul>
+     * <p>
+     * <strong>Behavior:</strong>
+     * <ul>
+     * <li>If only tools change: Both caches invalidated (tools + system)</li>
+     * <li>If only system changes: Tools cache remains valid, system cache
+     * invalidated</li>
+     * </ul>
+     * This allows efficient reuse of tool cache when only system prompts are updated.
      */
     SYSTEM_AND_TOOLS,
 
     /**
      * Cache the entire conversation history up to (but not including) the current user
-     * question. This is ideal for multi-turn conversations where you want to reuse the
-     * conversation context while asking new questions.
+     * question. Places a cache breakpoint on the last user message in the conversation
+     * history, enabling incremental caching as the conversation grows.
+     * <p>
+     * Use this when:
+     * <ul>
+     * <li>Building multi-turn conversational applications (chatbots, assistants)</li>
+     * <li>Conversation history is large and grows over time</li>
+     * <li>You want to reuse conversation context while asking new questions</li>
+     * <li>Using chat memory advisors or conversation persistence</li>
+     * </ul>
+     * <p>
+     * <strong>Behavior:</strong> Each turn builds on the previous cached prefix. The
+     * cache grows incrementally: Request 1 caches [Message1], Request 2 caches [Message1
+     * + Message2], etc. This provides significant cost savings (90%+) and performance
+     * improvements for long conversations.
+     * <p>
+     * <strong>Important:</strong> Changing tools or system prompts will invalidate the
+     * entire conversation cache due to cascade invalidation. Tool and system stability is
+     * critical for this strategy.
      */
     CONVERSATION_HISTORY

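The incremental behavior described in the CONVERSATION_HISTORY javadoc (Request 1 caches [Message1], Request 2 caches [Message1 + Message2]) can be sketched as a longest-cached-prefix lookup. This is a deliberately simplified model under stated assumptions: it ignores tools, system prompts, TTLs, and minimum token sizes, and the class `IncrementalHistoryCacheSketch` and its `send` method are hypothetical names that do not exist in Spring AI.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of incremental prefix caching for a growing conversation.
// Assumption: every history that is sent becomes a cached prefix.
final class IncrementalHistoryCacheSketch {

    private final Set<List<String>> cachedPrefixes = new HashSet<>();

    // Returns how many leading messages were found in the cache,
    // then records the full history as a new cached prefix.
    int send(List<String> history) {
        int hit = 0;
        // Search from the longest candidate prefix down to the shortest.
        for (int n = history.size(); n > 0; n--) {
            if (cachedPrefixes.contains(history.subList(0, n))) {
                hit = n;
                break;
            }
        }
        cachedPrefixes.add(List.copyOf(history));
        return hit;
    }

    public static void main(String[] args) {
        IncrementalHistoryCacheSketch cache = new IncrementalHistoryCacheSketch();
        System.out.println(cache.send(List.of("Message1")));
        System.out.println(cache.send(List.of("Message1", "Message2")));
        System.out.println(cache.send(List.of("Message1", "Message2", "Message3")));
    }
}
```

In this model each turn reuses everything sent on the previous turn, and a history whose first message differs finds no cached prefix at all, which mirrors why stable tools and system prompts matter for this strategy.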
models/spring-ai-anthropic/src/main/java/org/springframework/ai/anthropic/api/utils/CacheEligibilityResolver.java

Lines changed: 5 additions & 2 deletions
@@ -39,6 +39,7 @@
  * definition messages.
  *
  * @author Austin Dase
+ * @author Soby Chacko
  * @since 1.1.0
  **/
 public class CacheEligibilityResolver {
@@ -84,6 +85,7 @@ private static Set<MessageType> extractEligibleMessageTypes(AnthropicCacheStrate
         return switch (anthropicCacheStrategy) {
             case NONE -> Set.of();
             case SYSTEM_ONLY, SYSTEM_AND_TOOLS -> Set.of(MessageType.SYSTEM);
+            case TOOLS_ONLY -> Set.of(); // No message types cached, only tool definitions
             case CONVERSATION_HISTORY -> Set.of(MessageType.values());
         };
     }
@@ -108,10 +110,11 @@ public AnthropicApi.ChatCompletionRequest.CacheControl resolve(MessageType messa
     }
 
     public AnthropicApi.ChatCompletionRequest.CacheControl resolveToolCacheControl() {
-        // Tool definitions are only cache-eligible for SYSTEM_AND_TOOLS and
+        // Tool definitions are cache-eligible for TOOLS_ONLY, SYSTEM_AND_TOOLS, and
         // CONVERSATION_HISTORY strategies. SYSTEM_ONLY caches only system messages,
         // relying on Anthropic's cache hierarchy to implicitly cache tools.
-        if (this.cacheStrategy != AnthropicCacheStrategy.SYSTEM_AND_TOOLS
+        if (this.cacheStrategy != AnthropicCacheStrategy.TOOLS_ONLY
+                && this.cacheStrategy != AnthropicCacheStrategy.SYSTEM_AND_TOOLS
                 && this.cacheStrategy != AnthropicCacheStrategy.CONVERSATION_HISTORY) {
             logger.debug("Caching not enabled for tool definition, cacheStrategy={}", this.cacheStrategy);
             return null;

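The two decisions changed in this file (which message types are eligible, and whether tool definitions are eligible) can be summarized in a standalone sketch. The code below mirrors the `switch` and the `if` shown in the diff, but it uses local stand-in enums (`Strategy`, `MsgType`) and a hypothetical class `EligibilitySketch` so that it compiles without Spring AI on the classpath; it omits TTLs, minimum content lengths, and breakpoint accounting.

```java
import java.util.EnumSet;
import java.util.Set;

// Standalone mirror of the eligibility rules in the diff above.
// Strategy and MsgType are local stand-ins, not the Spring AI types.
final class EligibilitySketch {

    enum Strategy { NONE, TOOLS_ONLY, SYSTEM_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY }

    enum MsgType { USER, ASSISTANT, SYSTEM, TOOL }

    // Which message types may carry a cache breakpoint for a given strategy.
    static Set<MsgType> eligibleMessageTypes(Strategy strategy) {
        return switch (strategy) {
            case NONE, TOOLS_ONLY -> EnumSet.noneOf(MsgType.class);
            case SYSTEM_ONLY, SYSTEM_AND_TOOLS -> EnumSet.of(MsgType.SYSTEM);
            case CONVERSATION_HISTORY -> EnumSet.allOf(MsgType.class);
        };
    }

    // Whether the last tool definition gets an explicit cache breakpoint.
    static boolean toolsEligible(Strategy strategy) {
        return strategy == Strategy.TOOLS_ONLY
                || strategy == Strategy.SYSTEM_AND_TOOLS
                || strategy == Strategy.CONVERSATION_HISTORY;
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            System.out.println(s + ": tools=" + toolsEligible(s)
                    + ", messages=" + eligibleMessageTypes(s));
        }
    }
}
```

Read this way, TOOLS_ONLY is the only strategy that marks tools as eligible while leaving every message type ineligible.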
models/spring-ai-anthropic/src/test/java/org/springframework/ai/anthropic/api/utils/CacheEligibilityResolverTests.java

Lines changed: 190 additions & 0 deletions
@@ -30,6 +30,7 @@
  * Tests for {@link CacheEligibilityResolver}.
  *
  * @author Austin Dase
+ * @author Soby Chacko
  */
 class CacheEligibilityResolverTests {
 
@@ -85,6 +86,14 @@ void toolCacheControlRespectsStrategy() {
             .build());
         assertThat(sys.resolveToolCacheControl()).isNull();
 
+        // TOOLS_ONLY -> tool caching enabled, system messages NOT cached
+        CacheEligibilityResolver toolsOnly = CacheEligibilityResolver.from(AnthropicCacheOptions.builder()
+            .strategy(AnthropicCacheStrategy.TOOLS_ONLY)
+            .messageTypeTtl(MessageType.SYSTEM, AnthropicCacheTtl.ONE_HOUR)
+            .build());
+        assertThat(toolsOnly.resolveToolCacheControl()).isNotNull();
+        assertThat(toolsOnly.resolve(MessageType.SYSTEM, "Large system prompt text")).isNull();
+
         // SYSTEM_AND_TOOLS -> tool caching enabled (uses SYSTEM TTL)
         CacheEligibilityResolver sysAndTools = CacheEligibilityResolver.from(AnthropicCacheOptions.builder()
             .strategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS)
@@ -100,4 +109,185 @@ void toolCacheControlRespectsStrategy() {
         assertThat(history.resolveToolCacheControl()).isNotNull();
     }
 
+    @Test
+    void toolsOnlyStrategyBehavior() {
+        AnthropicCacheOptions options = AnthropicCacheOptions.builder()
+            .strategy(AnthropicCacheStrategy.TOOLS_ONLY)
+            .messageTypeMinContentLength(MessageType.SYSTEM, 100)
+            .build();
+        CacheEligibilityResolver resolver = CacheEligibilityResolver.from(options);
+
+        // Caching is enabled
+        assertThat(resolver.isCachingEnabled()).isTrue();
+
+        // System messages should NOT be cached
+        assertThat(resolver.resolve(MessageType.SYSTEM, "Large system prompt with plenty of content"))
+            .as("System messages should not be cached with TOOLS_ONLY strategy")
+            .isNull();
+
+        // User messages should NOT be cached
+        assertThat(resolver.resolve(MessageType.USER, "User message content")).isNull();
+
+        // Assistant messages should NOT be cached
+        assertThat(resolver.resolve(MessageType.ASSISTANT, "Assistant message content")).isNull();
+
+        // Tool messages should NOT be cached
+        assertThat(resolver.resolve(MessageType.TOOL, "Tool result content")).isNull();
+
+        // Tool definitions SHOULD be cached
+        AnthropicApi.ChatCompletionRequest.CacheControl toolCache = resolver.resolveToolCacheControl();
+        assertThat(toolCache).as("Tool definitions should be cached with TOOLS_ONLY strategy").isNotNull();
+        assertThat(toolCache.type()).isEqualTo("ephemeral");
+    }
+
+    @Test
+    void breakpointCountForEachStrategy() {
+        // NONE: 0 breakpoints
+        CacheEligibilityResolver none = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.NONE).build());
+        assertThat(none.resolveToolCacheControl()).isNull();
+        assertThat(none.resolve(MessageType.SYSTEM, "content")).isNull();
+
+        // SYSTEM_ONLY: 1 breakpoint (system only, tools implicit)
+        CacheEligibilityResolver systemOnly = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.SYSTEM_ONLY).build());
+        assertThat(systemOnly.resolveToolCacheControl()).as("SYSTEM_ONLY should not explicitly cache tools").isNull();
+        assertThat(systemOnly.resolve(MessageType.SYSTEM, "content")).isNotNull();
+
+        // TOOLS_ONLY: 1 breakpoint (tools only)
+        CacheEligibilityResolver toolsOnly = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.TOOLS_ONLY).build());
+        assertThat(toolsOnly.resolveToolCacheControl()).as("TOOLS_ONLY should cache tools").isNotNull();
+        assertThat(toolsOnly.resolve(MessageType.SYSTEM, "content")).as("TOOLS_ONLY should not cache system").isNull();
+
+        // SYSTEM_AND_TOOLS: 2 breakpoints (tools + system)
+        CacheEligibilityResolver systemAndTools = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS).build());
+        assertThat(systemAndTools.resolveToolCacheControl()).as("SYSTEM_AND_TOOLS should cache tools").isNotNull();
+        assertThat(systemAndTools.resolve(MessageType.SYSTEM, "content")).as("SYSTEM_AND_TOOLS should cache system")
+            .isNotNull();
+    }
+
+    @Test
+    void messageTypeEligibilityPerStrategy() {
+        // NONE: No message types eligible
+        CacheEligibilityResolver none = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.NONE).build());
+        assertThat(none.resolve(MessageType.SYSTEM, "content")).isNull();
+        assertThat(none.resolve(MessageType.USER, "content")).isNull();
+        assertThat(none.resolve(MessageType.ASSISTANT, "content")).isNull();
+        assertThat(none.resolve(MessageType.TOOL, "content")).isNull();
+
+        // SYSTEM_ONLY: Only SYSTEM eligible
+        CacheEligibilityResolver systemOnly = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.SYSTEM_ONLY).build());
+        assertThat(systemOnly.resolve(MessageType.SYSTEM, "content")).isNotNull();
+        assertThat(systemOnly.resolve(MessageType.USER, "content")).isNull();
+        assertThat(systemOnly.resolve(MessageType.ASSISTANT, "content")).isNull();
+        assertThat(systemOnly.resolve(MessageType.TOOL, "content")).isNull();
+
+        // TOOLS_ONLY: No message types eligible (only tool definitions)
+        CacheEligibilityResolver toolsOnly = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.TOOLS_ONLY).build());
+        assertThat(toolsOnly.resolve(MessageType.SYSTEM, "content")).isNull();
+        assertThat(toolsOnly.resolve(MessageType.USER, "content")).isNull();
+        assertThat(toolsOnly.resolve(MessageType.ASSISTANT, "content")).isNull();
+        assertThat(toolsOnly.resolve(MessageType.TOOL, "content")).isNull();
+
+        // SYSTEM_AND_TOOLS: Only SYSTEM eligible
+        CacheEligibilityResolver systemAndTools = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS).build());
+        assertThat(systemAndTools.resolve(MessageType.SYSTEM, "content")).isNotNull();
+        assertThat(systemAndTools.resolve(MessageType.USER, "content")).isNull();
+        assertThat(systemAndTools.resolve(MessageType.ASSISTANT, "content")).isNull();
+        assertThat(systemAndTools.resolve(MessageType.TOOL, "content")).isNull();
+
+        // CONVERSATION_HISTORY: All message types eligible
+        CacheEligibilityResolver history = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.CONVERSATION_HISTORY).build());
+        assertThat(history.resolve(MessageType.SYSTEM, "content")).isNotNull();
+        assertThat(history.resolve(MessageType.USER, "content")).isNotNull();
+        assertThat(history.resolve(MessageType.ASSISTANT, "content")).isNotNull();
+        assertThat(history.resolve(MessageType.TOOL, "content")).isNotNull();
+    }
+
+    @Test
+    void toolsOnlyIsolationFromSystemChanges() {
+        // Validates that TOOLS_ONLY resolver behavior is consistent
+        // regardless of system message content (simulating different system prompts)
+        CacheEligibilityResolver resolver = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.TOOLS_ONLY).build());
+
+        // Different system prompts should all be ineligible for caching
+        assertThat(resolver.resolve(MessageType.SYSTEM, "You are a helpful assistant"))
+            .as("System prompt 1 should not be cached")
+            .isNull();
+        assertThat(resolver.resolve(MessageType.SYSTEM, "You are a STRICT validator"))
+            .as("System prompt 2 should not be cached")
+            .isNull();
+        assertThat(resolver.resolve(MessageType.SYSTEM, "You are a creative writer"))
+            .as("System prompt 3 should not be cached")
+            .isNull();
+
+        // Tool cache eligibility should remain consistent
+        assertThat(resolver.resolveToolCacheControl()).as("Tools should always be cacheable").isNotNull();
+    }
+
+    @Test
+    void systemAndToolsIndependentBreakpoints() {
+        // Validates that SYSTEM_AND_TOOLS creates two independent eligibility checks
+        CacheEligibilityResolver resolver = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS).build());
+
+        // Both tools and system should be independently eligible
+        AnthropicApi.ChatCompletionRequest.CacheControl toolCache = resolver.resolveToolCacheControl();
+        AnthropicApi.ChatCompletionRequest.CacheControl systemCache = resolver.resolve(MessageType.SYSTEM, "content");
+
+        assertThat(toolCache).as("Tools should be cacheable").isNotNull();
+        assertThat(systemCache).as("System should be cacheable").isNotNull();
+
+        // They should use the same TTL (both use SYSTEM message type TTL)
+        assertThat(toolCache.ttl()).isEqualTo(systemCache.ttl());
+    }
+
+    @Test
+    void breakpointLimitEnforced() {
+        AnthropicCacheOptions options = AnthropicCacheOptions.builder()
+            .strategy(AnthropicCacheStrategy.CONVERSATION_HISTORY)
+            .build();
+        CacheEligibilityResolver resolver = CacheEligibilityResolver.from(options);
+
+        // Use up breakpoints by resolving multiple times
+        resolver.resolve(MessageType.SYSTEM, "content"); // Uses breakpoint 1
+        resolver.useCacheBlock();
+        resolver.resolve(MessageType.USER, "content"); // Uses breakpoint 2
+        resolver.useCacheBlock();
+        resolver.resolve(MessageType.ASSISTANT, "content"); // Uses breakpoint 3
+        resolver.useCacheBlock();
+        resolver.resolve(MessageType.TOOL, "content"); // Uses breakpoint 4
+        resolver.useCacheBlock();
+
+        // 5th attempt should return null (all 4 breakpoints used)
+        assertThat(resolver.resolve(MessageType.USER, "more content"))
+            .as("Should return null when all 4 breakpoints are used")
+            .isNull();
+    }
+
+    @Test
+    void emptyAndNullContentHandling() {
+        CacheEligibilityResolver resolver = CacheEligibilityResolver
+            .from(AnthropicCacheOptions.builder().strategy(AnthropicCacheStrategy.CONVERSATION_HISTORY).build());
+
+        // Empty string should not be cached
+        assertThat(resolver.resolve(MessageType.SYSTEM, "")).as("Empty string should not be cached").isNull();
+
+        // Null should not be cached
+        assertThat(resolver.resolve(MessageType.SYSTEM, null)).as("Null content should not be cached").isNull();
+
+        // Whitespace-only should be cached if it meets length requirement
+        assertThat(resolver.resolve(MessageType.SYSTEM, " "))
+            .as("Whitespace-only content meeting length requirements should be cacheable")
+            .isNotNull();
+    }
+
 }

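The `breakpointLimitEnforced` test relies on a cap of 4 cache breakpoints per request. A minimal sketch of that budget follows. It is an assumption-laden illustration rather than the resolver's code: the test shows a two-step pattern (`resolve` followed by `useCacheBlock`), whereas the hypothetical `BreakpointBudgetSketch.tryUse` below collapses check and consume into a single call for brevity.

```java
// Minimal counter modelling a fixed budget of cache breakpoints per request.
// BreakpointBudgetSketch, tryUse, and remaining are hypothetical names.
final class BreakpointBudgetSketch {

    // The test above exercises a limit of 4 breakpoints.
    static final int MAX_BREAKPOINTS = 4;

    private int used = 0;

    // Consumes one breakpoint if any remain; returns false once the budget is spent.
    boolean tryUse() {
        if (used >= MAX_BREAKPOINTS) {
            return false;
        }
        used++;
        return true;
    }

    // Number of breakpoints still available for this request.
    int remaining() {
        return MAX_BREAKPOINTS - used;
    }

    public static void main(String[] args) {
        BreakpointBudgetSketch budget = new BreakpointBudgetSketch();
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt + " granted: " + budget.tryUse());
        }
    }
}
```

Against this budget, TOOLS_ONLY spends one breakpoint and SYSTEM_AND_TOOLS spends two, leaving the rest unused.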