Commit aafb412

feat(examples): add prompt caching test script
Standalone TypeScript script (zero deps, native fetch) to verify Anthropic prompt caching is active. Documents 4 production gotchas not in official docs: beta header required for Claude 4.x, effective threshold ~2048+ tokens, cached tokens excluded from input_tokens, new nested cache_creation response format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent b6ce1ef commit aafb412

3 files changed: +145 -0 lines changed

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -8,6 +8,18 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

### Added

- **`examples/scripts/test-prompt-caching.ts`** — Standalone TypeScript script (zero deps, native fetch) to verify Anthropic prompt caching is active on any API key. Runs 3 identical calls and checks write/read metrics. Documents 4 production gotchas not in official docs: (1) `anthropic-beta: prompt-caching-2024-07-31` header is required even for Claude 4.x, (2) effective token threshold for Claude 4.x is ~2048+ not the documented 1024, (3) cached tokens are excluded from `input_tokens`, (4) new nested `cache_creation` object format with `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens`. Usage: `ANTHROPIC_API_KEY=sk-ant-... npx tsx test-prompt-caching.ts`.
- **`CLAUDE.md` Behavioral Rules section** — New `## Behavioral Rules` section with 5 rules derived from observed session friction patterns (via `/insights` analysis): (1) always update `CHANGELOG.md` after any modification, (2) be exhaustive on first pass for audits and reviews, (3) use absolute paths in reports and documentation, (4) closing checklist confirming files changed + changelog + commit hash, (5) bias toward action — no extended planning loops without deliverables.
### Added
- **Desloppify tool** — New subsection "Fighting Vibe Code Degradation" in §9.8 (Vibe Coding). Documents `desloppify` ([peteromallet/desloppify](https://github.com/peteromallet/desloppify)), a community tool that installs a fix-loop workflow directly into Claude Code as a skill (`desloppify update-skill claude`) and runs a scan → next → fix → resolve loop to systematically improve code quality. Includes install snippet, the loop commands, and an early-stage status note with token cost caveat. Tagged early-stage (released February 2026, ~2K stars, no production-scale feedback yet).
- **`guide/workflows/github-actions.md`** — New workflow guide (5 production-ready patterns for GitHub Actions CI/CD with `anthropics/claude-code-action`, 6.2k stars, v1.0). Covers: (1) interactive PR review via `@claude` mention, (2) automatic review on push, (3) issue triage and labeling, (4) security-focused review triggered on sensitive paths (`auth/**`, `payments/**`), (5) scheduled weekly repo health check. Includes cost control table (Haiku vs Sonnet per pattern), concurrency setup to prevent parallel runs, fork safety guard for public repos, and Bedrock/Vertex authentication alternatives. Cross-linked from section 9.3 of the main guide and added to `guide/workflows/README.md`.
- **`guide/workflows/README.md`**: Added GitHub Actions Workflows entry to Development Workflows section with description, key topics, and "when to use" guidance.
- **`guide/workflows/rpi.md`** — New workflow guide (RPI: Research → Plan → Implement). 3-phase feature development pattern with explicit validation gates: Research produces `RESEARCH.md`, Plan produces `PLAN.md`, Implement produces working code. Each gate requires explicit GO before the next phase. Includes slash command templates (`/rpi:research`, `/rpi:plan`, `/rpi:implement`), a worked example (adding rate limiting to an Express API), and comparison matrix vs Plan-Driven, TDD, and Spec-First. Best for features where discovering a wrong assumption late is expensive.

- **`guide/workflows/changelog-fragments.md`** — New workflow guide for the Changelog Fragments pattern: one YAML fragment per PR, written at implementation time, validated by CI, assembled automatically at release. Covers 3-layer enforcement: (1) CLAUDE.md workflow rule for autonomous fragment creation, (2) `UserPromptSubmit` hook with 3-tier priority (enforcement → discovery → contextual), (3) independent CI migration check job. Includes the `UserPromptSubmit` tier pattern as a reusable hook architecture for any mandatory workflow step.

examples/scripts/README.md

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ Utility scripts for Claude Code power users.
| `rtk-benchmark.sh` | Benchmark RTK token savings vs raw commands |
| `sync-claude-config.sh` | Sync Claude config files across machines |
| `sonnetplan.sh` | Run Claude with Sonnet replacing Opus (cost optimization alias) |
| `test-prompt-caching.ts` | Verify Anthropic prompt caching is active (no deps, fetch only) |

---

examples/scripts/test-prompt-caching.ts

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
/**
 * Prompt Caching Test (Anthropic API)
 *
 * Verifies that prompt caching is active on your Anthropic API key.
 * Runs 3 calls that share one system prompt and checks that calls 2-3 read from cache.
 *
 * Usage:
 *   ANTHROPIC_API_KEY=sk-ant-... npx tsx test-prompt-caching.ts
 *
 * Requirements:
 *   - Node 18+ (native fetch)
 *   - No dependencies required
 *
 * Gotchas discovered in production (not in official docs):
 *   1. The `anthropic-beta: prompt-caching-2024-07-31` header is required
 *      even for Claude 4.x models; omitting it silently disables caching.
 *   2. The effective token threshold for Claude 4.x is ~2048+, not the
 *      documented 1024. Blocks below this threshold return write=0 with no warning.
 *   3. Cached tokens are excluded from `input_tokens` in the response.
 *      Track `cache_creation_input_tokens` and `cache_read_input_tokens` instead.
 *   4. The new API format exposes a nested `cache_creation` object with
 *      `ephemeral_5m_input_tokens` (5-min TTL) and `ephemeral_1h_input_tokens` (1-hour TTL).
 */

const API_KEY = process.env.ANTHROPIC_API_KEY;
if (!API_KEY) {
  console.error("Error: ANTHROPIC_API_KEY environment variable is not set.");
  process.exit(1);
}

// Stable system prompt: must exceed the effective cache threshold (~2048 tokens for Claude 4.x).
// In real usage: consolidate agent rules, platform context, and static instructions into one block.
const STABLE_SYSTEM_PROMPT = `
You are a helpful assistant.

Core rules:
- Be concise and accurate
- Cite sources when referencing external content
- Match the language of the user

Extended context (padding to exceed cache threshold):
${Array(200).fill("This platform helps users collaborate on tasks, track progress, and organize their workflow efficiently.").join(" ")}
`.trim();

// Subset of the `usage` object returned by the Messages API.
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation?: {
    ephemeral_5m_input_tokens?: number;
    ephemeral_1h_input_tokens?: number;
  };
};

// Sends one request with the cached system prompt and returns its usage metrics.
async function callAPI(callIndex: number): Promise<Usage> {
  const response = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": API_KEY!,
      "anthropic-version": "2023-06-01",
      // Required for all models, including Claude 4.x
      "anthropic-beta": "prompt-caching-2024-07-31",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-6",
      max_tokens: 50,
      system: [
        {
          type: "text",
          text: STABLE_SYSTEM_PROMPT,
          // Cache breakpoint: the cached prefix covers everything up to and including this block
          cache_control: { type: "ephemeral" },
        },
      ],
      // The user message varies per call; it sits after the breakpoint, so the cached prefix is unaffected
      messages: [
        { role: "user", content: `Test call ${callIndex} - reply OK` },
      ],
    }),
  });

  if (!response.ok) {
    const body = await response.text();
    throw new Error(`API ${response.status}: ${body}`);
  }

  const data = (await response.json()) as { usage: Usage };
  return data.usage;
}

async function main() {
  console.log("Prompt Caching Test\n");

  let passed = true;

  for (let i = 1; i <= 3; i++) {
    const usage = await callAPI(i);

    const write = usage.cache_creation_input_tokens ?? 0;
    const read = usage.cache_read_input_tokens ?? 0;

    let status: string;
    if (i === 1) {
      // A re-run within the cache TTL finds the cache already warm,
      // so a read on call 1 counts as success too.
      if (write > 0) {
        status = "cache written";
      } else if (read > 0) {
        status = "✅ cache hit (still warm from an earlier run)";
      } else {
        status = "❌ write failed (check token threshold)";
        passed = false;
      }
    } else {
      status = read > 0 ? "✅ cache hit" : "❌ cache miss";
      if (read === 0) passed = false;
    }

    console.log(
      `Call ${i}: write=${String(write).padStart(5)}  read=${String(read).padStart(5)}  input=${String(usage.input_tokens).padStart(4)}  ${status}`
    );
  }

  console.log();
  if (passed) {
    console.log("Result: caching is working correctly.");
  } else {
    console.log("Result: caching is NOT working. Common causes:");
    console.log("  - Token threshold not met (try increasing STABLE_SYSTEM_PROMPT)");
    console.log("  - Missing beta header (anthropic-beta: prompt-caching-2024-07-31)");
    console.log("  - Account tier does not support caching");
    process.exit(1);
  }
}

main().catch((err) => {
  console.error(err instanceof Error ? err.message : err);
  process.exit(1);
});
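
Gotchas 3 and 4 in the script header imply that the real prompt size has to be reassembled from several `usage` fields. The sketch below is not part of this commit: `CacheUsage`, `UsageSummary`, and `summarizeUsage` are hypothetical names, the sample numbers are invented, and falling back to the nested `cache_creation` sum when the flat field is absent is an assumption about how the two formats relate.

```typescript
// Hypothetical helper (not in the commit): reassembles prompt size from a usage object.
// Mirrors the fields of the script's Usage type.
type CacheUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation?: {
    ephemeral_5m_input_tokens?: number;
    ephemeral_1h_input_tokens?: number;
  };
};

type UsageSummary = {
  totalPromptTokens: number; // uncached + cache writes + cache reads
  cacheWrite: number;
  cacheRead: number;
  hitRatio: number; // share of the prompt served from cache, 0..1
};

function summarizeUsage(usage: CacheUsage): UsageSummary {
  const nested = usage.cache_creation;
  const nestedWrite =
    (nested?.ephemeral_5m_input_tokens ?? 0) +
    (nested?.ephemeral_1h_input_tokens ?? 0);

  // Assumption: prefer the flat field, fall back to the nested per-TTL sum.
  const cacheWrite = usage.cache_creation_input_tokens ?? nestedWrite;
  const cacheRead = usage.cache_read_input_tokens ?? 0;

  // Gotcha 3: input_tokens excludes cached tokens, so add them back.
  const totalPromptTokens = usage.input_tokens + cacheWrite + cacheRead;
  const hitRatio = totalPromptTokens > 0 ? cacheRead / totalPromptTokens : 0;

  return { totalPromptTokens, cacheWrite, cacheRead, hitRatio };
}

// Invented sample values, for illustration only.
const sample = summarizeUsage({
  input_tokens: 12,
  output_tokens: 5,
  cache_read_input_tokens: 3000,
});
console.log(sample.totalPromptTokens, sample.hitRatio.toFixed(3));
```

Logging `hitRatio` per call alongside the existing write/read columns would make a partially cached prompt visible, which the pass/fail check alone does not show.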
