cache_control and providerOptions stripped from messages — prompt caching broken for Anthropic #23

@03-CiprianoG

Description

The HeliconeLanguageModel's message conversion function strips cache_control and providerOptions from all messages when building the request body for the Helicone gateway. This means Anthropic prompt caching never works through the AI SDK provider, even though the Helicone gateway itself supports it (as documented at https://docs.helicone.ai/gateway/concepts/prompt-caching).

Reproduction

import { createHelicone } from '@helicone/ai-sdk-provider';
import { generateText } from 'ai';

const helicone = createHelicone({ apiKey: process.env.HELICONE_API_KEY });

const result = await generateText({
  model: helicone('claude-4.6-sonnet/anthropic'),
  messages: [
    {
      role: 'system',
      content: 'You are a helpful assistant.',
      providerOptions: {
        anthropic: { cacheControl: { type: 'ephemeral' } },
      },
    },
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'Very long context that should be cached...',
          providerOptions: {
            anthropic: { cacheControl: { type: 'ephemeral' } },
          },
        },
        { type: 'text', text: 'What is this about?' },
      ],
    },
  ],
});

Expected: The gateway receives cache_control directives on messages/content blocks and forwards them to Anthropic, enabling prompt caching.

Actual: The message converter drops all providerOptions and cache_control. The gateway receives plain messages with no caching directives. Every request pays full input token cost — no cache reads ever occur.
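For reference, the body the gateway should receive would look roughly like this. The field placement follows the gateway's documented OpenAI-compatible `cache_control` extension; the exact serialized shape shown here is an illustrative assumption, not the provider's actual output:

```json
{
  "model": "claude-4.6-sonnet/anthropic",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant.",
      "cache_control": { "type": "ephemeral" }
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Very long context that should be cached...",
          "cache_control": { "type": "ephemeral" }
        },
        { "type": "text", "text": "What is this about?" }
      ]
    }
  ]
}
```

Today, both `cache_control` fields are absent from the outgoing body.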

Root Cause

The message conversion function (minified as Bt in the built output) only copies a subset of fields:

System messages — only copies { role, content }, drops message-level providerOptions:

case "system":
  e.push({ role: "system", content: t.content });
  break;

User text content blocks — only copies { type, text }, drops block-level providerOptions and cache_control:

case "text":
  return { type: "text", text: o.text };

Fix

The converter should preserve the caching directives carried in providerOptions on both messages and content blocks. Since the Helicone gateway translates to the OpenAI-compatible format with cache_control extensions (via @helicone/helpers), the fix is to map providerOptions.anthropic.cacheControl to the cache_control field the gateway already supports:

For system messages:

case "system": {
  const msg = { role: "system", content: t.content };
  if (t.providerOptions?.anthropic?.cacheControl) {
    msg.cache_control = t.providerOptions.anthropic.cacheControl;
  }
  e.push(msg);
  break;
}

For user text content blocks:

case "text": {
  const block = { type: "text", text: o.text };
  if (o.providerOptions?.anthropic?.cacheControl) {
    block.cache_control = o.providerOptions.anthropic.cacheControl;
  }
  return block;
}

This would align the AI SDK provider with the gateway's existing cache_control support documented at https://docs.helicone.ai/gateway/concepts/prompt-caching.
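Put together, a standalone sketch of the corrected conversion might look like the following. The names `convertMessages` and `convertPart` are hypothetical (the real converter is minified and handles more roles and part types); this only demonstrates the two preservation fixes above:

```javascript
// Hypothetical sketch of the patched converter, NOT the actual
// @helicone/ai-sdk-provider source. It copies
// providerOptions.anthropic.cacheControl into the gateway's
// cache_control field instead of dropping it.
function convertPart(part) {
  switch (part.type) {
    case "text": {
      const block = { type: "text", text: part.text };
      const cc = part.providerOptions?.anthropic?.cacheControl;
      if (cc) block.cache_control = cc; // preserve block-level directive
      return block;
    }
    default:
      return part; // other part types left unchanged in this sketch
  }
}

function convertMessages(messages) {
  const out = [];
  for (const msg of messages) {
    switch (msg.role) {
      case "system": {
        const m = { role: "system", content: msg.content };
        const cc = msg.providerOptions?.anthropic?.cacheControl;
        if (cc) m.cache_control = cc; // preserve message-level directive
        out.push(m);
        break;
      }
      case "user":
        out.push({
          role: "user",
          content: Array.isArray(msg.content)
            ? msg.content.map(convertPart)
            : msg.content,
        });
        break;
      default:
        out.push(msg); // assistant/tool handling omitted in this sketch
    }
  }
  return out;
}
```

Running the reproduction messages through this sketch yields a body where the system message and the first text block carry `cache_control: { type: "ephemeral" }`, while the final short text block stays uncached.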

Impact

Without this fix, all Anthropic requests through the AI SDK provider pay full input token cost on every request. For applications making sequential requests with shared context (agents, pipelines, multi-turn conversations), this can result in 5-10x higher costs than necessary.
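To make the cost claim concrete, here is a back-of-envelope sketch assuming Anthropic's published ephemeral-cache multipliers (cache writes at ~1.25x the base input-token rate, cache reads at ~0.1x; verify against current pricing). The `sessionCost` helper is hypothetical and measures relative input-token cost only:

```javascript
// Relative input-token cost for a session of `requests` calls that share
// `cachedTokens` of context plus `freshTokens` of new input each call.
// Assumes Anthropic 5-minute ephemeral caching: write ~1.25x, read ~0.1x.
function sessionCost({ cachedTokens, freshTokens, requests, caching }) {
  if (!caching) return requests * (cachedTokens + freshTokens); // full price every call
  const firstRequest = 1.25 * cachedTokens + freshTokens; // cache write
  const laterRequests = (requests - 1) * (0.1 * cachedTokens + freshTokens); // cache reads
  return firstRequest + laterRequests;
}

// 10 requests sharing a 50k-token context, 1k fresh tokens each:
const withoutCache = sessionCost({ cachedTokens: 50000, freshTokens: 1000, requests: 10, caching: false }); // 510000
const withCache = sessionCost({ cachedTokens: 50000, freshTokens: 1000, requests: 10, caching: true }); // 117500, ~4.3x cheaper
```

With larger shared contexts and longer sessions the ratio climbs toward the 5-10x range cited above.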

Environment

  • @helicone/ai-sdk-provider: 1.0.12
  • ai (Vercel AI SDK): 6.0.6
  • Model: claude-4.6-sonnet/anthropic through ai-gateway.helicone.ai
