Anthropic automatic caching with ChatAnthropicVertex throws exception #1612

@henriquejsb

Description

Package (Required)

  • langchain-google-genai
  • langchain-google-vertexai
  • langchain-google-community
  • Other / not sure / general

Checked other resources

  • I added a descriptive title to this issue
  • I searched the LangChain documentation and API reference (linked above)
  • I used the GitHub search to find a similar issue and didn't find it
  • I am sure this is a bug and not a question or request for help

Example Code (Python)

import os

from dotenv import load_dotenv
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_google_vertexai.model_garden import ChatAnthropicVertex

load_dotenv()

project = os.getenv("GOOGLE_CLOUD_PROJECT")
location = os.getenv("GOOGLE_CLOUD_LOCATION")

# Name: langchain-google-vertexai
# Version: 3.2.2
# Name: anthropic
# Version: 0.83.0
# Name: langchain
# Version: 1.2.10
# Name: langchain-anthropic
# Version: 1.3.3
model = ChatAnthropicVertex(
    name="claude-sonnet-4-6",
    model_name="claude-sonnet-4-6",
    project=project,
    location=location,
    temperature=0.7,
    max_output_tokens=8192,
    max_retries=1,
    # additional_headers={"anthropic-beta": "extended-cache-ttl-2025-04-11"},
)

system_message = SystemMessage(content="You are a helpful assistant.")
human_message = HumanMessage(content="What is the capital of France?")
messages = [system_message, human_message]
print("Trying with ChatAnthropicVertex...")
try:
    response = model.invoke(messages, cache_control={"type": "ephemeral"})
    print(response.text)
except Exception as e:
    print(f"Error during Anthropic Vertex model invocation: {e}")

# With Anthropic
model = ChatAnthropic(
    name="claude-sonnet-4-6",
    model_name="claude-sonnet-4-6",
    temperature=0.7,
    max_tokens=8192,
)
print("Trying with ChatAnthropic...")
try:
    response = model.invoke(messages, cache_control={"type": "ephemeral"})
    print(response.text)
except Exception as e:
    print(f"Error during Anthropic model invocation: {e}")


"""
OUTPUT:
Trying with ChatAnthropicVertex...
Error during Anthropic Vertex model invocation: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'cache_control: Extra inputs are not permitted'}, 'request_id': 'req_vrtx_011CYSk27BXt381h7s6vN6HV'}
Trying with ChatAnthropic...
The capital of France is **Paris**.
"""

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/someuser/example-proj/test_bug_vertex.py", line 40, in <module>
    raise e
  File "/Users/someuser/example-proj/test_bug_vertex.py", line 37, in <module>
    response = model.invoke(messages, cache_control={"type": "ephemeral"})
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 402, in invoke
    self.generate_prompt(
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 1123, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 933, in generate
    self._generate_with_cache(
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 1235, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_google_vertexai/model_garden.py", line 343, in _generate
    data = _completion_with_retry_inner(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 331, in wrapped_f
    return copy(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_google_vertexai/model_garden.py", line 341, in _completion_with_retry_inner
    return self.client.messages.create(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/_utils/_utils.py", line 282, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/resources/messages/messages.py", line 996, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/_base_client.py", line 1364, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/_base_client.py", line 1137, in request
    raise self._make_status_error_from_response(err.response) from None
anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'cache_control: Extra inputs are not permitted'}, 'request_id': 'req_vrtx_011CYSk5seYzVnbvX4vhA5Ky'}

Description

I'm trying to use Anthropic's automatic prompt caching.
According to the LangChain documentation, I can enable it simply by passing a "cache_control" extra kwarg when invoking the model.
This works when using ChatAnthropic from langchain-anthropic, but raises a 400 error ("cache_control: Extra inputs are not permitted") when using ChatAnthropicVertex, as the reproduction above shows.
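For context, a minimal sketch of the payload shape Anthropic's Messages API expects for prompt caching: `cache_control` belongs inside an individual content block (e.g. on the system prompt), not at the top level of the request. The model name and values below are just placeholders taken from the reproduction; passing content blocks shaped like this through ChatAnthropicVertex instead of the `cache_control` invoke kwarg may work as a workaround, but I have not verified that.

```python
# cache_control is a per-content-block field in the Anthropic Messages API,
# so it must end up nested inside a block, not as a top-level request key.
system_blocks = [
    {
        "type": "text",
        "text": "You are a helpful assistant.",
        # Marks this block as a cache breakpoint.
        "cache_control": {"type": "ephemeral"},
    }
]

request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 8192,
    "system": system_blocks,
    "messages": [
        {"role": "user", "content": "What is the capital of France?"}
    ],
}

# Note there is no top-level "cache_control" key -- sending one is exactly
# what triggers the 400 "Extra inputs are not permitted" error above.
```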


    Labels

    bug (Something isn't working), vertexai (Generative AI on Google Cloud's Vertex AI Platform)
