Anthropic automatic caching with ChatAnthropicVertex throws exception #1612

@henriquejsb

Description

Package (Required)

  • langchain-google-genai
  • langchain-google-vertexai
  • langchain-google-community
  • Other / not sure / general

Checked other resources

  • I added a descriptive title to this issue
  • I searched the LangChain documentation and API reference (linked above)
  • I used the GitHub search to find a similar issue and didn't find it
  • I am sure this is a bug and not a question or request for help

Example Code (Python)

import os

from dotenv import load_dotenv
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_google_vertexai.model_garden import ChatAnthropicVertex

load_dotenv()

project = os.getenv("GOOGLE_CLOUD_PROJECT")
location = os.getenv("GOOGLE_CLOUD_LOCATION")

# Name: langchain-google-vertexai
# Version: 3.2.2
# Name: anthropic
# Version: 0.83.0
# Name: langchain
# Version: 1.2.10
# Name: langchain-anthropic
# Version: 1.3.3
model = ChatAnthropicVertex(
    name="claude-sonnet-4-6",
    model_name="claude-sonnet-4-6",
    project=project,
    location=location,
    temperature=0.7,
    max_output_tokens=8192,
    max_retries=1,
    # additional_headers={"anthropic-beta": "extended-cache-ttl-2025-04-11"},
)

system_message = SystemMessage(content="You are a helpful assistant.")
human_message = HumanMessage(content="What is the capital of France?")
messages = [system_message, human_message]
print("Trying with ChatAnthropicVertex...")
try:
    response = model.invoke(messages, cache_control={"type": "ephemeral"})
    print(response.text)
except Exception as e:
    print(f"Error during Anthropic Vertex model invocation: {e}")

# With Anthropic
model = ChatAnthropic(
    name="claude-sonnet-4-6",
    model_name="claude-sonnet-4-6",
    temperature=0.7,
    max_tokens=8192,
)
print("Trying with ChatAnthropic...")
try:
    response = model.invoke(messages, cache_control={"type": "ephemeral"})
    print(response.text)
except Exception as e:
    print(f"Error during Anthropic model invocation: {e}")


"""
OUTPUT:
Trying with ChatAnthropicVertex...
Error during Anthropic Vertex model invocation: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'cache_control: Extra inputs are not permitted'}, 'request_id': 'req_vrtx_011CYSk27BXt381h7s6vN6HV'}
Trying with ChatAnthropic...
The capital of France is **Paris**.
"""

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/someuser/example-proj/test_bug_vertex.py", line 40, in <module>
    raise e
  File "/Users/someuser/example-proj/test_bug_vertex.py", line 37, in <module>
    response = model.invoke(messages, cache_control={"type": "ephemeral"})
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 402, in invoke
    self.generate_prompt(
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 1123, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 933, in generate
    self._generate_with_cache(
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 1235, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_google_vertexai/model_garden.py", line 343, in _generate
    data = _completion_with_retry_inner(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 331, in wrapped_f
    return copy(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 470, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 371, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 413, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 184, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/tenacity/__init__.py", line 473, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/langchain_google_vertexai/model_garden.py", line 341, in _completion_with_retry_inner
    return self.client.messages.create(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/_utils/_utils.py", line 282, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/resources/messages/messages.py", line 996, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/_base_client.py", line 1364, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/someuser/example-proj/.venv/lib/python3.11/site-packages/anthropic/_base_client.py", line 1137, in request
    raise self._make_status_error_from_response(err.response) from None
anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'cache_control: Extra inputs are not permitted'}, 'request_id': 'req_vrtx_011CYSk5seYzVnbvX4vhA5Ky'}

Description

I'm trying to use Anthropic's automatic prompt caching.
According to the LangChain documentation, I can enable it simply by passing a "cache_control" extra kwarg when invoking the model.
This works when using ChatAnthropic from langchain-anthropic, but raises a 400 error ("cache_control: Extra inputs are not permitted") when using ChatAnthropicVertex, as the reproduction above shows.
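For context, a minimal sketch of the payload shape Anthropic's Messages API expects for prompt caching: `cache_control` belongs inside an individual content block (e.g. on the system prompt), not at the top level of the request. The model name and values below are just placeholders taken from the reproduction; passing content blocks shaped like this through ChatAnthropicVertex instead of the `cache_control` invoke kwarg may work as a workaround, but I have not verified that.

```python
# cache_control is a per-content-block field in the Anthropic Messages API,
# so it must end up nested inside a block, not as a top-level request key.
system_blocks = [
    {
        "type": "text",
        "text": "You are a helpful assistant.",
        # Marks this block as a cache breakpoint.
        "cache_control": {"type": "ephemeral"},
    }
]

request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 8192,
    "system": system_blocks,
    "messages": [
        {"role": "user", "content": "What is the capital of France?"}
    ],
}

# Note there is no top-level "cache_control" key -- sending one is exactly
# what triggers the 400 "Extra inputs are not permitted" error above.
```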


    Labels

    bug (Something isn't working), vertexai (Generative AI on Google Cloud's Vertex AI Platform)
