Skip to content

fix(google): strip media from functionResponse, preserve as sibling parts#10417

Open
pawel-twardziak wants to merge 10 commits intolangchain-ai:mainfrom
pawel-twardziak:fix/langchain-google-missing-ci-tests
Open

fix(google): strip media from functionResponse, preserve as sibling parts#10417
pawel-twardziak wants to merge 10 commits intolangchain-ai:mainfrom
pawel-twardziak:fix/langchain-google-missing-ci-tests

Conversation

@pawel-twardziak
Copy link
Contributor

@pawel-twardziak pawel-twardziak commented Mar 14, 2026

When a ToolMessage contains multimodal content (text + images via image_url blocks), the converter serializes inlineData/fileData inside functionResponse.response.result as JSON text. Gemini cannot interpret images nested inside a JSON string - they become invisible to the model.

This PR fixes both the legacy and v1 converter paths to:

  1. Exclude media items (image_url, media, image, audio, video, file) from functionResponse.response.result, keeping it text-only
  2. Preserve media as sibling parts (inlineData/fileData) alongside functionResponse, where Gemini can interpret them natively
  3. Filter out redundant text parts from tool responses - their content is already in functionResponse.response.result

Also adds @langchain/google to the CI unit test matrix - it was missing from the PACKAGES list in unit-tests-integrations.yml, so its tests were never run in CI.

#10415 - this PR must be merged first - it has some tests fixed that previously couldn't even get run in CI/CD - UPDATE: this is not valid any longer

Fixes #10297

@changeset-bot
Copy link

changeset-bot bot commented Mar 14, 2026

🦋 Changeset detected

Latest commit: 29ebf67

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 6 packages
Name Type
@langchain/google-common Patch
@langchain/google-gauth Patch
@langchain/google-vertexai-web Patch
@langchain/google-vertexai Patch
@langchain/google-webauth Patch
@langchain/google-genai Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions bot added the ci label Mar 14, 2026
@afirstenberg
Copy link
Contributor

pawel-twardziak - Thank you for this!

This addresses approach 2 from #10297, right?
Do we know if this works on Gemini 2.5?

And do we have an integration test case that illustrates the issue?

@pawel-twardziak
Copy link
Contributor Author

hi, thanks for reviewing, Allen Firstenberg (@afirstenberg) !

as far as I understood - this actually implements approach 1 (sibling parts) from #10297 - inlineData/fileData are extracted as separate Part objects alongside functionResponse. This is the approach Antonio Pinto (@Toze) confirmed works across model versions in their local patch.

Regarding Gemini 2.5: yes, the sibling parts approach works on Gemini 2.5 (and 3). Since Content.parts is repeated Part, placing functionResponse and inlineData in separate Part objects is structurally valid per the protobuf definition.

For approach 2 (FunctionResponse.parts for Gemini 3+), that could be a follow-up PR - per your earlier suggestion of supporting "both" with a model version check. Would you like me to tackle that as well?

Regarding integration tests - yes, we can add one. The existing suite in index.int.test.ts already has both tool conversation tests (e.g. "function conversation") and image tests (using blue-square.png). A test for this fix would combine them:

  1. Define a tool (e.g. screenshot_tool)
  2. Build a message history with a ToolMessage containing [{type: "text", text: "screenshot"}, {type: "image_url", image_url: {url: "data:image/png;base64,..."}}] using the existing blue-square.png
  3. Invoke the model and assert the response mentions "blue" - proving the model visually interpreted the image rather than seeing serialized base64 text

@pawel-twardziak
Copy link
Contributor Author

the branch name is wrong 🤷‍♂️

@afirstenberg
Copy link
Contributor

pawel-twardziak

as far as I understood - this actually implements approach 1

Ah, ok. I hadn't had a chance to look closely at everything yet.

For approach 2 (FunctionResponse.parts for Gemini 3+), that could be a follow-up PR - per your earlier suggestion of supporting "both" with a model version check. Would you like me to tackle that as well?

That would be fantastic - thank you! A model check plus a way to override would be great.

I think you're right that doing it as a second PR would be best. Let's get this in there, but plan to have an update coming.

(This is also why I'd like to see an integration test.)

Regarding integration tests - yes, we can add one.

Very much appreciated. I didn't think it would be hard to do, but wanted to make sure I understood the problem before we fixed it. Also helps to make sure we dont' break things in the future.

If you can add the test, I can go run it in the pre-patch environment to give it a try.

Thanks again!

@pawel-twardziak
Copy link
Contributor Author

Alright, thanks for your input Allen Firstenberg (@afirstenberg)

the plan is:

  • within this PR I'll add some integration tests
  • the approach 2 will be addressed in the next follow-up pr

@pawel-twardziak
Copy link
Contributor Author

pawel-twardziak commented Mar 15, 2026

hi Allen Firstenberg (@afirstenberg) I just sent over the tests - the PR should be ready for your review

@pawel-twardziak
Copy link
Contributor Author

pawel-twardziak commented Mar 15, 2026

Blocked by #10415 - UPDATE: not valid any longer

@pawel-twardziak pawel-twardziak force-pushed the fix/langchain-google-missing-ci-tests branch from 956a8b3 to ee36c44 Compare March 15, 2026 23:50
@afirstenberg
Copy link
Contributor

I'm confused why this is blocked by #10415. Just for the tests? (Can we just move the tests to this one?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to look more closely at this, but here are some initial thoughts and comments.

});

test("function reply", async () => {
// Thinking models require thought_signature on functionCall parts,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this addition. Have we seen issues along these lines with this test? (I haven't before, I don't think.) And is this not a "real model invocation"?

Copy link
Contributor Author

@pawel-twardziak pawel-twardziak Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this issue:

RequestError: Function call is missing a thought_signature in functionCall parts. This is required for tools to work correctly, and missing thought_signature may lead to degraded model performance. Additional data, function call default_api:test , position 2. Please refer to https://ai.google.dev/gemini-api/docs/thought-signatures for more details.

This is a real model invocation

if (model.startsWith("gemini-2.0-flash")) {
return;
}
// Not available on Gemini 3 models

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is backwards compatibility which maps it to the Google Search tool. So while Google doesn't support it - we do.
If this test fails - then something is broken with the backwards compatibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

none of the tests fail - checked on local

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it fails when I remove it:

AssertionError: expected { tokenUsage: { …(5) }, …(11) } to have property "url_context_metadata"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake - I thought this was the search tool, not URL Context tool.
And... this used to pass.
https://ai.google.dev/gemini-api/docs/url-context says it is supported.

If this is failing - something else is going on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the limitations have something to do with it https://ai.google.dev/gemini-api/docs/url-context?hl=en#limitations ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gemini-3-flash-preview IS in the supported models list. So it should work but doesn't return url_context_metadata.

That's likely a preview model quirk - it's documented as supported but the response doesn't include the metadata consistently. I added a comment and a warning about it.


expect(result.text as string).toMatch(/(1 + 1 (equals|is|=) )?2.? ?/);
// With includeThoughts: true, response may have multiple parts (reasoning + text)
// Thought signatures are best-effort per Google docs; only assert when

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Citation? (It seems like we run the risk of a false failure if we only test for thought signatures when we have thought signatures.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will check it again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow this comment. Could you rephrase it please?

role = "function";
// Tool messages in Gemini are represented as function responses.
// Recent Gemini API versions require 'user' or 'model' role for all contents.
role = "user";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #10400

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will do

);
parts.push({
functionResponse: {
id: message.tool_call_id,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are cases where there is an ID that needs to be returned. This should not just be removed.
See #10292

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dozens of integration tests fail if I don't remove it. Will check it again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RequestError: Invalid JSON payload received. Unknown name "id" at 'contents[2].parts[1].function_response': Cannot find field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no id for Vertex AI https://docs.cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1/Content#FunctionResponse

and for Google AI it's optional and kinda useless https://ai.google.dev/api/caching?authuser=1&hl=pl#FunctionResponse

we have GooglePlatformType = "gai" | "gcp" so we could conditionally include id. But to do so, we'd need to thread platformType through the entire converter call chain - a fairly invasive change for a field that:

  • doesn't exist on Vertex AI at all
  • gets rejected by Gemini 3 preview models even on Google AI

The pragmatic call for now: just remove id. It's not usable on either platform in practice.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had use cases where ID was sent and, since it wasn't returned, it was rejected. I, honestly, have no idea what the situations were.

This is handled in #10292.

Copy link
Contributor Author

@pawel-twardziak pawel-twardziak Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so then we might need the conditional way of handling it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which #10292 does.

Copy link
Contributor Author

@pawel-twardziak pawel-twardziak Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho we should skip it for now since all tests are good without this id param - unless we know the exact case when it's required

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do. It's in #10292 🙂

@pawel-twardziak
Copy link
Contributor Author

Allen Firstenberg (@afirstenberg) thanks for you detailed review! Really appreciated 💪
fyi, all the integration tests pass in this PR's current state - I've checked it on local a few times

@pawel-twardziak
Copy link
Contributor Author

I'm confused why this is blocked by #10415. Just for the tests? (Can we just move the tests to this one?)

that PR was first fixing the "function -> user" change - but I don't see that change there now, strange... I must have mixed something up. Will add the change here in this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Images in ToolMessage content lost when converted to Gemini functionResponse

2 participants