fix(google): strip media from functionResponse, preserve as sibling parts #10417
pawel-twardziak wants to merge 10 commits into langchain-ai:main
Conversation
🦋 Changeset detected. Latest commit: 29ebf67. The changes in this PR will be included in the next version bump. This PR includes changesets to release 6 packages.
pawel-twardziak - Thank you for this! This addresses approach 2 from #10297, right? And do we have an integration test case that illustrates the issue?
hi, thanks for reviewing, Allen Firstenberg (@afirstenberg)! As far as I understand, this actually implements approach 1 (sibling parts) from #10297: inlineData/fileData are extracted as separate Part objects alongside functionResponse. This is the approach Antonio Pinto (@Toze) confirmed works across model versions in their local patch.

Regarding Gemini 2.5: yes, the sibling parts approach works on Gemini 2.5 (and 3). Since Content.parts is repeated Part, placing functionResponse and inlineData in separate Part objects is structurally valid per the protobuf definition.

For approach 2 (FunctionResponse.parts for Gemini 3+), that could be a follow-up PR, per your earlier suggestion of supporting "both" with a model version check. Would you like me to tackle that as well?

Regarding integration tests: yes, we can add one. The existing suite in index.int.test.ts already has both tool conversation tests (e.g. "function conversation") and image tests (using blue-square.png). A test for this fix would combine them.
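A hedged sketch of the payload such a combined test could send (plain object literals standing in for LangChain's AIMessage/ToolMessage classes; names, ids, and the base64 payload are all made up for illustration):

```typescript
// Sketch only: the base64 string is a stand-in for the real
// blue-square.png bytes used by the existing image tests.
const fakePng = "QUJDRA=="; // not a real PNG, just illustrative base64

// The model asks for an image via a tool call...
const aiToolCall = {
  type: "ai",
  tool_calls: [{ id: "call_1", name: "getImage", args: { color: "blue" } }],
};

// ...and the tool replies with text plus an image_url content block.
const toolReply = {
  type: "tool",
  tool_call_id: "call_1",
  content: [
    { type: "text", text: "Here is the image you asked for." },
    {
      type: "image_url",
      image_url: { url: `data:image/png;base64,${fakePng}` },
    },
  ],
};

// Before the fix, the image block was serialized into the JSON string inside
// functionResponse.response.result; after the fix it should surface as a
// sibling inlineData part that the model can actually see.
console.log(toolReply.content.length); // 2 content blocks in the tool result
```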
the branch name is wrong 🤷‍♂️
Ah, ok. I hadn't had a chance to look closely at everything yet.
That would be fantastic - thank you! A model check plus a way to override would be great. I think you're right that doing it as a second PR would be best. Let's get this in there, but plan to have an update coming. (This is also why I'd like to see an integration test.)
Very much appreciated. I didn't think it would be hard to do, but wanted to make sure I understood the problem before we fixed it. Also helps to make sure we don't break things in the future. If you can add the test, I can go run it in the pre-patch environment to give it a try. Thanks again!
Alright, thanks for your input Allen Firstenberg (@afirstenberg)! The plan is:

1. This PR: extract media as sibling parts, plus an integration test.
2. Follow-up PR: FunctionResponse.parts for Gemini 3+, with a model version check and an override.
hi Allen Firstenberg (@afirstenberg) I just sent over the tests - the PR should be ready for your review
Blocked by #10415 - UPDATE: no longer valid
force-pushed from 956a8b3 to ee36c44
I'm confused why this is blocked by #10415. Just for the tests? (Can we just move the tests to this one?)
Allen Firstenberg (@afirstenberg) left a comment:
I need to look more closely at this, but here are some initial thoughts and comments.
```ts
});

test("function reply", async () => {
  // Thinking models require thought_signature on functionCall parts,
```
I'm not sure I understand this addition. Have we seen issues along these lines with this test? (I haven't before, I don't think.) And is this not a "real model invocation"?
Yes, this issue:

```
RequestError: Function call is missing a thought_signature in functionCall parts. This is required for tools to work correctly, and missing thought_signature may lead to degraded model performance. Additional data, function call default_api:test, position 2. Please refer to https://ai.google.dev/gemini-api/docs/thought-signatures for more details.
```
This is a real model invocation
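For reference, a hedged sketch of echoing signatures back when replaying history. The `thoughtSignature` field and the echo-it-back requirement come from the Gemini thought-signatures docs linked above; the Part shape and helper below are illustrative, not the library's actual converter:

```typescript
// Illustrative Part shape (assumption: thoughtSignature rides alongside
// functionCall, per the Gemini docs on thought signatures).
interface Part {
  functionCall?: { name: string; args: Record<string, unknown> };
  functionResponse?: { name: string; response: Record<string, unknown> };
  thoughtSignature?: string;
}

// Copy thought signatures from the model's original turn onto the parts we
// replay in history, matching functionCall parts by name.
function echoThoughtSignatures(original: Part[], replayed: Part[]): Part[] {
  return replayed.map((part) => {
    if (!part.functionCall) return part;
    const source = original.find(
      (p) =>
        p.functionCall?.name === part.functionCall?.name && p.thoughtSignature
    );
    return source
      ? { ...part, thoughtSignature: source.thoughtSignature }
      : part;
  });
}
```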
```ts
if (model.startsWith("gemini-2.0-flash")) {
  return;
}
// Not available on Gemini 3 models
```
There is backwards compatibility which maps it to the Google Search tool. So while Google doesn't support it - we do.
If this test fails - then something is broken with the backwards compatibility.
none of the tests fail (checked locally)
yes, it fails when I remove it:

```
AssertionError: expected { tokenUsage: { …(5) }, …(11) } to have property "url_context_metadata"
```
My mistake - I thought this was the search tool, not URL Context tool.
And... this used to pass.
https://ai.google.dev/gemini-api/docs/url-context says it is supported.
If this is failing - something else is going on.
could the limitations have something to do with it? https://ai.google.dev/gemini-api/docs/url-context?hl=en#limitations
gemini-3-flash-preview IS in the supported models list. So it should work but doesn't return url_context_metadata.
That's likely a preview model quirk - it's documented as supported but the response doesn't include the metadata consistently. I added a comment and a warning about it.
libs/providers/langchain-google/src/chat_models/tests/index.int.test.ts (outdated; thread resolved)
```ts
expect(result.text as string).toMatch(/(1 \+ 1 (equals|is|=) )?2.? ?/);
// With includeThoughts: true, response may have multiple parts (reasoning + text)
// Thought signatures are best-effort per Google docs; only assert when
```
Citation? (It seems like we run the risk of a false failure if we only test for thought signatures when we have thought signatures.)
will check it again
I'm not sure I follow this comment. Could you rephrase it please?
```diff
- role = "function";
+ // Tool messages in Gemini are represented as function responses.
+ // Recent Gemini API versions require 'user' or 'model' role for all contents.
+ role = "user";
```
ok, will do
```diff
  );
  parts.push({
    functionResponse: {
-     id: message.tool_call_id,
```
There are cases where there is an ID that needs to be returned. This should not just be removed.
See #10292
dozens of integration tests fail if I don't remove it. Will check it again
RequestError: Invalid JSON payload received. Unknown name "id" at 'contents[2].parts[1].function_response': Cannot find field.
there is no id for Vertex AI (https://docs.cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1/Content#FunctionResponse), and for Google AI it's optional and kinda useless (https://ai.google.dev/api/caching?authuser=1&hl=pl#FunctionResponse).

We have `GooglePlatformType = "gai" | "gcp"`, so we could conditionally include id. But to do so, we'd need to thread platformType through the entire converter call chain, a fairly invasive change for a field that:
- doesn't exist on Vertex AI at all
- gets rejected by Gemini 3 preview models even on Google AI

The pragmatic call for now: just remove id. It's not usable on either platform in practice.
We had use cases where ID was sent and, since it wasn't returned, it was rejected. I, honestly, have no idea what the situations were.
This is handled in #10292.
so then we might need the conditional way of handling it
imho we should skip it for now since all tests pass without this id param, unless we know the exact case where it's required
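If we did go the conditional route, a minimal sketch could look like this (hypothetical helper; `GooglePlatformType` is the existing union type, the rest of the shapes are illustrative):

```typescript
type GooglePlatformType = "gai" | "gcp";

// Illustrative shape of a functionResponse part; `id` is optional on
// Google AI, absent on Vertex AI, and unknown fields get rejected.
interface FunctionResponsePart {
  functionResponse: {
    id?: string;
    name: string;
    response: { result: unknown };
  };
}

// Platform-gated builder (hypothetical): only attach the id for "gai".
function buildFunctionResponsePart(
  name: string,
  result: unknown,
  platform: GooglePlatformType,
  toolCallId?: string
): FunctionResponsePart {
  const functionResponse: FunctionResponsePart["functionResponse"] = {
    name,
    response: { result },
  };
  if (platform === "gai" && toolCallId) {
    functionResponse.id = toolCallId; // only Google AI understands this field
  }
  return { functionResponse };
}
```

The cost mentioned above still applies: platform would have to be threaded through the whole converter call chain to reach this point.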
Allen Firstenberg (@afirstenberg) thanks for your detailed review! Really appreciated 💪
that PR was originally fixing the "function -> user" change, but I don't see that change there now, strange... I must have mixed something up. Will add the change here in this PR
When a ToolMessage contains multimodal content (text + images via image_url blocks), the converter serializes inlineData/fileData inside functionResponse.response.result as JSON text. Gemini cannot interpret images nested inside a JSON string - they become invisible to the model.
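Roughly, the intended transformation looks like this (hedged sketch with a hypothetical helper name; the real change lives in both the legacy and v1 converter paths). Text blocks stay inside functionResponse.response.result; media blocks become separate inlineData parts in the same Content.parts array, where the model can see them:

```typescript
// Illustrative content-block and part shapes, simplified from the real ones.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface GeminiPart {
  functionResponse?: { name: string; response: { result: string } };
  inlineData?: { mimeType: string; data: string };
}

function toolMessageToParts(
  name: string,
  blocks: ContentBlock[]
): GeminiPart[] {
  const texts: string[] = [];
  const media: GeminiPart[] = [];
  for (const block of blocks) {
    if (block.type === "text") {
      texts.push(block.text);
    } else {
      // expects a data URL: "data:<mime>;base64,<data>"
      const match = /^data:([^;]+);base64,(.*)$/.exec(block.image_url.url);
      if (match) {
        media.push({ inlineData: { mimeType: match[1], data: match[2] } });
      }
    }
  }
  return [
    { functionResponse: { name, response: { result: texts.join("\n") } } },
    ...media, // siblings of functionResponse, not nested inside it
  ];
}
```

With this shape the functionResponse payload stays JSON-only, and the image reaches the model through the normal inlineData path instead of being buried in a serialized string.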
This PR fixes both the legacy and v1 converter paths to strip inlineData/fileData out of functionResponse.response.result and preserve them as sibling Part objects, so the media stays visible to the model.
Also adds @langchain/google to the CI unit test matrix - it was missing from the PACKAGES list in unit-tests-integrations.yml, so its tests were never run in CI.
#10415 - that PR was expected to be merged first, since it fixed some tests that previously couldn't even run in CI/CD. UPDATE: this is no longer valid
Fixes #10297