Based on the latest documentation and code, the @langchain/google-genai library currently supports text and image (multimodal) inputs via Gemini models, such as gemini-pro and gemini-pro-vision. You can provide images as part of a single human message, and there are examples for this in both the docs and the README.
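
As a rough illustration of that pattern, here is a minimal sketch, assuming a recent `@langchain/google-genai` release where the constructor accepts `model` (older releases used `modelName`); the file path and prompt are placeholders:

```typescript
import * as fs from "node:fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";

// Requires GOOGLE_API_KEY to be set in the environment.
const model = new ChatGoogleGenerativeAI({ model: "gemini-pro-vision" });

// Encode a local image as a base64 data URL and attach it to a single
// human message alongside the text prompt.
const imageBase64 = fs.readFileSync("./example.png").toString("base64");
const message = new HumanMessage({
  content: [
    { type: "text", text: "Describe this image." },
    { type: "image_url", image_url: `data:image/png;base64,${imageBase64}` },
  ],
});

const res = await model.invoke([message]);
console.log(res.content);
```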

For video understanding, there are examples in the LangChain JS Docs showing how to process video files (e.g., MP4) by converting them to base64 and sending them to the Gemini API for structured output. However, these examples use the @langchain/google-vertexai package, not @langchain/google-genai directly. See the media use case documentation for details.
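
For reference, here is a hedged sketch of that Vertex AI video flow: the model name, file path, and prompt are illustrative, and the `media` content block follows the shape shown in the media use case docs:

```typescript
import * as fs from "node:fs";
import { ChatVertexAI } from "@langchain/google-vertexai";
import { HumanMessage } from "@langchain/core/messages";

// Assumes Google Cloud credentials are configured (e.g. via
// GOOGLE_APPLICATION_CREDENTIALS). The model name is illustrative.
const model = new ChatVertexAI({ model: "gemini-1.5-pro" });

// Read an MP4 file and send it as a base64 "media" content block
// together with the text instruction.
const videoBase64 = fs.readFileSync("./clip.mp4").toString("base64");
const message = new HumanMessage({
  content: [
    { type: "text", text: "Summarize this video in three bullet points." },
    { type: "media", mimeType: "video/mp4", data: videoBase64 },
  ],
});

const res = await model.invoke([message]);
console.log(res.content);
```

The docs example additionally pipes the model through `withStructuredOutput` with a schema to get typed results; the plain `invoke` call above keeps the sketch self-contained.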

Regarding the Files…
