Commit 6bdd39b

marmor7 and DouweM authored
Update docs on multi-modal file URLs being downloaded or sent directly (#3492)
Co-authored-by: Douwe Maan <[email protected]>
1 parent 9106868 commit 6bdd39b

1 file changed

+27
-10
lines changed

docs/input.md

Lines changed: 27 additions & 10 deletions
@@ -104,20 +104,37 @@ print(result.output)
 
 ## User-side download vs. direct file URL
 
-As a general rule, when you provide a URL using any of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI downloads the file content and then sends it as part of the API request.
+When you provide a URL using any of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI will typically send the URL directly to the model API so that the download happens on their side.
 
-The situation is different for certain models:
+Some model APIs do not support file URLs at all or for specific file types. In the following cases, Pydantic AI will download the file content and send it as part of the API request instead:
 
-- [`AnthropicModel`][pydantic_ai.models.anthropic.AnthropicModel]: if you provide a PDF document via `DocumentUrl`, the URL is sent directly in the API request, so no download happens on the user side.
+- [`BedrockModel`][pydantic_ai.models.bedrock.BedrockModel]: All URLs
+- [`OpenAIResponsesModel`][pydantic_ai.models.openai.OpenAIResponsesModel]: All URLs
+- [`AnthropicModel`][pydantic_ai.models.anthropic.AnthropicModel]: `DocumentUrl` with media type `text/plain`
+- [`OpenAIChatModel`][pydantic_ai.models.openai.OpenAIChatModel]: `AudioUrl` and `DocumentUrl`
+- [`GoogleModel`][pydantic_ai.models.google.GoogleModel] using GLA (Gemini Developer API): All URLs except YouTube video URLs and files uploaded to the [Files API](https://ai.google.dev/gemini-api/docs/files).
 
-- [`GoogleModel`][pydantic_ai.models.google.GoogleModel] on Vertex AI: any URL provided using `ImageUrl`, `AudioUrl`, `VideoUrl`, or `DocumentUrl` is sent as-is in the API request and no data is downloaded beforehand.
+If the model API supports file URLs but may not be able to download a file because of crawling or access restrictions, you can instruct Pydantic AI to download the file content and send that instead of the URL by enabling the `force_download` flag on the URL object. For example, [`GoogleModel`][pydantic_ai.models.google.GoogleModel] on Vertex AI limits YouTube video URLs to one URL per request.
 
-See the [Gemini API docs for Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#filedata) to learn more about supported URLs, formats and limitations:
+## Uploaded Files
 
-- Cloud Storage bucket URIs (with protocol `gs://`)
-- Public HTTP(S) URLs
-- Public YouTube video URL (maximum one URL per request)
+Some model providers like Google's Gemini API support [uploading files](https://ai.google.dev/gemini-api/docs/files). You can upload a file to the model API using the client you can get from the provider and use the resulting URL as input:
 
-However, because of crawling restrictions, it may happen that Gemini can't access certain URLs. In that case, you can instruct Pydantic AI to download the file content and send that instead of the URL by setting the boolean flag `force_download` to `True`. This attribute is available on all objects that inherit from [`FileUrl`][pydantic_ai.messages.FileUrl].
+```py {title="file_upload.py" test="skip"}
+from pydantic_ai import Agent, DocumentUrl
+from pydantic_ai.models.google import GoogleModel
+from pydantic_ai.providers.google import GoogleProvider
+
+provider = GoogleProvider()
+file = provider.client.files.upload(file='pydantic-ai-logo.png')
+assert file.uri is not None
 
-- [`GoogleModel`][pydantic_ai.models.google.GoogleModel] on GLA: YouTube video URLs are sent directly in the request to the model.
+agent = Agent(GoogleModel('gemini-2.5-flash', provider=provider))
+result = agent.run_sync(
+    [
+        'What company is this logo from?',
+        DocumentUrl(url=file.uri, media_type=file.mime_type),
+    ]
+)
+print(result.output)
+```
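
The download rules added to `docs/input.md` in this commit can be read as a small decision table. The following is a minimal sketch of that table under stated assumptions, not Pydantic AI's implementation: `FileUrlSketch`, `is_downloaded`, the string labels used for models, and the hostname checks for YouTube and Files API URLs are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Assumed hostnames, for illustration only; the docs do not specify how
# YouTube or Files API URLs are recognized.
YOUTUBE_HOSTS = {'www.youtube.com', 'youtube.com', 'youtu.be'}
FILES_API_HOST = 'generativelanguage.googleapis.com'


@dataclass
class FileUrlSketch:
    """Hypothetical stand-in for ImageUrl / AudioUrl / VideoUrl / DocumentUrl."""

    kind: str  # 'ImageUrl', 'AudioUrl', 'VideoUrl' or 'DocumentUrl'
    url: str
    media_type: str = ''
    force_download: bool = False


def is_downloaded(model: str, file: FileUrlSketch) -> bool:
    """Return True if, per the documented rules, the content is downloaded user-side."""
    if file.force_download:
        # The flag overrides the default of sending the URL directly.
        return True
    if model in ('BedrockModel', 'OpenAIResponsesModel'):
        return True  # all URLs are downloaded
    if model == 'AnthropicModel':
        return file.kind == 'DocumentUrl' and file.media_type == 'text/plain'
    if model == 'OpenAIChatModel':
        return file.kind in ('AudioUrl', 'DocumentUrl')
    if model == 'GoogleModel (GLA)':
        host = urlparse(file.url).hostname or ''
        is_youtube = file.kind == 'VideoUrl' and host in YOUTUBE_HOSTS
        is_uploaded = host == FILES_API_HOST
        return not (is_youtube or is_uploaded)
    # Any other model: the URL is sent directly to the model API.
    return False
```

With this sketch, a PDF `DocumentUrl` passed to `'AnthropicModel'` is sent as a URL, while the same object with `force_download=True` is downloaded first regardless of the model.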
