Add basic support for UploadedFile UserContent #2611
base: main
d6f2bb3 to
4fd4a88
Compare
This wraps an opaque reference to a provider-specific representation of an uploaded file.
2917ac8 to
8a837b5
Compare
8a837b5 to
8140f52
Compare
8140f52 to
0d6e486
Compare
@DouweM CI still failing on code coverage. I will fix it, but first I'd love some feedback on the API. LMK if you agree with the choices or if I should make some adjustments!
DouweM
left a comment
@tarruda I just noticed I never submitted the review I did many weeks ago 🤦🏻
provider = OpenAIProvider(api_key=openai_api_key)
m = OpenAIModel('gpt-4o', provider=provider)
# VCR recording breaks when dealing with openai file upload request due to
# binary contents. For that reason, we have manually run once the upload
Binary appears to be supported: https://github.com/kevin1024/vcrpy/blob/d50f3385a6828280def801ac7f544fe04a37e39c/tests/unit/test_json_serializer.py#L7
Can you share the error you were seeing?
Here's what I get when I uncomment the code to upload on the google test (with vcr enabled):
| AssertionError
+---------------- 2 ----------------
| Traceback (most recent call last):
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2266, in run_test
| self.get_loop().run_until_complete(
| File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
| return future.result()
| ^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2226, in _call_in_runner_task
| return await future
| ^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2193, in _run_tests_and_fixtures
| retval = await coro
| ^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/tests/models/test_google.py", line 2787, in test_uploaded_file_input
| google_file = client.files.upload(
| ^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/google/genai/files.py", line 484, in upload
| return_file = self._api_client.upload_file(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/google/genai/_api_client.py", line 1438, in upload_file
| return self._upload_fd(
| ^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/google/genai/_api_client.py", line 1500, in _upload_fd
| response = self._httpx_client.request(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/httpx/_client.py", line 825, in request
| return self.send(request, auth=auth, follow_redirects=follow_redirects)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
| response = self._send_handling_auth(
| ^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
| response = self._send_handling_redirects(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
| response = self._send_single_request(request)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/vcr/stubs/httpx_stubs.py", line 200, in _inner_send
| return _sync_vcr_send(cassette, real_send, *args, **kwargs)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/vcr/stubs/httpx_stubs.py", line 186, in _sync_vcr_send
| vcr_request, response = _shared_vcr_send(cassette, real_send, *args, **kwargs)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/vcr/stubs/httpx_stubs.py", line 117, in _shared_vcr_send
| vcr_request = _make_vcr_request(real_request, **kwargs)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/home/thiago/code/pydantic-ai/.venv/lib/python3.12/site-packages/vcr/stubs/httpx_stubs.py", line 108, in _make_vcr_request
| body = httpx_request.read().decode("utf-8")
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9c in position 72: invalid start byte
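
The last frame of the traceback shows the request body being decoded as UTF-8, which cannot succeed for a binary upload payload. A minimal sketch that reproduces the same error outside of vcrpy follows. The `body_to_text` helper is hypothetical (it is not part of vcrpy) and only illustrates one possible fallback, base64-encoding bodies that are not valid UTF-8:

```python
import base64


def body_to_text(body: bytes) -> str:
    """Hypothetical helper (not vcrpy API): decode as UTF-8, else fall back to base64."""
    try:
        return body.decode('utf-8')
    except UnicodeDecodeError:
        return 'base64:' + base64.b64encode(body).decode('ascii')


# 72 ASCII bytes followed by 0x9c, a continuation byte that cannot start a UTF-8 sequence.
binary_body = b'a' * 72 + b'\x9c'

error = None
try:
    binary_body.decode('utf-8')
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; `exc` is unbound after the except block
    print(exc)

# The fallback keeps text bodies readable and makes binary bodies serializable.
print(body_to_text(b'hello'))
print(body_to_text(binary_body)[:7])
```

Whether such a fallback belongs in the recording layer or in a test-side request filter is a separate design question; the sketch only shows why the plain `decode("utf-8")` call fails.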
This PR is stale, and will be closed in 3 days if no reply is received.
Closing this PR as it has been inactive for 10 days.
@tarruda Thank you, reopened!
- map UploadedFile to provider-friendly structures: Google uses file_data parts; OpenAI accepts file IDs or objects with ids
- document provider expectations for UploadedFile in code and input docs
- add tests and cassette adjustments to cover file ID/URI handling for OpenAI and Google

- add an `UploadedFilePart` schema and emit uploaded-file metadata in OTEL user prompt parts, including file references when allowed
- derive stable identifiers for `UploadedFile` objects with optional overrides for clearer telemetry
- silence the pyright private-usage warning in the Google uploaded file test
@DouweM I merged main (lots of conflicts 😓). Also addressed your review!
| #> The document discusses... | ||
| ``` | ||
|
|
||
| ## Uploaded files |
I just merged #3492 which (among other things) added an Uploaded Files section as well :)
Can you merge main and update that example to use the UploadedFile object? Keeping the section above the "user-side ..." section makes sense to me.
|
|
||
| ## Uploaded files | ||
|
|
||
| Use [`UploadedFile`][pydantic_ai.UploadedFile] when you've already uploaded content to the model provider. |
Related to the above, let's include examples of how to do that for all providers
|
|
||
| - [`OpenAIChatModel`][pydantic_ai.models.openai.OpenAIChatModel] and [`OpenAIResponsesModel`][pydantic_ai.models.openai.OpenAIResponsesModel] accept an `openai.types.FileObject` or a file ID string returned by the OpenAI Files API. | ||
| - [`GoogleModel`][pydantic_ai.models.google.GoogleModel] accepts a `google.genai.types.File` or a file URI string from the Gemini Files API. | ||
| - Other models currently raise `NotImplementedError` when they receive an `UploadedFile`. |
Let's support Anthropic as well: https://platform.claude.com/docs/en/build-with-claude/files
Does anthropic provide a client-side SDK for this? In the link I only see it being done with http requests.
@tarruda It's not super discoverable, but all of those code samples have a "Shell" dropdown that also has a "Python" option. So yes there's an SDK for uploading files, and their objects for passing file URLs and binary data also have a file_id field that maps to the ID returned by the file upload SDK.
| - [`GoogleModel`][pydantic_ai.models.google.GoogleModel] accepts a `google.genai.types.File` or a file URI string from the Gemini Files API. | ||
| - Other models currently raise `NotImplementedError` when they receive an `UploadedFile`. | ||
|
|
||
| ```py {title="uploaded_file_input.py" test="skip" lint="skip"} |
Please don't skip linting
result = agent.run_sync(
    [
        'Give me a short description of this image',
        UploadedFile(file='file-abc123'),  # file-abc123 is a file ID returned by the provider
Let's update the example to be more "real"
Can you elaborate?
I just meant that we can actually show the code for uploading a file using the provider SDK, and then passing in the return object/ID here instead of a fake ID
if isinstance(file, File):
    file_uri = file.uri
    mime_type = file.mime_type
    display_name = getattr(file, 'display_name', None)
Why getattr instead of a regular attr read?
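
For illustration, a sketch of the same mapping using plain attribute reads. It assumes the reference is either a URI string or a `File` object; the `File` dataclass below is a stand-in for `google.genai.types.File` reduced to three fields, and `map_google_uploaded_file` and the dictionary shape it returns are hypothetical, not the PR's actual code:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class File:
    """Stand-in for google.genai.types.File, reduced to the fields used here."""

    uri: Optional[str] = None
    mime_type: Optional[str] = None
    display_name: Optional[str] = None


def map_google_uploaded_file(file: object) -> Dict[str, Any]:
    """Hypothetical mapping of an uploaded-file reference to a file_data part."""
    if isinstance(file, File):
        # Declared fields can be read directly; no getattr is needed.
        if file.uri is None:
            raise ValueError('uploaded File has no uri')
        file_data: Dict[str, Any] = {'file_uri': file.uri}
        if file.mime_type is not None:
            file_data['mime_type'] = file.mime_type
        if file.display_name is not None:
            file_data['display_name'] = file.display_name
        return {'file_data': file_data}
    if isinstance(file, str):
        # A bare string is treated as a file URI.
        return {'file_data': {'file_uri': file}}
    raise TypeError(f'unsupported uploaded file reference: {type(file).__name__}')


print(map_google_uploaded_file(File(uri='https://example.com/files/abc', mime_type='application/pdf')))
```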
)


def _map_uploaded_file(uploaded_file: UploadedFile, _provider: Provider[Any]) -> str:
Doesn't look like we need the provider?
if isinstance(file, FileObject):
    return file.id

file_id = getattr(file, 'id', None)
I don't think we need to support arbitrary objects with an id; rather just the types allowed on the future OpenAIUploadedFile: str and FileObject
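
A sketch of what that narrowing could look like, accepting only `str` and `FileObject` and dropping the unused provider argument. The `FileObject` dataclass is a stand-in for `openai.types.FileObject` reduced to its `id` field, and `map_openai_uploaded_file` is a hypothetical name, so treat this as an illustration under those assumptions rather than the final implementation:

```python
from dataclasses import dataclass


@dataclass
class FileObject:
    """Stand-in for openai.types.FileObject, reduced to the `id` field."""

    id: str


def map_openai_uploaded_file(file: object) -> str:
    """Hypothetical mapping: return the file ID for the two supported reference types."""
    if isinstance(file, str):
        # A bare string is already a file ID returned by the Files API.
        return file
    if isinstance(file, FileObject):
        return file.id
    # Arbitrary objects that merely have an `id` attribute are rejected.
    raise TypeError(f'unsupported uploaded file reference: {type(file).__name__}')


print(map_openai_uploaded_file('file-abc123'))
print(map_openai_uploaded_file(FileObject(id='file-xyz789')))
```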
async def test_uploaded_file_input(allow_model_requests: None, google_provider: GoogleProvider):
    m = GoogleModel('gemini-2.5-flash', provider=google_provider)
    # VCR recording breaks when dealing with openai file upload request due to
    # binary contents. For that reason, we have manually run once the upload
Can you try if this has been fixed? I think we have some VCRs containing binary files already
Unless it has been fixed in the latest main merge today, I'm certain the bug is still present. I tried after fixing the conflicts.
This is very easy to reproduce locally:
- Uncomment the block which uploads the file
- Run:
uv run pytest tests/models/test_google.py -k uploaded_file_input --record-mode=all
Please move to test_messages for consistency
This PR is stale, and will be closed in 3 days if no reply is received.
@DouweM here's some initial support for #2574

The `UploadedFile` user content simply wraps an opaque reference to the return value of a provider-specific file upload API, which is then validated in the corresponding model's `_map_user_prompt`.

I've only added support for Google and OpenAI, but I believe the API should be flexible enough to add support for other providers. I started working on Anthropic, but decided to leave it out for now as the official SDK doesn't support this feature yet, and I was having trouble referencing it using the SDK data objects.

I've opted not to implement a `Provider.upload_file` abstraction, as the options can differ across providers and I would need to get more familiar with pydantic-ai before feeling confident enough to design a proper abstraction (can follow up with another PR later!).

One caveat with the tests: the VCR framework apparently doesn't support requests containing binary content, so I had to turn it off for uploading files. This is how I proceeded to add the tests:

Since this is just a recording and we are only verifying that we can reference an uploaded file, it probably doesn't matter much that we are not actually running the upload request for now. This can be changed later when VCR is fixed to support this type of request.

Close #2574