feat: native PDF support for OpenAI-compatible endpoints #2190
tessaherself wants to merge 3 commits into huggingface:main from
Allow models to declare which file MIME types they accept (e.g. "application/pdf", "image/*") via a new `acceptedFileMimetypes` field. The frontend merges these with existing multimodal MIME types to determine which upload options to show. This enables per-model file type support without coupling to a specific provider's file handling implementation. Refs: huggingface#482, huggingface#609, huggingface#1505, huggingface#1652 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
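The merge described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `mergeMimeAllowlists` and `isMimeAccepted` are hypothetical helper names, and the wildcard handling (`image/*`) is an assumption based on the example patterns in the commit message.

```typescript
// Hypothetical helpers; the PR's actual frontend implementation is not shown here.

// Combine the existing multimodal MIME types with the model's
// `acceptedFileMimetypes`, dropping duplicates while keeping order.
function mergeMimeAllowlists(
  multimodalMimes: string[],
  acceptedFileMimetypes: string[] = []
): string[] {
  return multimodalMimes
    .concat(acceptedFileMimetypes)
    .filter((mime, index, all) => all.indexOf(mime) === index);
}

// Check a concrete MIME type (e.g. "image/png") against an allowlist whose
// entries may use "*" as the type or subtype (e.g. "image/*").
function isMimeAccepted(mime: string, allowlist: string[]): boolean {
  const [type, subtype] = mime.toLowerCase().split("/");
  return allowlist.some((pattern) => {
    const [pType, pSubtype] = pattern.toLowerCase().split("/");
    return (
      (pType === "*" || pType === type) &&
      (pSubtype === "*" || pSubtype === subtype)
    );
  });
}

const allowlist = mergeMimeAllowlists(["image/*"], ["application/pdf", "image/*"]);
console.log(allowlist);
console.log(isMimeAccepted("image/png", allowlist));
console.log(isMimeAccepted("text/plain", allowlist));
```

Keeping the match logic independent of any provider is in line with the stated goal of not coupling the upload UI to a specific file handling implementation.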
When a model declares `acceptedFileMimetypes` including "application/pdf", PDFs are now sent as OpenAI `file` content parts with base64 data instead of being wrapped in XML tags. This lets OpenAI (and compatible providers) process PDFs natively. Non-PDF files and models without `acceptedFileMimetypes` are unaffected — existing text extraction behavior is preserved. Depends on: huggingface#2189 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
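As a rough sketch of what this commit describes, a PDF could be turned into a `file` content part like this. The local `FileContentPart` type mirrors the shape named in the PR (`type: "file"` with a base64 data URI in `file_data`); `UploadedFile` and `toNativePdfPart` are hypothetical names, and returning `null` to signal "use the existing text extraction path" is an assumption for illustration.

```typescript
// Local stand-in for the OpenAI `file` content part shape described in the PR.
interface FileContentPart {
  type: "file";
  file: { filename?: string; file_data: string };
}

// Hypothetical shape of an uploaded file after base64 encoding.
interface UploadedFile {
  name: string;
  mime: string;
  base64: string;
}

// Return a native `file` part for PDFs when the model opts in,
// or null so the caller can fall back to the existing behavior.
function toNativePdfPart(
  file: UploadedFile,
  acceptedFileMimetypes: string[] = []
): FileContentPart | null {
  const modelAcceptsPdf = acceptedFileMimetypes.indexOf("application/pdf") !== -1;
  if (file.mime !== "application/pdf" || !modelAcceptsPdf) {
    return null;
  }
  return {
    type: "file",
    file: {
      filename: file.name,
      file_data: `data:${file.mime};base64,${file.base64}`,
    },
  };
}

const part = toNativePdfPart(
  { name: "report.pdf", mime: "application/pdf", base64: "JVBERi0=" },
  ["application/pdf"]
);
console.log(JSON.stringify(part));
```

Models that do not declare `acceptedFileMimetypes` get `null` here, which matches the claim that their behavior is unchanged.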
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee696c19ce
  imageProcessor,
- mmEnabled
+ mmEnabled,
+ model.acceptedFileMimetypes
Use target model file allowlist in MCP message prep
After resolveRouterTarget picks targetModel, the MCP request is built against that routed model, but prepareMessagesWithFiles still receives model.acceptedFileMimetypes from the original model. In routed conversations where these differ, file parts are prepared with the wrong MIME policy: PDFs can be omitted when the routed model supports them, or sent when the routed model does not, which can produce incorrect behavior or upstream 4xx errors during tool-calling flows.
After router resolution picks a target model, use its file MIME policy instead of the original model's when preparing file content parts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
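The fix can be pictured with a small sketch. `ModelLike` and `fileMimePolicyFor` are hypothetical names introduced for illustration; only `acceptedFileMimetypes` and the idea of a routed target model come from the review thread.

```typescript
// Hypothetical minimal model shape; only `acceptedFileMimetypes` is from the PR.
interface ModelLike {
  name: string;
  acceptedFileMimetypes?: string[];
}

// Pick the MIME allowlist of the model that will actually receive the request.
// When a router target exists, its policy wins even if it is empty, so a PDF
// is never sent natively to a routed model that did not opt in.
function fileMimePolicyFor(model: ModelLike, targetModel?: ModelLike): string[] {
  const effective = targetModel ?? model;
  return effective.acceptedFileMimetypes ?? [];
}

const routerModel: ModelLike = { name: "router", acceptedFileMimetypes: ["application/pdf"] };
const routedTarget: ModelLike = { name: "text-only-target" };

console.log(fileMimePolicyFor(routerModel));
console.log(fileMimePolicyFor(routerModel, routedTarget));
```

Note that the sketch deliberately does not fall back to the original model's list when the target declares none, since that fallback is the mismatch the review comment flags.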
Summary
- When a model declares `acceptedFileMimetypes: ["application/pdf"]`, PDFs are sent as native OpenAI `file` content parts instead of XML-wrapped base64 text
- Uses the `ChatCompletionContentPart.File` type with `file_data` (base64 data URI)
- Models without `acceptedFileMimetypes` are completely unaffected

Builds on #2189 (adds the `acceptedFileMimetypes` field to model config)

Example config:
`{ "name": "gpt-4o", "multimodal": true, "acceptedFileMimetypes": ["application/pdf"] }`

Related: #2188
Test plan
- With `acceptedFileMimetypes: ["application/pdf"]`: verify the API request uses `type: "file"` content parts
- Without `acceptedFileMimetypes`: verify it falls through (no file content part, no crash)
- Verify `image_url` handling still works

🤖 Generated with Claude Code