feat: native PDF support for OpenAI-compatible endpoints #2190

Open
tessaherself wants to merge 3 commits into huggingface:main from tessaherself:feat/openai-native-pdf
@tessaherself

Summary

  • When a model declares acceptedFileMimetypes: ["application/pdf"], PDFs are sent as native OpenAI file content parts instead of XML-wrapped base64 text
  • Uses the OpenAI SDK's ChatCompletionContentPart.File type with file_data (base64 data URI)
  • Also wired through the MCP flow so tool-calling conversations get the same native PDF handling
  • Non-PDF files and models without acceptedFileMimetypes are completely unaffected
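
A minimal sketch of the conversion described in the bullets above, not the PR's actual code. The `FileContentPart` type is a local mirror of the SDK's `ChatCompletionContentPart.File` shape so the example runs standalone; `UploadedFile` and `toNativePdfPart` are hypothetical names introduced for illustration.

```typescript
// Local mirror of the file content part shape from the OpenAI SDK
// (ChatCompletionContentPart.File), declared here so the sketch runs standalone.
type FileContentPart = {
  type: "file";
  file: { filename?: string; file_data?: string };
};

// Hypothetical upload shape; the real type in this codebase may differ.
interface UploadedFile {
  name: string;
  mime: string;
  base64: string; // file bytes, already base64-encoded
}

// Returns a native file part for PDFs the model accepts, or undefined so the
// caller can fall through to its existing handling.
function toNativePdfPart(
  file: UploadedFile,
  acceptedFileMimetypes: string[] = []
): FileContentPart | undefined {
  const accepted = acceptedFileMimetypes.includes("application/pdf");
  if (file.mime !== "application/pdf" || !accepted) return undefined;
  return {
    type: "file",
    file: {
      filename: file.name,
      // file_data carries the PDF as a base64 data URI.
      file_data: `data:${file.mime};base64,${file.base64}`,
    },
  };
}
```

Returning `undefined` rather than throwing keeps non-PDF files and models without `acceptedFileMimetypes` on their existing code path.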

Builds on #2189 (adds the acceptedFileMimetypes field to model config)

Example config:

{
  "name": "gpt-4o",
  "multimodal": true,
  "acceptedFileMimetypes": ["application/pdf"]
}

Related: #2188

Test plan

  • Upload a PDF with a model that has acceptedFileMimetypes: ["application/pdf"] — verify the API request uses type: "file" content parts
  • Upload a PDF with a model that does NOT have acceptedFileMimetypes — verify it falls through (no file content part, no crash)
  • Upload a text file — verify existing XML document injection still works
  • Upload an image — verify existing image_url handling still works

🤖 Generated with Claude Code

tessaherself and others added 2 commits March 16, 2026 21:53
Allow models to declare which file MIME types they accept (e.g.
"application/pdf", "image/*") via a new `acceptedFileMimetypes` field.
The frontend merges these with existing multimodal MIME types to
determine which upload options to show.

This enables per-model file type support without coupling to a
specific provider's file handling implementation.

Refs: huggingface#482, huggingface#609, huggingface#1505, huggingface#1652

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
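
One way the merge and wildcard matching described in the commit message above could look, as a sketch under assumptions: `mergeAcceptedMimetypes` and `mimeMatches` are hypothetical helper names, and treating `image/*` as a prefix wildcard is an assumption about how such patterns are interpreted, not something the commit states.

```typescript
// Combine the existing multimodal MIME types with a model's
// acceptedFileMimetypes, dropping duplicates while preserving order.
function mergeAcceptedMimetypes(
  multimodalMimes: string[],
  acceptedFileMimetypes: string[] = []
): string[] {
  return Array.from(new Set([...multimodalMimes, ...acceptedFileMimetypes]));
}

// Check a concrete MIME type against a pattern such as "image/*"
// or an exact type such as "application/pdf".
function mimeMatches(pattern: string, mime: string): boolean {
  if (pattern.endsWith("/*")) {
    // "image/*" matches anything that starts with "image/".
    return mime.startsWith(pattern.slice(0, -1));
  }
  return pattern === mime;
}

// True when any pattern in the merged list accepts the given file type.
function isUploadAllowed(patterns: string[], mime: string): boolean {
  return patterns.some((pattern) => mimeMatches(pattern, mime));
}
```

For example, merging `["image/*"]` with `["application/pdf"]` yields a list that allows both `image/png` and `application/pdf` uploads, and nothing else.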
When a model declares `acceptedFileMimetypes` including
"application/pdf", PDFs are now sent as OpenAI `file` content parts
with base64 data instead of being wrapped in XML tags. This lets
OpenAI (and compatible providers) process PDFs natively.

Non-PDF files and models without `acceptedFileMimetypes` are
unaffected — existing text extraction behavior is preserved.

Depends on: huggingface#2189

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee696c19ce

  imageProcessor,
- mmEnabled
+ mmEnabled,
+ model.acceptedFileMimetypes

P1: Use target model file allowlist in MCP message prep

After resolveRouterTarget picks targetModel, the MCP request is built against that routed model, but prepareMessagesWithFiles still receives model.acceptedFileMimetypes from the original model. In routed conversations where these differ, file parts are prepared with the wrong MIME policy: PDFs can be omitted when the routed model supports them, or sent when the routed model does not, which can produce incorrect behavior or upstream 4xx errors during tool-calling flows.

After router resolution picks a target model, use its file MIME policy
instead of the original model's when preparing file content parts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
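
The fix in the commit above can be sketched roughly as follows. This is an illustration, not the actual change: `ModelConfig` and `effectiveFileMimetypes` are hypothetical names, and only the `acceptedFileMimetypes` field and the idea of a routed target model come from this pull request.

```typescript
// Hypothetical minimal model shape; only acceptedFileMimetypes is from the PR.
interface ModelConfig {
  name: string;
  acceptedFileMimetypes?: string[];
}

// Pick the MIME allowlist of the model that will actually receive the request.
// When the router selected a target, its policy wins, even if it is empty;
// the original model's policy is used only when no routing happened.
function effectiveFileMimetypes(
  model: ModelConfig,
  targetModel?: ModelConfig
): string[] {
  return (targetModel ?? model).acceptedFileMimetypes ?? [];
}
```

The fallback is deliberately on the model object rather than on the field: a routed model that declares no `acceptedFileMimetypes` should not inherit PDF support from the original model.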