
Improve attachment file type recognition #7397

Open
CrafterFB wants to merge 8 commits into oobabooga:dev from CrafterFB:improve-attachment-file-type-recognision


@CrafterFB

Checklist:

Background

While testing vision on my install, I attached a JPEG without a file extension (so image instead of image.jpg).
The AI stated that it didn't see an image.

New Behavior

File type recognition

The handling of attachments is now based on mime types.
It first tries to infer the type from the file path (file name) using the Python standard library module mimetypes.
If that returns None, filetype is used to infer the type from the magic number (file content).
The branching in chat.py now is based on the found mime type.
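
The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `detect_mime_type` and its `sniff` parameter are hypothetical names, and the sketch assumes that `filetype.guess_mime` is the call used for the content-based fallback.

```python
import mimetypes
from typing import Callable, Optional


def detect_mime_type(
    path: str,
    sniff: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Guess the mime type from the file name, then from the file content.

    `sniff` is a hypothetical hook for the content-based fallback; when it
    is not given, the third-party `filetype` package is used.
    """
    # Step 1: name-based guess from the standard library.
    mime, _encoding = mimetypes.guess_type(path)
    if mime is not None:
        return mime

    # Step 2: content-based guess (magic number).
    if sniff is None:
        # Imported lazily so the name-based path works without the dependency.
        import filetype

        sniff = filetype.guess_mime
    return sniff(path)
```

With this order, a file named `image.jpg` is resolved by name alone, while a file named `image` is only resolved if the content sniffer recognizes it.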

New supported file types

Since all image attachments are wrapped in an Image object using Pillow, the limiting factors for image type support are Pillow and the detection system.

The newly supported types are whitelisted in the function is_mime_type_vision_supported in image_utils.py.

The supported types now include all previously supported types, tiff and avif, as well as other lesser-known types (see is_mime_type_vision_supported for the full list).

All new mime types belong to file types supported by Pillow and are (semi-)official, i.e. they are listed in /usr/share/mime or in the mime type registry.
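
A whitelist check of this kind could look roughly like the sketch below. The set is an illustrative subset taken from the types tested in this PR, not the actual list in image_utils.py, and `is_vision_mime` is a hypothetical stand-in for is_mime_type_vision_supported.

```python
from typing import Optional

# Illustrative subset only; the real whitelist lives in image_utils.py.
VISION_MIME_TYPES = frozenset(
    {
        "image/avif",
        "image/bmp",
        "image/gif",
        "image/jpeg",
        "image/png",
        "image/tiff",
        "image/webp",
    }
)


def is_vision_mime(mime: Optional[str]) -> bool:
    """Return True if the detected mime type is on the image whitelist."""
    # Detection may return None, so guard before normalizing the case.
    return mime is not None and mime.lower() in VISION_MIME_TYPES
```

An explicit whitelist keeps non-image types such as application/pdf away from the image path even when detection succeeds.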

Other changes

Any place that the type of attachments was checked now uses is_mime_type_vision_supported to test for image attachments.

The <img /> element of each attachment in the UI now includes the detected mime type.

New Dependency: filetype

The library filetype (PyPI) was chosen because it is portable (pure Python implementation).

All requirements.txt files have been updated with the new dependency.

python-magic was also considered, but since Windows doesn't ship with libmagic and the version with bundled binaries isn't available for modern Python versions, filetype was used instead.

Version without filetype

Commit 4ceb426 contains most of the new changes, except the content-based fallback used when the mime type cannot be determined from the path.

Testing

Testing was done on Fedora KDE 43 with the cpu-only build and Qwen3VL-2B-Instruct-Q8_0 with mmproj-Qwen3VL-2B-Instruct-Q8_0.
The AI was asked "what is in this file" with the file attached.

I used the same easy-to-recognize base png image, converted to various file types.
A test counted as a success if the AI correctly identified what was in the image.
The following file types were tested:

  • avif
  • bmp
  • fits
  • gif
  • jp2 (JPEG 2000)
  • jpg
  • pcd
  • png
  • qoi
  • tiff
  • webp
  • wmf

These include all previously supported types, the two I considered reasonably important, and a random selection of the unusual types.

It worked correctly with most types on the first try, with these exceptions:

  • fits had only one channel of rgb, so the AI saw it as black and white (either a wrong conversion or a problem with Pillow)
  • qoi and pcd were identified as application/octet-stream (raw binary) and thus didn't work. See the next section on why they are still included in the whitelist.
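
For context, magic-number detection compares the first bytes of a file against known signatures, so a format without a matching signature falls through as unknown. The sketch below is a hand-written illustration of that idea, not the implementation of filetype. `sniff_mime` is a hypothetical name, and `image/qoi` is an assumed label for files starting with the QOI magic bytes `qoif`.

```python
from typing import Optional

# (signature, mime type) pairs; the file must start with the signature.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"qoif", "image/qoi"),  # assumed mime label for QOI
)


def sniff_mime(header: bytes) -> Optional[str]:
    """Return the mime type whose signature matches the start of `header`."""
    for magic, mime in SIGNATURES:
        if header.startswith(magic):
            return mime
    # No signature matched: the caller has to treat the type as unknown.
    return None
```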

Other Tests (all successful)

  • pdf document
  • docx document (mime type: application/vnd.openxmlformats-officedocument.wordprocessingml.document)
  • png without file extension
  • utf-8 text (a json file)
  • multiple attachments at once
  • attachment history (attached a file to a message in which I told the AI to ignore the file, rebooted the server, then asked the AI about the content in a follow-up message)
  • png without file extension on Windows

Tested Failure cases

  • a supported mime type is not listed
    Falls back to the previous behavior: the server tries to open the file as utf-8 text, which often fails; an error is printed in the server console and the file is not attached.
    Fix: add the type to the whitelist.
  • the mime type is supported, but not detected
    Falls back to the previous behavior: the server tries to open the file as utf-8 text, which often fails; an error is printed in the server console and the file is not attached.
  • an unsupported mime type is listed / a file is wrongly identified / the file has an invalid format
    The server tries to open the file with Pillow and fails; an error is printed in the server console and the file is not attached. The error differs from the previous one, but the result is the same.
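
The utf-8 fallback described in the first two cases behaves roughly like this sketch. It is an assumption-based illustration rather than the code in chat.py: `read_text_attachment` is a hypothetical name, and the logging call stands in for whatever the server prints to its console.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def read_text_attachment(path: str) -> Optional[str]:
    """Try to read a file as utf-8 text; return None if it cannot be attached."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except (UnicodeDecodeError, OSError) as exc:
        # Binary files (e.g. an undetected image) usually end up here.
        logger.error("Could not read attachment %s: %s", path, exc)
        return None
```

A caller would skip the attachment whenever the function returns None, which matches the "file is not attached" outcome listed above.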

In terms of functionality, the changes are a strict improvement: all failure cases fall back to behavior similar to the previous behavior.
