feat: multimodal image support with drag-and-drop file handling #1025

Open
0xsline wants to merge 4 commits into ItzCrazyKns:master from 0xsline:feat/multimodal-image-support

0xsline commented Mar 5, 2026

Summary

  • Multimodal AI conversations: Upload images to send to AI models (OpenAI, Anthropic, Gemini, Ollama) for visual understanding
  • Paste & drag-and-drop: Support Ctrl+V/Cmd+V paste and drag-and-drop for both images and documents (PDF/DOCX/TXT) in chat inputs
  • Chat history images: Display uploaded images as thumbnails in chat history with click-to-preview lightbox
  • File management: Per-file delete buttons for both images and documents in the attachment panel, with upload progress indicator

Changes

Backend (LLM providers)

  • Add ContentPart types (text / image_url) to src/lib/types.ts
  • Handle multimodal ContentPart[] in OpenAI and Ollama message converters
  • Pass images through search agent pipeline to final LLM call
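As a rough illustration of the backend changes listed above, the sketch below shows one way `ContentPart` could be shaped and converted for a provider. The field names are assumptions modeled on the OpenAI chat format; the actual definitions in src/lib/types.ts are not shown in this thread. `buildUserMessage` and `toOllamaMessage` are hypothetical helpers, and the Ollama target shape (plain `content` plus a base64 `images` array) is an assumption about its chat API, not something this PR confirms.

```typescript
// Assumed shapes; the PR's real types in src/lib/types.ts may differ.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image_url'; image_url: { url: string } };
type ContentPart = TextPart | ImagePart;

type ChatMessage = {
  role: 'user' | 'assistant' | 'system';
  content: string | ContentPart[];
};

// Assumed Ollama-style message: text in `content`, raw base64 in `images`.
type OllamaMessage = { role: string; content: string; images?: string[] };

// Hypothetical helper: plain string when there are no images,
// otherwise one text part followed by one image_url part per image.
function buildUserMessage(text: string, imageDataUrls: string[] = []): ChatMessage {
  if (imageDataUrls.length === 0) {
    return { role: 'user', content: text };
  }
  const imageParts = imageDataUrls.map(
    (url): ContentPart => ({ type: 'image_url', image_url: { url } }),
  );
  return { role: 'user', content: [{ type: 'text', text }, ...imageParts] };
}

// Hypothetical converter: flatten ContentPart[] into text plus base64 images.
function toOllamaMessage(message: ChatMessage): OllamaMessage {
  if (typeof message.content === 'string') {
    return { role: message.role, content: message.content };
  }
  const texts: string[] = [];
  const images: string[] = [];
  for (const part of message.content) {
    if (part.type === 'text') {
      texts.push(part.text);
    } else {
      // Strip a "data:<mime>;base64," prefix if present.
      images.push(part.image_url.url.replace(/^data:[^;,]+;base64,/, ''));
    }
  }
  const converted: OllamaMessage = { role: message.role, content: texts.join('\n') };
  if (images.length > 0) {
    converted.images = images;
  }
  return converted;
}
```

Keeping `content` as `string | ContentPart[]` means text-only messages stay unchanged, so providers without multimodal handling are unaffected unless an image is actually attached.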

Frontend

  • useFileHandler hook: shared paste + drag-and-drop logic with ref-based state to avoid stale closures
  • ImagePreview component: click-to-enlarge lightbox overlay
  • MessageInput / EmptyChatMessageInput: paste, drag-and-drop, upload progress toast
  • Attach / AttachSmall: per-file X delete buttons for documents
  • MessageBox: render image thumbnails in chat history
  • useChat: track images state, include in message payload
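Paste and drag-and-drop both hand the UI a list of files, so a shared hook needs some way to route each file to the image path or the document upload path. The following is a minimal sketch of that routing step only; `classifyFiles`, `extensionOf`, and `FileLike` are hypothetical names, and the accepted document extensions are assumed from the PR summary (PDF/DOCX/TXT) rather than taken from the actual `useFileHandler` code.

```typescript
// Minimal structural type so the sketch also runs outside a browser;
// a DOM File has both of these fields.
type FileLike = { name: string; type: string };

// Assumption: accepted document types follow the PR summary.
const DOCUMENT_EXTENSIONS = ['pdf', 'docx', 'txt'];

// Lower-cased extension without the dot, or '' when the name has no dot.
function extensionOf(name: string): string {
  const dot = name.lastIndexOf('.');
  return dot >= 0 ? name.slice(dot + 1).toLowerCase() : '';
}

// Split a pasted or dropped batch into images, documents, and everything else.
function classifyFiles<T extends FileLike>(
  files: T[],
): { images: T[]; documents: T[]; rejected: T[] } {
  const result = { images: [] as T[], documents: [] as T[], rejected: [] as T[] };
  for (const file of files) {
    if (file.type.startsWith('image/')) {
      result.images.push(file);
    } else if (DOCUMENT_EXTENSIONS.includes(extensionOf(file.name))) {
      result.documents.push(file);
    } else {
      result.rejected.push(file);
    }
  }
  return result;
}
```

Images are matched by MIME type because pasted screenshots may carry a generic file name, while documents are matched by extension because browsers do not always report a MIME type for them.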

Test plan

  • Upload image via attach button → sends to AI, AI responds about image content
  • Paste screenshot (Cmd+V) into chat input → image appears in attachments
  • Drag image file into chat input → border highlights, image added
  • Drag PDF into chat input → upload progress shown, file appears in attachments
  • Click X on individual image/file → removed from attachments
  • Sent images visible in chat history, click to open fullscreen preview
  • Works with OpenAI, Ollama (multimodal models)

Summary by cubic

Add image attachments to chat with paste and drag-and-drop, thumbnails in history, and click-to-preview. Improves file handling reliability, forwards images through search, and fixes PDF export (CJK) and container startup.

  • New Features

    • Send multimodal messages (text + images) end-to-end via ContentPart; images flow through search to the final model.
    • Paste and drag-and-drop for images and documents, with drag overlay and upload progress.
    • Updated Attach UI and shared useFileHandler; thumbnails in history with click-to-preview; per-file delete for images and documents.
  • Bug Fixes

    • Search route now forwards images instead of dropping them.
    • Hardened file handling: try/catch/finally, FileReader error/abort, functional state updates to safely merge async results; clear images on chat switch.
    • PDF export: add NotoSansSC lazy loader for CJK; Docker: add typing_extensions and set HOSTNAME=0.0.0.0.
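The lazy font loader mentioned for PDF export can be thought of as a memoized async loader: nothing is fetched until the first export that needs CJK glyphs, and the in-flight promise is cached so concurrent exports share one download. This is a generic sketch of that pattern, not the PR's loader; `createLazyLoader` is a hypothetical name and the font URL in the trailing comment is an assumption.

```typescript
// Wrap an async load function so it runs at most once per successful load.
function createLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached !== null) {
      return cached; // reuse the in-flight or completed load
    }
    const pending = load().catch((err) => {
      cached = null; // drop the failed attempt so a later call can retry
      throw err;
    });
    cached = pending;
    return pending;
  };
}

// Example wiring in a browser (the URL is an assumption, not from the PR):
// const loadCjkFont = createLazyLoader(() =>
//   fetch('/fonts/NotoSansSC-Regular.ttf').then((res) => res.arrayBuffer()),
// );
```

Caching the promise rather than the resolved value is what prevents a second download when two exports start before the first fetch completes.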

Written for commit dd386df. Summary will update on new commits.

0xsline and others added 2 commits March 5, 2026 05:42
- Add NotoSansSC font for Chinese/Japanese/Korean PDF export support
- Create lazy-loading font loader with caching for optimal performance
- Fix SearXNG build failure by adding typing_extensions dependency
- Fix container startup by setting HOSTNAME=0.0.0.0 in docker-compose

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add multimodal image upload for AI conversations (OpenAI, Ollama)
- Support image paste (Ctrl+V/Cmd+V) and drag-and-drop in chat inputs
- Support document file (PDF/DOCX/TXT) paste and drag-and-drop upload
- Display uploaded images in chat history with click-to-preview lightbox
- Add per-file delete buttons for both images and documents in attachments
- Show upload progress indicator for document file uploads
- Extract shared file handling logic into useFileHandler hook

cubic-dev-ai bot left a comment

7 issues found across 23 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/components/MessageInputActions/Attach.tsx">

<violation number="1" location="src/components/MessageInputActions/Attach.tsx:32">
P2: handleFileChange sets loading to true but lacks error handling; if fetch/res.json throws, setLoading(false) never runs and the attach UI can get stuck on the spinner.</violation>

<violation number="2" location="src/components/MessageInputActions/Attach.tsx:84">
P2: Async image selection appends with captured `images`, which can drop previous selections. Use a functional state update.</violation>
</file>

<file name="src/lib/hooks/useChat.tsx">

<violation number="1" location="src/lib/hooks/useChat.tsx:750">
P2: Images state isn’t cleared when switching chats, so unsent images from a previous chat can be attached to a different chat’s next message.</violation>
</file>

<file name="src/lib/hooks/useFileHandler.ts">

<violation number="1" location="src/lib/hooks/useFileHandler.ts:47">
P2: FileReader errors/aborts are not handled; a failed read leaves the per-file promise pending so Promise.all never resolves, stalling the entire image batch.</violation>

<violation number="2" location="src/lib/hooks/useFileHandler.ts:58">
P2: Async callbacks append to state using ref snapshots rather than functional updates, so multiple completions before a render can overwrite each other and drop images/files. Use functional `set*` updates to make merges atomic.</violation>
</file>

<file name="src/app/api/search/route.ts">

<violation number="1" location="src/app/api/search/route.ts:63">
P1: Search route drops uploaded image context by forcing `images` to an empty array instead of forwarding request images.</violation>
</file>

<file name="src/components/MessageInputActions/AttachSmall.tsx">

<violation number="1" location="src/components/MessageInputActions/AttachSmall.tsx:81">
P2: Async image uploads append using a stale `images` array, so overlapping selections can overwrite earlier results. Use a functional state update to preserve concurrent additions.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
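For the stuck-spinner issue flagged in Attach.tsx, the usual remedy is to reset the loading flag in a `finally` block so it runs on success and on failure alike. The helper below is a minimal sketch of that shape; `runWithLoading` and its parameters are hypothetical names, not code from this PR.

```typescript
// Run an async task while a loading flag is set, and always clear the flag.
async function runWithLoading<T>(
  setLoading: (loading: boolean) => void,
  task: () => Promise<T>,
  onError: (err: unknown) => void,
): Promise<T | undefined> {
  setLoading(true);
  try {
    return await task();
  } catch (err) {
    onError(err); // surface the failure (for example as a toast) instead of throwing
    return undefined;
  } finally {
    setLoading(false); // runs whether the task resolved or rejected
  }
}
```

A handler such as `handleFileChange` could wrap its fetch and JSON parsing in the `task` callback, so a thrown network or parse error can no longer leave the spinner on.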

- P1: forward images in search route instead of dropping them
- P2: add try/catch/finally to file upload handlers (Attach, AttachSmall)
- P2: use refs for latest state in async callbacks to prevent stale closures
- P2: add FileReader onerror/onabort handlers to prevent pending promises
- P2: clear images state when switching chats to prevent cross-chat leaks
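To illustrate the FileReader fix in this commit: wrapping a read in a promise only settles reliably if the error and abort callbacks reject it. The sketch below assumes a small `ReaderLike` interface that mirrors the parts of the FileReader API it uses, so it can run outside a browser; `ReaderLike`, `readAsDataUrl`, and `readAllSuccessful` are hypothetical names, and a real `FileReader` may need a cast or thin adapter to fit this interface depending on compiler settings.

```typescript
// Hypothetical stand-in for the subset of the FileReader API used below.
interface ReaderLike {
  result: string | ArrayBuffer | null;
  error: unknown;
  onload: (() => void) | null;
  onerror: (() => void) | null;
  onabort: (() => void) | null;
  readAsDataURL(file: unknown): void;
}

// Read one file as a data URL, settling the promise on every outcome.
function readAsDataUrl(reader: ReaderLike, file: unknown): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    reader.onload = () => {
      if (typeof reader.result === 'string') {
        resolve(reader.result);
      } else {
        reject(new Error('Expected a data URL string'));
      }
    };
    // Without these two handlers a failed read leaves the promise pending forever.
    reader.onerror = () => reject(reader.error ?? new Error('File read failed'));
    reader.onabort = () => reject(new Error('File read aborted'));
    reader.readAsDataURL(file);
  });
}

// Keep the successful reads; a failed file is skipped instead of failing the batch.
async function readAllSuccessful(reads: Promise<string>[]): Promise<string[]> {
  const results = await Promise.all(reads.map((read) => read.catch(() => null)));
  return results.filter((r): r is string => r !== null);
}
```

Whether a failed file should be skipped silently or reported to the user is a design choice; the point of the sketch is only that every per-file promise settles, so the batch cannot stall.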

cubic-dev-ai bot left a comment

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/components/MessageInputActions/Attach.tsx">

<violation number="1" location="src/components/MessageInputActions/Attach.tsx:64">
P2: Async uploads/readers use ref snapshots when updating files/fileIds/images, so near-simultaneous completions can overwrite each other and drop attachments. Use functional state updates to merge against the latest value.</violation>
</file>

- Change context types for setFiles/setFileIds/setImages to
  Dispatch<SetStateAction<T>> to support functional updates
- Replace all ref-based state reads with prev => [...prev, ...new]
  pattern in async callbacks (useFileHandler, Attach, AttachSmall)
- Eliminates risk of concurrent async completions overwriting each other
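The difference between snapshot-based and functional updates can be shown without React. The sketch below uses a tiny stand-in for a state setter (`createState` is hypothetical, and it applies updates immediately rather than batching them as React does); the local `SetStateAction` and `Dispatch` types are simplified copies of the React types named in this commit, written out so the example is self-contained.

```typescript
// Simplified local copies of the React types named in the commit message.
type SetStateAction<T> = T | ((prev: T) => T);
type Dispatch<A> = (action: A) => void;

// Hypothetical stand-in for a useState pair: accepts a value or an updater.
function createState<T>(initial: T): { get: () => T; set: Dispatch<SetStateAction<T>> } {
  let value = initial;
  return {
    get: () => value,
    set: (action) => {
      value = typeof action === 'function' ? (action as (prev: T) => T)(value) : action;
    },
  };
}

// Two async completions that both captured the same snapshot.
function appendWithSnapshot(): string[] {
  const images = createState<string[]>([]);
  const snapshot = images.get(); // captured before either read finishes
  images.set([...snapshot, 'a.png']);
  images.set([...snapshot, 'b.png']); // overwrites the first append
  return images.get(); // ['b.png']
}

// The same two completions, each merging against the latest value.
function appendWithUpdater(): string[] {
  const images = createState<string[]>([]);
  images.set((prev) => [...prev, 'a.png']);
  images.set((prev) => [...prev, 'b.png']);
  return images.get(); // ['a.png', 'b.png']
}
```

Typing the context setters as `Dispatch<SetStateAction<T>>` is what lets callers pass the `prev => [...prev, ...new]` form through context instead of a precomputed array.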

xkonjin left a comment

Quick review pass:

  • Main risk area here is input validation, path handling, and malformed payload behavior.
  • I didn’t see targeted regression coverage in the diff; please add or point CI at a focused test for the changed path in Dockerfile, docker-compose.yaml, NotoSansSC-Regular.ttf (+20 more).
  • Before merge, I’d smoke-test the behavior touched by Dockerfile, docker-compose.yaml, NotoSansSC-Regular.ttf (+20 more) with malformed input / retry / rollback cases, since that’s where this class of change usually breaks.
