feat: add image upload support with compression and increase body limit by Hieuslecong · Pull Request #39 · ntthanh2603/gemini-web-to-api

Hieuslecong · 2026-03-12T17:23:36Z

Description

This PR introduces full support for Vision (Image Upload) capabilities. Users can now send image data along with text prompts to the Gemini model through the API.

Key Changes

Core Vision Logic: Implemented the file upload handshake with Google's content-push service, enabling the use of image identifiers in generation requests.
Improved Data Handling: Updated the service layer to decode and process base64 image data from API requests.
Server Optimization: Raised the maximum request body limit to 20 MB to support larger media payloads.
Enhanced Client Tools:
- Added new demonstration scripts for image-based chat.
- Integrated automatic image compression and resizing in client scripts to optimize bandwidth and prevent payload size errors.

Testing

Verified successful image recognition and description by the AI model.
Confirmed that large, high-resolution images are correctly handled via client-side processing before transmission.

Summary: Vision support, payload optimization, and client-side image processing.

gemini-code-assist · 2026-03-12T17:29:53Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This update introduces full Vision (image upload) capability, allowing users to send image data together with text prompts to the Gemini model through the API. It includes server-side changes to process images and upload them to Google's service, as well as client-side improvements that compress and optimize images before sending.

Highlights

Image Upload (Vision) Support: Implemented the core logic for image uploads, including the handshake with Google's content-push service, and enabled the use of image identifiers in content generation requests.
Image Data Handling: Updated the service layer to decode and process base64 image data from API requests.
Server Optimization: Raised the server's maximum request body size limit to 20 MB to support larger media payloads.
Enhanced Client Tools: Added new demo scripts for image-based chat, with automatic client-side image compression and resizing to optimize bandwidth and avoid payload size errors.

Changelog

demo_ask_image.py
- Added a new Python script demo_ask_image.py to demonstrate image-based chat with the Gemini API.
- Included encode_image function for image compression and Base64 encoding using Pillow.
- Implemented logic to construct and send API requests with both text prompts and inline image data.
demo_upload.py
- Added a new Python script demo_upload.py for demonstrating image upload and description using the Gemini API.
- Incorporated the encode_image function for client-side image processing.
- Structured the payload to send an image with a default descriptive prompt.
internal/modules/gemini/gemini_service.go
- Imported encoding/base64 package.
- Modified GenerateContent to parse InlineData from request parts, decode base64 image data, and collect FileData.
- Updated the content generation logic to pass collected FileData to the provider client.
- Adjusted the empty content check to also consider the presence of files.
internal/modules/providers/gemini_service.go
- Imported bytes package.
- Added EndpointUpload constant for Google's content-push service.
- Implemented UploadFile function to handle uploading FileData to the content-push endpoint and return the file identifier.
- Modified GenerateContent to iterate through config.Files, upload each file using UploadFile, and construct the request payload with the returned file URLs.
- Adjusted the messageContent structure to accommodate image data in the request.
internal/modules/providers/provider_interface.go
- Defined a new struct FileData with fields MimeType, Data (byte slice), and FileName.
- Changed the Files field in GenerateConfig from []string to []FileData.
- Updated the WithFiles option function to accept []FileData.
internal/server/server.go
- Increased the BodyLimit for the Fiber application to 20 * 1024 * 1024 bytes (20 MB) to support larger request payloads, specifically for image uploads.
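The interface changes in provider_interface.go can be pictured with the sketch below. The `FileData` fields, the `Files []FileData` field, and the `WithFiles` name come from the changelog above. Everything else is assumed for illustration: the `GenerateOption` type name, the `applyOptions` and `hasContent` helpers, and the exact form of the empty-content check.

```go
package main

import (
	"fmt"
	"strings"
)

// FileData mirrors the struct described in the changelog.
type FileData struct {
	MimeType string
	Data     []byte
	FileName string
}

// GenerateConfig shows only the Files field named in the changelog;
// the real struct presumably has other fields.
type GenerateConfig struct {
	Files []FileData
}

// GenerateOption is an assumed name for the functional-option type.
type GenerateOption func(*GenerateConfig)

// WithFiles attaches files to the request configuration.
func WithFiles(files []FileData) GenerateOption {
	return func(c *GenerateConfig) { c.Files = files }
}

// applyOptions is a hypothetical helper that builds a config from options.
func applyOptions(opts ...GenerateOption) GenerateConfig {
	var cfg GenerateConfig
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

// hasContent sketches the adjusted empty-content check: a request with
// no text still counts as non-empty when files are attached.
func hasContent(prompt string, cfg GenerateConfig) bool {
	return strings.TrimSpace(prompt) != "" || len(cfg.Files) > 0
}

func main() {
	cfg := applyOptions(WithFiles([]FileData{
		{MimeType: "image/png", Data: []byte{0x89, 'P', 'N', 'G'}, FileName: "cat.png"},
	}))
	fmt.Println(len(cfg.Files), hasContent("", cfg))
}
```

Carrying the bytes and MIME type together in `FileData`, where the field previously held plain strings, lets the provider upload each file itself without re-reading it from disk or guessing its type.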

Activity

No human activity has occurred on this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

gemini-code-assist

Code Review

This pull request adds support for image uploads (Vision) to the Gemini service. The changes are comprehensive, including updates to the service layer to handle file data, a new provider-level file upload mechanism, and an increased server body limit. Two new Python demo scripts are also included to showcase the new functionality. My review has identified a few issues: a bug in demo_upload.py where an incorrect MIME type is used, a more significant bug in internal/modules/providers/gemini_service.go related to duplicated and inconsistent filename generation for uploads, and a minor maintainability issue with a magic string in the same file. Overall, the implementation is solid, but the identified issues, especially the filename generation bug, should be addressed to ensure robustness.
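The review's concerns about duplicated filename generation and a magic string are often addressed by putting the logic in one helper backed by named constants. The sketch below shows that pattern only. The reviewer's actual suggestions are not reproduced on this page, so `uploadFileName`, `defaultUploadBaseName`, `fallbackExt`, and the extension table are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Named constants replace string literals scattered through the code.
const (
	defaultUploadBaseName = "upload" // hypothetical default base name
	fallbackExt           = ".bin"   // used when the MIME type is unknown
)

// extByMime maps a few common image MIME types to file extensions.
var extByMime = map[string]string{
	"image/jpeg": ".jpg",
	"image/png":  ".png",
	"image/webp": ".webp",
	"image/gif":  ".gif",
}

// uploadFileName is the single place where an upload's filename is
// decided: a caller-supplied name wins, otherwise one is derived from
// the MIME type.
func uploadFileName(name, mimeType string) string {
	if strings.TrimSpace(name) != "" {
		return name
	}
	ext, ok := extByMime[strings.ToLower(mimeType)]
	if !ok {
		ext = fallbackExt
	}
	return defaultUploadBaseName + ext
}

func main() {
	fmt.Println(uploadFileName("", "image/png"))
	fmt.Println(uploadFileName("cat.jpg", "image/jpeg"))
}
```

Calling one helper from every step of the upload keeps the name consistent between steps, which is the kind of inconsistency the review describes.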


feat: add image upload support with compression and increase body limit

35802f4

gemini-code-assist bot reviewed Mar 12, 2026

Review comments:
- internal/modules/providers/gemini_service.go (outdated)
- demo_upload.py (outdated)
- internal/modules/providers/gemini_service.go

ntthanh2603 mentioned this pull request Mar 13, 2026

I don't see images and documents being attached (original title: "Tôi không thấy gửi kèm ảnh và tài liệu") #38 (Open)

Hieuslecong and others added 2 commits March 14, 2026 00:01

Update demo_upload.py

9e62089

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update internal/modules/providers/gemini_service.go

bed1695

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

ntthanh2603 self-requested a review March 14, 2026 07:21

Hieuslecong wants to merge 3 commits into ntthanh2603:main from Hieuslecong:feature/gemini-vision-support
