
Refactor multiple flows to enhance performance, boost scalability, and ensure stability.#131

Draft
luuquangvu wants to merge 275 commits into Nativu5:main from luuquangvu:main

Conversation

@luuquangvu
Collaborator

This PR is still a work in progress and uses features that aren't yet officially available in the Gemini-API library, so we'll need to wait for the library's official update before merging. Feel free to try it out and share any feedback or report any issues you encounter. Thanks!

Here are some highlights of the changes:

  • The entire logic for storing conversation history has been rewritten, aiming for compatibility with various endpoints and easy scalability in the future.
  • The logic of the endpoints has been rewritten, and now all endpoints work correctly with both streaming and non-streaming flows.
  • Compatible with the latest library updates, including the ability to download full-size images and enable video or music generation.
  • All cookie-related errors are fully resolved, and users get a clear notification when the server invalidates their cookies, making it obvious when a manual refresh is needed.

…nd update response handling for consistency
…andling, and improved extension determination
- Introduced `model_strategy` configuration for "append" (default + custom models) or "overwrite" (custom models only).
- Enhanced `/v1/models` endpoint to return models based on the configured strategy.
- Improved model loading with environment variable overrides and validation.
- Refactored model handling logic for improved modularity and error handling.
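For illustration, the append/overwrite behavior described above could look roughly like this. This is a hypothetical sketch, not the PR's actual code: the default model names and the `resolve_models` helper are assumptions.

```python
# Illustrative defaults; the real list lives in the project's config.
DEFAULT_MODELS = ["gemini-2.5-flash", "gemini-2.5-pro"]

def resolve_models(custom_models: list[str], strategy: str = "append") -> list[str]:
    """Return the model list served by /v1/models for a given model_strategy."""
    if strategy == "overwrite":
        # Custom models only; built-in defaults are ignored.
        return list(custom_models)
    if strategy == "append":
        # Defaults first, then custom models, de-duplicated preserving order.
        seen: set[str] = set()
        merged: list[str] = []
        for name in DEFAULT_MODELS + custom_models:
            if name not in seen:
                seen.add(name)
                merged.append(name)
        return merged
    raise ValueError(f"unknown model_strategy: {strategy!r}")
```

With `"append"`, a custom model is added after the defaults; with `"overwrite"`, only the custom list is exposed.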
…eld support

- Enhanced `extract_gemini_models_env` to handle nested fields within environment variables.
- Updated type hints for more flexibility in model overrides.
- Improved `_merge_models_with_env` to better support field-level updates and appending new models.
- Moved utility functions like `strip_code_fence`, `extract_tool_calls`, and `iter_stream_segments` to a centralized helper module.
- Removed unused and redundant private methods from `chat.py`, including `_strip_code_fence`, `_strip_tagged_blocks`, and `_strip_system_hints`.
- Updated imports and references across modules for consistency.
- Simplified tool call and streaming logic by replacing inline implementations with shared helper functions.
- Replaced unused model placeholder in `config.yaml` with an empty list.
- Added JSON parsing validators for `model_header` and `models` to enhance flexibility and error handling.
- Improved validation to filter out incomplete model configurations.
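As a rough idea of what a centralized helper like `strip_code_fence` might do (the regex and exact behavior here are assumptions, not the PR's implementation):

```python
import re

# Matches a string fully wrapped in a single Markdown code fence,
# with an optional language tag after the opening backticks.
_FENCE_RE = re.compile(r"^```[\w-]*\n(.*)\n```$", re.DOTALL)

def strip_code_fence(text: str) -> str:
    """Remove one surrounding Markdown code fence, if present; else return text unchanged."""
    match = _FENCE_RE.match(text.strip())
    return match.group(1) if match else text
```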
…N support

- Replaced prefix-based parsing with a root key approach.
- Added JSON parsing to handle list-based model configurations.
- Improved handling of errors and cleanup of environment variables.
…to Python literals

- Added `ast.literal_eval` as a fallback for parsing environment variables when JSON decoding fails.
- Improved error handling and logging for invalid configurations.
- Ensured proper cleanup of environment variables post-parsing.
- Adjusted `TOOL_CALL_RE` regex pattern for better accuracy.
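The JSON-first parsing with an `ast.literal_eval` fallback could be sketched as follows (function name and the final pass-through behavior are illustrative assumptions):

```python
import ast
import json

def parse_env_value(raw: str):
    """Parse an environment variable value as JSON first, then as a Python literal."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    try:
        # Handles Python-style literals, e.g. single-quoted dicts/lists.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Neither format applies; return the raw string unchanged.
        return raw
```

This lets users set values like `MODELS='["a", "b"]'` (JSON) or `MODELS="{'k': 1}"` (Python literal) interchangeably.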
…nvironment variables; enhance error logging in config validation
…tring or list structure for enhanced flexibility in automated environments
…s found in either the raw or cleaned history.
@luuquangvu
Collaborator Author

Using the latest image, it seems multi-turn conversation isn't supported, am I right? With every message I send, it forgets the previous ones.

I don't know how you tested it, but it works fine for me. Note that to keep conversations running continuously through restarts, you need to enable Gemini Activities.

@Vigno04
Contributor

Vigno04 commented Mar 26, 2026

Gemini Activities is enabled, but I think it is using temporary chats (the best option to avoid saturating my chat history with all the users' requests). I think it should work regardless; the original code worked.

@luuquangvu
Collaborator Author

The library's author is preparing to release a major 2.0 update. Therefore, this PR will also have to wait for it, as some features are being developed based on the latest code from the library.

@luuquangvu
Collaborator Author

@Nativu5 The library is now at version 2.0. There's currently PR #134 waiting for you to merge it. Would you like to merge it before this PR? This PR has undergone many changes, so I need a stable main branch to effectively manage and resolve all merge conflicts. Thank you!

@Vigno04
Contributor

Vigno04 commented Apr 14, 2026

I think one improvement you could make is stripping unnecessary tokens before sending messages to the chat.

For example, when using Open WebUI, I see messages formatted like this when viewing them in the Gemini web UI:

```
<|im_start|>user
help me evaluate ...
<|im_end|>
<|im_start|>assistant
```

As you can see, it includes all the special tags. However, these tags are already reintroduced by Gemini on the backend, since the message is sent as part of a web ui chat request.

Removing them would have a few benefits:

  • Based on tests with tiktoken (using Gemma), it reduces around ~30 tokens per request
  • It may improve model performance by avoiding duplicated start/end tokens
  • It could make the traffic less detectable by Google, as the message would resemble a more standard chat format

Overall, stripping these redundant tokens seems like a simple optimization with multiple advantages. I'm writing it here since opening a pull request for such a small feature seems a bit pointless, and I also wanted a second opinion on the matter.
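A minimal sketch of the stripping idea: remove ChatML role markers before forwarding the prompt. The pattern and function name here are illustrative assumptions, not code from this PR.

```python
import re

# Matches <|im_start|>/<|im_end|> markers, optionally followed by a role name
# and surrounding whitespace.
CHATML_RE = re.compile(r"<\|im_(?:start|end)\|>(?:\s*(?:user|assistant|system))?\s*")

def strip_chatml(text: str) -> str:
    """Remove ChatML role markers from a message before sending it upstream."""
    return CHATML_RE.sub("", text).strip()
```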

@luuquangvu
Collaborator Author

@Vigno04 Thank you for your feedback. For the reasons we need to add ChatML tags and those seemingly unnecessary system hints, see previous issues like #59. Stripping them might save a few tokens, but it would break some clients that require tool calls.
