Conversation

@xieus (Collaborator) commented Jan 17, 2026

Pull Request Description

This PR refactors the downloader to be non-blocking and refactors artifact_service to use non-blocking cleanup. It also improves type safety in status.py and fixes minor issues in the downloader and gcp_test.

Context
The AIBrix runtime is designed as an async service. However, a few core components (specifically the artifact downloaders and cleanup services) use synchronous, blocking libraries such as boto3, google-cloud-storage, and shutil.

In an asynchronous event loop (such as the one used by uvicorn or asyncio), calling a blocking function freezes the entire loop. This means that while a 5 GB model is downloading from S3, the service cannot respond to heartbeats, health checks, or other API requests.
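For illustration, here is a minimal sketch of the non-blocking pattern this PR applies (the function names are hypothetical stand-ins, not the actual AIBrix code): the blocking call is offloaded to the default thread pool executor so the event loop stays responsive.

```python
import asyncio
import functools


def _download_sync(source_url: str, local_path: str) -> str:
    """Stand-in for a blocking SDK call (e.g. boto3's download_file)."""
    # Blocking network I/O would happen here.
    return local_path


async def download(source_url: str, local_path: str) -> str:
    """Offload the blocking download to the default thread pool so the
    event loop stays free to answer health checks in the meantime."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None, functools.partial(_download_sync, source_url, local_path)
    )


print(asyncio.run(download("s3://bucket/model", "/tmp/model")))  # /tmp/model
```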

We therefore propose this minor refactor to keep the service responsive during long-running I/O.

Related Issues

Resolves: #[N/A]


@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @xieus, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the responsiveness and stability of the AIBrix runtime by converting blocking I/O operations in critical components, such as artifact downloading and cleanup, into non-blocking asynchronous calls. This change prevents the service's event loop from stalling, especially during large data transfers, thereby improving overall service availability and performance. The refactoring is complemented by the addition of robust unit tests for the downloader services and minor fixes to existing tests and type definitions.

Highlights

  • Asynchronous I/O Refactoring: The core artifact downloaders (S3, GCS, HuggingFace) and the artifact cleanup service have been refactored to use non-blocking I/O operations, leveraging asyncio.run_in_executor to prevent the main event loop from freezing during long-running tasks like large file downloads or directory deletions.
  • New Unit Tests for Downloaders: Comprehensive unit tests have been added for all artifact downloader implementations (S3, GCS, HuggingFace, HTTP) to ensure their correct asynchronous behavior and error handling.
  • Improved Type Safety in Status Module: The Status class in status.py has been updated with explicit is_ methods for various error codes, replacing a dynamic __getattr__ implementation for better type safety and code clarity.
  • GCP Test Enhancement: The gcp_test.go file now includes a 30-second timeout for OpenAI API chat completion calls, improving test reliability and preventing indefinite waits.
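As a rough illustration of the status.py change described in the third highlight (the class shape and error codes here are hypothetical, not the actual AIBrix code), explicit predicates are statically checkable where a dynamic __getattr__ lookup is not:

```python
from enum import Enum


class ErrorCode(str, Enum):
    NOT_FOUND = "NOT_FOUND"
    TIMEOUT = "TIMEOUT"


class Status:
    """Toy status wrapper illustrating explicit is_* predicates."""

    def __init__(self, code: ErrorCode) -> None:
        self.code = code

    # Explicit methods replace a dynamic __getattr__("is_...") lookup:
    # type checkers can now verify every call site and catch a typo like
    # status.is_timout() at type-check time instead of at runtime.
    def is_not_found(self) -> bool:
        return self.code is ErrorCode.NOT_FOUND

    def is_timeout(self) -> bool:
        return self.code is ErrorCode.TIMEOUT


print(Status(ErrorCode.TIMEOUT).is_timeout())  # True
```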



@gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses a key issue by refactoring blocking I/O operations in artifact_service and downloaders to be non-blocking, which is crucial for the performance of an async service. The introduction of loop.run_in_executor is the correct approach for this. The addition of a comprehensive unit test suite for the downloaders in test_downloaders.py is a significant improvement and greatly increases confidence in the changes. Furthermore, the refactoring in status.py to replace dynamic __getattr__ with explicit methods is a great move for improving type safety and code clarity.

My review includes a couple of suggestions on downloaders.py to further improve performance by parallelizing directory downloads, which currently happen sequentially. Overall, this is a high-quality pull request that significantly improves the robustness and performance of the system.

Comment on lines 61 to +71

    async def download(
        self, source_url: str, local_path: str, credentials: Optional[Dict] = None
    ) -> str:
        """
        Download from S3 (Async wrapper).
        """
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None,
            functools.partial(self._download_sync, source_url, local_path, credentials),
        )
@gemini-code-assist (Contributor) commented (severity: medium):

This change correctly makes the download operation non-blocking by running the synchronous logic in a thread pool executor. This prevents blocking the asyncio event loop.

However, for directory downloads, the files are still processed sequentially within the _download_sync method. To further enhance performance, especially for directories with many small files, you could consider parallelizing the individual file downloads.

A possible approach would be to refactor the logic to:

  1. List all files in the directory within run_in_executor.
  2. In the async download method, create a list of tasks, one for each file, using loop.run_in_executor to call s3_client.download_file.
  3. Use asyncio.gather to run these download tasks concurrently.

This would make directory downloads significantly faster.
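The three steps above could be sketched like this (the listing and per-file helpers are hypothetical stand-ins for the corresponding boto3 calls, which take different arguments):

```python
import asyncio
import functools


def list_keys(prefix: str) -> list[str]:
    """Blocking listing call (stand-in for s3_client.list_objects_v2)."""
    return [f"{prefix}/part-{i}" for i in range(4)]


def download_file(key: str, dest: str) -> str:
    """Blocking per-file download (stand-in for s3_client.download_file)."""
    return f"{dest}/{key.rsplit('/', 1)[-1]}"


async def download_dir(prefix: str, dest: str) -> list[str]:
    loop = asyncio.get_running_loop()
    # Step 1: list the files inside the executor so listing can't block.
    keys = await loop.run_in_executor(None, list_keys, prefix)
    # Step 2: one executor task per file...
    tasks = [
        loop.run_in_executor(None, functools.partial(download_file, key, dest))
        for key in keys
    ]
    # Step 3: ...run them concurrently and collect the results.
    return await asyncio.gather(*tasks)


print(asyncio.run(download_dir("s3://bucket/model", "/tmp/model")))
```

Note that the default thread pool bounds the concurrency, which also acts as a natural rate limit on parallel downloads.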

@xieus (Collaborator, Author) replied:

Let us make it non-blocking and test the performance before any further optimization.

Comment on lines 183 to +193

    async def download(
        self, source_url: str, local_path: str, credentials: Optional[Dict] = None
    ) -> str:
        """
        Download from GCS (Async wrapper).
        """
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None,
            functools.partial(self._download_sync, source_url, local_path, credentials),
        )
@gemini-code-assist (Contributor) commented (severity: medium):

Similar to the S3 downloader, this change correctly makes the GCS download operation non-blocking.

For directory downloads, the implementation currently downloads files sequentially. For a performance boost, especially with directories containing numerous files, consider parallelizing these downloads. You could adapt the logic to list all blobs first, and then use asyncio.gather with loop.run_in_executor to download the files concurrently.

@xieus xieus changed the title [Bug] Refactor downloader and artifact_service to be non-blocking, add UTs [Bug] Refactor downloader and artifact_service to be non-blocking Jan 19, 2026
    if os.path.exists(local_path):
        try:
    -       shutil.rmtree(local_path)
    +       loop = asyncio.get_running_loop()
A collaborator commented:

Since we use Python 3.10, you can use await asyncio.to_thread(shutil.rmtree, local_path) to replace lines 342 and 343.
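A minimal sketch of that suggestion (the cleanup function here is illustrative, not the actual artifact_service code): asyncio.to_thread wraps the get_running_loop()/run_in_executor() pair in a single call.

```python
import asyncio
import os
import shutil
import tempfile


async def cleanup(local_path: str) -> None:
    if os.path.exists(local_path):
        # asyncio.to_thread (Python 3.9+) collapses the two-line
        # get_running_loop()/run_in_executor() pattern into one call,
        # running shutil.rmtree in the default thread pool.
        await asyncio.to_thread(shutil.rmtree, local_path)


path = tempfile.mkdtemp()
asyncio.run(cleanup(path))
print(os.path.exists(path))  # False
```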

    def test_get_downloader_hf_not_exist():
        with pytest.raises(ModelNotFoundError) as exception:
    -       get_downloader("not_exsit_path/model")
    +       get_downloader("not_exist_path/model")
A collaborator commented:
nice catch

@Jeffwan merged commit 49216ae into main on Jan 31, 2026 (15 checks passed).
@Jeffwan deleted the xieus/dev branch on January 31, 2026 at 22:05.
