Conversation

@xieus (Collaborator) commented Jan 17, 2026

Pull Request Description

This PR refactors the downloader to be non-blocking and refactors artifact_service to use non-blocking cleanup. It also improves type safety in status.py and fixes minor issues in the downloader and gcp_test.

Context
The AIBrix runtime is designed as an async service. However, a few core components (specifically the artifact downloaders and cleanup services) use synchronous, blocking libraries such as boto3, google-cloud-storage, and shutil.

In an asynchronous event loop (such as the one used by uvicorn or asyncio), calling a blocking function freezes the entire loop. This means that while a 5 GB model is downloading from S3, the service cannot respond to heartbeats, health checks, or other API requests.
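For illustration, here is a minimal sketch of the non-blocking pattern this PR applies (the function names are hypothetical stand-ins, not the actual AIBrix code): the blocking call is offloaded to the default thread pool executor so the event loop stays responsive.

```python
import asyncio
import functools


def _download_sync(source_url: str, local_path: str) -> str:
    """Stand-in for a blocking SDK call (e.g. boto3's download_file)."""
    # Blocking network I/O would happen here.
    return local_path


async def download(source_url: str, local_path: str) -> str:
    """Offload the blocking download to the default thread pool so the
    event loop stays free to answer health checks in the meantime."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None, functools.partial(_download_sync, source_url, local_path)
    )


print(asyncio.run(download("s3://bucket/model", "/tmp/model")))  # /tmp/model
```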

We therefore propose this minor refactor to keep the service responsive during long-running I/O.

Related Issues

Resolves: #[N/A]


@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @xieus, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the responsiveness and stability of the AIBrix runtime by converting blocking I/O operations in critical components, such as artifact downloading and cleanup, into non-blocking asynchronous calls. This change prevents the service's event loop from stalling, especially during large data transfers, thereby improving overall service availability and performance. The refactoring is complemented by the addition of robust unit tests for the downloader services and minor fixes to existing tests and type definitions.

Highlights

  • Asynchronous I/O Refactoring: The core artifact downloaders (S3, GCS, HuggingFace) and the artifact cleanup service have been refactored to use non-blocking I/O operations, leveraging asyncio.run_in_executor to prevent the main event loop from freezing during long-running tasks like large file downloads or directory deletions.
  • New Unit Tests for Downloaders: Comprehensive unit tests have been added for all artifact downloader implementations (S3, GCS, HuggingFace, HTTP) to ensure their correct asynchronous behavior and error handling.
  • Improved Type Safety in Status Module: The Status class in status.py has been updated with explicit is_ methods for various error codes, replacing a dynamic __getattr__ implementation for better type safety and code clarity.
  • GCP Test Enhancement: The gcp_test.go file now includes a 30-second timeout for OpenAI API chat completion calls, improving test reliability and preventing indefinite waits.
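As a rough illustration of the status.py change described in the third highlight (the class shape and error codes here are hypothetical, not the actual AIBrix code), explicit predicates are statically checkable where a dynamic __getattr__ lookup is not:

```python
from enum import Enum


class ErrorCode(str, Enum):
    NOT_FOUND = "NOT_FOUND"
    TIMEOUT = "TIMEOUT"


class Status:
    """Toy status wrapper illustrating explicit is_* predicates."""

    def __init__(self, code: ErrorCode) -> None:
        self.code = code

    # Explicit methods replace a dynamic __getattr__("is_...") lookup:
    # type checkers can now verify every call site and catch a typo like
    # status.is_timout() at type-check time instead of at runtime.
    def is_not_found(self) -> bool:
        return self.code is ErrorCode.NOT_FOUND

    def is_timeout(self) -> bool:
        return self.code is ErrorCode.TIMEOUT


print(Status(ErrorCode.TIMEOUT).is_timeout())  # True
```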



@gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses a key issue by refactoring blocking I/O operations in artifact_service and downloaders to be non-blocking, which is crucial for the performance of an async service. The introduction of loop.run_in_executor is the correct approach for this. The addition of a comprehensive unit test suite for the downloaders in test_downloaders.py is a significant improvement and greatly increases confidence in the changes. Furthermore, the refactoring in status.py to replace dynamic __getattr__ with explicit methods is a great move for improving type safety and code clarity.

My review includes a couple of suggestions on downloaders.py to further improve performance by parallelizing directory downloads, which currently happen sequentially. Overall, this is a high-quality pull request that significantly improves the robustness and performance of the system.

Comment on lines 61 to +71

    async def download(
        self, source_url: str, local_path: str, credentials: Optional[Dict] = None
    ) -> str:
        """
        Download from S3 (Async wrapper).
        """
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None,
            functools.partial(self._download_sync, source_url, local_path, credentials),
        )
@gemini-code-assist (Contributor) commented (severity: medium):

This change correctly makes the download operation non-blocking by running the synchronous logic in a thread pool executor. This prevents blocking the asyncio event loop.

However, for directory downloads, the files are still processed sequentially within the _download_sync method. To further enhance performance, especially for directories with many small files, you could consider parallelizing the individual file downloads.

A possible approach would be to refactor the logic to:

  1. List all files in the directory within run_in_executor.
  2. In the async download method, create a list of tasks, one for each file, using loop.run_in_executor to call s3_client.download_file.
  3. Use asyncio.gather to run these download tasks concurrently.

This would make directory downloads significantly faster.
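The three steps above could be sketched like this (the listing and per-file helpers are hypothetical stand-ins for the corresponding boto3 calls, which take different arguments):

```python
import asyncio
import functools


def list_keys(prefix: str) -> list[str]:
    """Blocking listing call (stand-in for s3_client.list_objects_v2)."""
    return [f"{prefix}/part-{i}" for i in range(4)]


def download_file(key: str, dest: str) -> str:
    """Blocking per-file download (stand-in for s3_client.download_file)."""
    return f"{dest}/{key.rsplit('/', 1)[-1]}"


async def download_dir(prefix: str, dest: str) -> list[str]:
    loop = asyncio.get_running_loop()
    # Step 1: list the files inside the executor so listing can't block.
    keys = await loop.run_in_executor(None, list_keys, prefix)
    # Step 2: one executor task per file...
    tasks = [
        loop.run_in_executor(None, functools.partial(download_file, key, dest))
        for key in keys
    ]
    # Step 3: ...run them concurrently and collect the results.
    return await asyncio.gather(*tasks)


print(asyncio.run(download_dir("s3://bucket/model", "/tmp/model")))
```

Note that the default thread pool bounds the concurrency, which also acts as a natural rate limit on parallel downloads.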

@xieus (Collaborator, Author) replied:

Let us make it non-blocking and test the performance before any further optimization.

Comment on lines 183 to +193

    async def download(
        self, source_url: str, local_path: str, credentials: Optional[Dict] = None
    ) -> str:
        """
        Download from GCS (Async wrapper).
        """
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None,
            functools.partial(self._download_sync, source_url, local_path, credentials),
        )
@gemini-code-assist (Contributor) commented (severity: medium):

Similar to the S3 downloader, this change correctly makes the GCS download operation non-blocking.

For directory downloads, the implementation currently downloads files sequentially. For a performance boost, especially with directories containing numerous files, consider parallelizing these downloads. You could adapt the logic to list all blobs first, and then use asyncio.gather with loop.run_in_executor to download the files concurrently.

@xieus xieus changed the title [Bug] Refactor downloader and artifact_service to be non-blocking, add UTs [Bug] Refactor downloader and artifact_service to be non-blocking Jan 19, 2026
    if os.path.exists(local_path):
        try:
    -       shutil.rmtree(local_path)
    +       loop = asyncio.get_running_loop()
A collaborator commented:

Since we use Python 3.10, you can use await asyncio.to_thread(shutil.rmtree, local_path) to replace lines 342 and 343.
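A minimal sketch of that suggestion (the cleanup function here is illustrative, not the actual artifact_service code): asyncio.to_thread wraps the get_running_loop()/run_in_executor() pair in a single call.

```python
import asyncio
import os
import shutil
import tempfile


async def cleanup(local_path: str) -> None:
    if os.path.exists(local_path):
        # asyncio.to_thread (Python 3.9+) collapses the two-line
        # get_running_loop()/run_in_executor() pattern into one call,
        # running shutil.rmtree in the default thread pool.
        await asyncio.to_thread(shutil.rmtree, local_path)


path = tempfile.mkdtemp()
asyncio.run(cleanup(path))
print(os.path.exists(path))  # False
```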

    def test_get_downloader_hf_not_exist():
        with pytest.raises(ModelNotFoundError) as exception:
    -       get_downloader("not_exsit_path/model")
    +       get_downloader("not_exist_path/model")
A collaborator commented:
nice catch

@Jeffwan merged commit 49216ae into main on Jan 31, 2026 (15 checks passed).
@Jeffwan deleted the xieus/dev branch on January 31, 2026 at 22:05.
