[Bug] Refactor downloader and artifact_service to be non-blocking #1895
Conversation
Summary of Changes

Hello @xieus, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the responsiveness and stability of the AIBrix runtime by converting blocking I/O operations in critical components, such as artifact downloading and cleanup, into non-blocking asynchronous calls. This change prevents the service's event loop from stalling, especially during large data transfers, thereby improving overall service availability and performance. The refactoring is complemented by a robust unit test suite for the downloader services and minor fixes to existing tests and type definitions.
Code Review
This pull request effectively addresses a key issue by refactoring blocking I/O operations in artifact_service and downloaders to be non-blocking, which is crucial for the performance of an async service. The introduction of loop.run_in_executor is the correct approach for this. The addition of a comprehensive unit test suite for the downloaders in test_downloaders.py is a significant improvement and greatly increases confidence in the changes. Furthermore, the refactoring in status.py to replace dynamic __getattr__ with explicit methods is a great move for improving type safety and code clarity.
My review includes a couple of suggestions on downloaders.py to further improve performance by parallelizing directory downloads, which currently happen sequentially. Overall, this is a high-quality pull request that significantly improves the robustness and performance of the system.
```python
async def download(
    self, source_url: str, local_path: str, credentials: Optional[Dict] = None
) -> str:
    """
    Download from S3 (Async wrapper).
    """
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None,
        functools.partial(self._download_sync, source_url, local_path, credentials),
    )
```
This change correctly makes the download operation non-blocking by running the synchronous logic in a thread pool executor. This prevents blocking the asyncio event loop.
However, for directory downloads, the files are still processed sequentially within the _download_sync method. To further enhance performance, especially for directories with many small files, you could consider parallelizing the individual file downloads.
A possible approach would be to refactor the logic to:
- List all files in the directory within run_in_executor.
- In the async download method, create a list of tasks, one for each file, using loop.run_in_executor to call s3_client.download_file.
- Use asyncio.gather to run these download tasks concurrently.

This would make directory downloads significantly faster.
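As an illustration only, here is a minimal sketch of that idea, assuming a plain boto3 client and a hypothetical helper name _download_s3_dir_concurrently; the actual bucket/prefix parsing and path layout in the downloader may differ:

```python
import asyncio
import functools
import os

import boto3


async def _download_s3_dir_concurrently(bucket: str, prefix: str, local_path: str) -> None:
    """Hypothetical sketch: download every object under an S3 prefix concurrently."""
    loop = asyncio.get_running_loop()
    s3_client = boto3.client("s3")

    # Listing the objects is itself a blocking call, so run it in the executor too.
    def _list_keys() -> list:
        keys = []
        paginator = s3_client.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            keys.extend(obj["Key"] for obj in page.get("Contents", []))
        return keys

    keys = await loop.run_in_executor(None, _list_keys)

    def _download_one(key: str) -> None:
        target = os.path.join(local_path, os.path.relpath(key, prefix))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        s3_client.download_file(bucket, key, target)

    # One executor task per file; asyncio.gather runs them concurrently on the
    # default thread pool without blocking the event loop.
    await asyncio.gather(
        *(loop.run_in_executor(None, functools.partial(_download_one, key)) for key in keys)
    )
```

In practice a bounded concurrency limit (for example an asyncio.Semaphore or a sized ThreadPoolExecutor) would likely be needed so very large directories do not queue one task per file all at once.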
Let us make it non-blocking and test the performance before any further optimization.
```python
async def download(
    self, source_url: str, local_path: str, credentials: Optional[Dict] = None
) -> str:
    """
    Download from GCS (Async wrapper).
    """
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None,
        functools.partial(self._download_sync, source_url, local_path, credentials),
    )
```
Similar to the S3 downloader, this change correctly makes the GCS download operation non-blocking.
For directory downloads, the implementation currently downloads files sequentially. For a performance boost, especially with directories containing numerous files, consider parallelizing these downloads. You could adapt the logic to list all blobs first, and then use asyncio.gather with loop.run_in_executor to download the files concurrently.
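For completeness, a similar hypothetical sketch for GCS, assuming a google-cloud-storage Client and an illustrative helper name _download_gcs_dir_concurrently:

```python
import asyncio
import functools
import os

from google.cloud import storage


async def _download_gcs_dir_concurrently(bucket_name: str, prefix: str, local_path: str) -> None:
    """Hypothetical sketch: download every blob under a GCS prefix concurrently."""
    loop = asyncio.get_running_loop()
    client = storage.Client()

    # list_blobs is blocking, so run it in the executor as well.
    blobs = await loop.run_in_executor(
        None, lambda: list(client.list_blobs(bucket_name, prefix=prefix))
    )

    def _download_one(blob: storage.Blob) -> None:
        target = os.path.join(local_path, os.path.relpath(blob.name, prefix))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        blob.download_to_filename(target)

    # Run each blob download in its own thread-pool task and gather them,
    # keeping the event loop responsive while the transfers proceed.
    await asyncio.gather(
        *(loop.run_in_executor(None, functools.partial(_download_one, blob)) for blob in blobs)
    )
```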
```python
if os.path.exists(local_path):
    try:
        shutil.rmtree(local_path)
        loop = asyncio.get_running_loop()
```
Since we use Python 3.10, you can use await asyncio.to_thread(shutil.rmtree, local_path) to replace lines 342 and 343.
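A minimal sketch of that suggestion (error handling around the call is omitted; the wrapper name is illustrative):

```python
import asyncio
import shutil


async def _cleanup_local_path(local_path: str) -> None:
    # asyncio.to_thread (available since Python 3.9) runs the blocking
    # shutil.rmtree on the default thread pool, so the event loop stays free.
    await asyncio.to_thread(shutil.rmtree, local_path)
```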
```diff
 def test_get_downloader_hf_not_exist():
     with pytest.raises(ModelNotFoundError) as exception:
-        get_downloader("not_exsit_path/model")
+        get_downloader("not_exist_path/model")
```
nice catch
Signed-off-by: Liguang Xie <[email protected]>
Pull Request Description
This PR refactors the downloaders to be non-blocking and refactors artifact_service to use non-blocking cleanup. It also improves type safety in status.py and fixes minor issues in the downloader and gcp_test.
Context
The AIBrix runtime is designed as an async service, but a few core components, specifically the artifact downloaders and the cleanup service, rely on synchronous, blocking libraries (boto3, google-cloud-storage, and shutil).
In an asynchronous event loop (like the one used by uvicorn or asyncio), calling a blocking function freezes the entire loop. This means while a 5GB model is downloading from S3, the service cannot respond to heartbeats, health checks, or other API requests.
Therefore, we propose this refactor to move these blocking calls off the event loop and keep the service responsive.
Related Issues
Resolves: #[N/A]
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
- [Bug]: Corrections to existing functionality
- [CI]: Changes to build process or CI pipeline
- [Docs]: Updates or additions to documentation
- [API]: Modifications to aibrix's API or interface
- [CLI]: Changes or additions to the Command Line Interface
- [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.