feat: Refactor filesystem operations to async #212
Co-authored-by: Copilot <[email protected]>
monoxgas
left a comment
No serious issues, just want to make sure aiofiles plays nice with UPath.
Also, the readme for aiofiles has commentary about common os modules like mkdir, unlink, etc. that might be worth a look: https://github.com/Tinche/aiofiles
@monoxgas yea, def needed distinct S3 support. Got messy with a single class so pulled out an S3 filesystem class (and assume we can do the same for other cloud fs's). Factory method keeps the singular tool interface unchanged though.
No serious issues from my end, would just like to see the tests/linting fixed up before we merge. Def understand the struggle underneath with async, support in fsspec is somewhat spotty depending on the provider, and UPath starts to fall apart in terms of value. Edge cases I'm sure are all over, so I think the overloads and class fallback there is a good approach. Long term we can probably return to a few things:
Interesting pattern here as well: https://github.com/iterative/morefs/blob/main/src/morefs/asyn_local.py
Rather than all the various subclassing, have we considered toolset variants?
I think tests/linting are all set. The only failing type check is for other code in the repo that isn't touched here. And happy to circle back on the FS tool long term for the issues noted above, documented notes.
Overview
This PR refactors the dreadnode/agent/tools/fs.py module with two major improvements:
Changes
Part 1: Async Transformation
All filesystem methods converted to async:
- read_file() - Now uses aiofiles for local, aioboto3 for S3
- read_lines() - Async file reading with line slicing
- write_file() - Async write operations
- write_file_bytes() - Binary async writes
- write_lines() - Async read-modify-write for line editing
- ls() - Async directory listing via asyncio.to_thread()
- glob() - Async pattern matching
- grep() - Parallel file searching with semaphore-based concurrency control
- cp() - Async copy operations
- mv() - Async move operations
- delete() - Async deletion
Part 2: Architecture Refactoring
Class Hierarchy
Before:
After:
Breaking Changes
None. The factory function maintains 100% backward compatibility:
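As a rough illustration of how a factory function can keep a single entry point while dispatching to per-backend classes, here is a minimal sketch. Every name in it (Filesystem, LocalFilesystem, S3Filesystem, make_filesystem) is hypothetical and not taken from the PR, and standard-library threads stand in for aiofiles and aioboto3 so the sketch runs without extra dependencies.

```python
from __future__ import annotations

import asyncio
import tempfile
from pathlib import Path


class Filesystem:
    """Hypothetical base class; the real class layout in the PR may differ."""

    async def read_file(self, path: str) -> str:
        raise NotImplementedError

    async def write_file(self, path: str, content: str) -> None:
        raise NotImplementedError


class LocalFilesystem(Filesystem):
    """Hypothetical local backend (the PR uses aiofiles; this uses threads)."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    async def read_file(self, path: str) -> str:
        # Run the blocking read in a worker thread so the event loop stays free.
        return await asyncio.to_thread((self.root / path).read_text)

    async def write_file(self, path: str, content: str) -> None:
        await asyncio.to_thread((self.root / path).write_text, content)


class S3Filesystem(Filesystem):
    """Hypothetical S3 backend stub; a real one would call aioboto3."""

    def __init__(self, root: str) -> None:
        self.root = root


def make_filesystem(root: str) -> Filesystem:
    """Hypothetical factory: one entry point, backend chosen by URL scheme."""
    if root.startswith("s3://"):
        return S3Filesystem(root)
    return LocalFilesystem(root)


async def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        fs = make_filesystem(tmp)
        # Callers construct the tool the same way as before,
        # but every method call is now awaited.
        await fs.write_file("note.txt", "hello")
        return await fs.read_file("note.txt")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the construction call is the same for local and S3 paths, and only the method calls gain an await.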
Migration required: Update all Filesystem method calls to use await:
Testing
Local Filesystem: 16/16 tests passing ✅
S3 Filesystem: 11/12 tests passing ✅
Performance Improvements
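The parallel grep() described in Part 1 relies on semaphore-based concurrency control. A minimal sketch of that pattern follows, assuming a local directory tree and a thread per blocking file scan. The function names, the result shape, and the default limit of 8 are illustrative assumptions, not the PR's implementation.

```python
from __future__ import annotations

import asyncio
import re
import tempfile
from pathlib import Path


def _search_file(path: Path, regex: re.Pattern[str]) -> list[tuple[str, int, str]]:
    """Blocking helper: return (file name, line number, line) for each match."""
    hits = []
    with path.open(encoding="utf-8", errors="ignore") as handle:
        for lineno, line in enumerate(handle, start=1):
            if regex.search(line):
                hits.append((path.name, lineno, line.rstrip("\n")))
    return hits


async def grep(root: Path, pattern: str, max_concurrency: int = 8):
    """Search every file under root, scanning at most max_concurrency at once."""
    regex = re.compile(pattern)
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(path: Path):
        # The semaphore caps how many file scans are in flight at any moment.
        async with semaphore:
            return await asyncio.to_thread(_search_file, path, regex)

    # Directory walking is done synchronously here to keep the sketch short.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    results = await asyncio.gather(*(bounded(p) for p in files))
    return [hit for file_hits in results for hit in file_hits]


async def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.txt").write_text("alpha\nTODO: fix\n")
        (root / "b.txt").write_text("beta\n")
        return await grep(root, r"TODO")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Bounding concurrency this way keeps a large tree from opening every file at once, while asyncio.gather still returns results in the order the files were listed.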
Dependencies
New required dependencies:
- aiofiles - Async local file I/O
New optional dependencies:
- aioboto3 - Async AWS S3 operations (only needed for S3 paths)
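One common way to keep a dependency such as aioboto3 optional is to import it lazily, only when an S3 path is actually used. The helper below is a hypothetical sketch of that idea. The name require_optional and the install hint in the error message are assumptions, not part of the PR.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, feature: str) -> ModuleType:
    """Import module_name on demand, with a clearer error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Assumed hint: the project may instead expose an extras group.
        raise ImportError(
            f"{feature} requires the optional dependency '{module_name}'. "
            f"Install it with: pip install {module_name}"
        ) from exc
```

An S3 backend could call require_optional("aioboto3", "S3 support") inside its constructor, so users who only touch local paths never need the package installed.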