Make it possible to exclude pyarrow dep #276
This allows clients to exclude the pyarrow dependency if they don't need it, which saves ~80MB and improves compatibility with older systems. Users who exclude it and then try to use it will still get a runtime error. Everything works as before unless users go out of their way to manually exclude this dependency (I'm not removing the dep; you need to exclude it yourself).
@azahed98 @artek0chumak could you review this?
@orangetin I'd love to get this reviewed and integrated (or hear it's not going to make it so I can maintain my fork). Should be a quick 2 min review if you know the right folks.
orangetin
left a comment
thanks for the PR! i'd like some changes before we can merge this:
- Move pyarrow to an optional dependency in a new group in the pyproject.toml file so it doesn't get installed by default
- Add the try/except wrapper (see comment below)
- Add a small note in the readme about this
src/together/utils/files.py
Outdated
def _check_parquet(file: Path) -> Dict[str, Any]:
    # in method import - this allows client to exclude the pyarrow dep if they don't need it. Saved ~80MB and more compatible with older systems.
    from pyarrow import ArrowInvalid, parquet
can you wrap this in a try/except with details on how to install this with the dependency group? something like pip install together[parquet]
… to use parquet files.

Example Error
```
$ uv run python test_pyarrow.py
Expected ImportError: pyarrow is not installed and is required to use parquet files. Please install it via `pip install together[pyarrow]`
```
Confirmed installing resolves issue:
```
uv pip install "dist/together-1.5.0-py3-none-any.whl[pyarrow]"
Resolved 33 packages in 394ms
Installed 1 package in 30ms
 + pyarrow==20.0.0
```
@orangetin made those changes. It should be ready.
orangetin
left a comment
lgtm!
@orangetin done!
Fixes #274
pyarrow has a few issues:
This change allows clients to exclude the pyarrow dependency if they don't need it. It's only used for parquet file validation, which not all users need.
Note: I'm not removing the dependency, just making it a run-time import. It still works as expected for all users, unless they go out of their way to manually exclude this dependency.
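To illustrate why a run-time import leaves other users unaffected, here is a minimal sketch. Only the `_check_parquet` name comes from this PR; `check_file`, `_check_jsonl`, the suffix dispatch, and the returned keys are hypothetical and do not reflect the library's actual layout.

```python
import json
from pathlib import Path
from typing import Any, Dict


def _check_jsonl(file: Path) -> Dict[str, Any]:
    # Hypothetical non-parquet check: every non-empty line must parse as JSON.
    with file.open() as f:
        for line in f:
            if line.strip():
                json.loads(line)
    return {"is_check_passed": True}


def _check_parquet(file: Path) -> Dict[str, Any]:
    # Deferred import: pyarrow is looked up only when this function runs.
    from pyarrow import parquet

    parquet.ParquetFile(file)
    return {"is_check_passed": True}


def check_file(file: Path) -> Dict[str, Any]:
    # Hypothetical dispatcher: only the parquet branch ever touches pyarrow.
    if file.suffix == ".parquet":
        return _check_parquet(file)
    return _check_jsonl(file)
```

With this shape, importing the module and validating a `.jsonl` file both succeed in an environment without pyarrow; only a call that reaches `_check_parquet` raises `ImportError`.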
Have you read the Contributing Guidelines?
yes
Issue #274