Make it possible to exclude pyarrow dep #276
Merged
orangetin merged 5 commits into togethercomputer:main on Jun 2, 2025
This allows clients to exclude the pyarrow dependency if they don't need it. Excluding it saves ~80MB and improves compatibility with older systems. Anyone who excludes it and then tries to use it will still get a runtime error. Everything works as before unless users go out of their way to exclude the dependency (I'm not removing it; you have to exclude it manually).
Member
@azahed98 @artek0chumak could you review this?
Contributor
Author
@orangetin I'd love to get this reviewed and integrated (or hear it's not going to make it so I can maintain my fork). Should be a quick 2 min review if you know the right folks.
orangetin requested changes on May 5, 2025
Member
orangetin left a comment
thanks for the PR! i'd like some changes before we can merge this:
- Make pyarrow an optional dependency in a new group in the pyproject.toml file so it doesn't get installed by default
- Add the try/except wrapper (see comment below)
- Add a small note in the readme about this
src/together/utils/files.py (Outdated)
def _check_parquet(file: Path) -> Dict[str, Any]:
    # in-method import - this allows clients to exclude the pyarrow dep if they don't need it. Saves ~80MB and is more compatible with older systems.
    from pyarrow import ArrowInvalid, parquet
Member
can you wrap this in a try/except with details on how to install this with the dependency group? something like pip install together[parquet]
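The wrapper requested here could look roughly like the sketch below. It is only an illustration: `check_parquet_sketch` is a hypothetical stand-in for `_check_parquet`, the returned `is_valid` key is an assumed report shape, and the error text is the one quoted later in this thread. The real function body is not shown in this conversation.

```python
from pathlib import Path
from typing import Any, Dict


def check_parquet_sketch(file: Path) -> Dict[str, Any]:
    """Hypothetical stand-in for _check_parquet (real body not shown in this thread)."""
    try:
        # Deferred import: pyarrow is only needed once a parquet file is checked.
        from pyarrow import ArrowInvalid, parquet
    except ImportError as exc:
        # Re-raise with install instructions, keeping the original error as the cause.
        raise ImportError(
            "pyarrow is not installed and is required to use parquet files. "
            "Please install it via `pip install together[pyarrow]`"
        ) from exc

    try:
        # Opening the file reads the parquet footer; a malformed file raises ArrowInvalid.
        parquet.ParquetFile(file)
    except ArrowInvalid:
        return {"is_valid": False}  # assumed report shape, for illustration only
    return {"is_valid": True}
```

Because the import sits inside the function, importing the surrounding module never touches pyarrow; only a call that actually checks a parquet file can raise the `ImportError`.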
… to use parquet files.

Example Error
```
$ uv run python test_pyarrow.py
Expected ImportError: pyarrow is not installed and is required to use parquet files. Please install it via `pip install together[pyarrow]`
```
Confirmed installing resolves issue:
```
uv pip install "dist/together-1.5.0-py3-none-any.whl[pyarrow]"
Resolved 33 packages in 394ms
Installed 1 package in 30ms
 + pyarrow==20.0.0
```
Contributor
Author
@orangetin made those changes. It should be ready.
Member
Contributor
Author
@orangetin done!
Fixes #274
pyarrow has a few issues:
This change allows clients to exclude the pyarrow dependency if they don't need it. It's only used for parquet file validation, which not all users need.
Note: I'm not removing the dependency, just making it a run-time import. It still works as expected for all users, unless they go out of their way to manually exclude this dependency.
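Since the import now happens at run time, a caller who has excluded pyarrow may want to check for it before attempting parquet validation. A minimal sketch using only the standard library follows; `is_installed` is a hypothetical helper and not part of the together package.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    # find_spec returns None when no installed top-level module has this name.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("pyarrow"):
        print("pyarrow is available; parquet validation can run")
    else:
        print("pyarrow is missing; parquet validation would raise ImportError")
```

A check like this avoids loading pyarrow just to find out whether it is there.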
Have you read the Contributing Guidelines?
yes
Issue #274