Implement _get_file_stats utility for consistent file metadata retrieval #1253
Open
varma1221 wants to merge 3 commits into malariagen:master from
What problem does this solve?
Currently, retrieving basic file metadata (such as size or protocol) requires repeating `_init_filesystem` and `fsspec` boilerplate across different modules. This leads to inconsistent handling of return types, especially for the `protocol` attribute, which some backends return as a string and others as a list. It also means there is no centralized validation for mandatory fields like file size.
How does it solve it?
This PR introduces a private utility, `_get_file_stats`, in `util.py`. It:
- Combines `size`, `mtime`, and `protocol` into a single dictionary.
- Normalizes `fs.protocol` into a single string regardless of the backend implementation.
- Raises `ValueError` instead of returning `None`, which prevents hard-to-debug crashes in downstream mathematical operations.
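For reviewers who want a concrete picture of the behaviour described above, here is a minimal sketch. It is not the code in this PR: the names `get_file_stats_sketch`, `normalize_protocol`, and `FakeFS` are hypothetical, the function takes an already-constructed filesystem object rather than calling `_init_filesystem`, and two details are assumptions on my part (the first alias is used when the protocol is a sequence, and `mtime` is read from the `info()` dictionary if present).

```python
from typing import Any, Dict


def normalize_protocol(protocol: Any) -> str:
    """Collapse a protocol given as a string or a list/tuple into one string."""
    if isinstance(protocol, (list, tuple)):
        if not protocol:
            raise ValueError("filesystem reports an empty protocol sequence")
        # Assumption: the first alias is treated as the canonical one.
        return str(protocol[0])
    return str(protocol)


def get_file_stats_sketch(fs: Any, path: str) -> Dict[str, Any]:
    """Return size, mtime and protocol for `path` as a single dictionary."""
    # A missing path is expected to raise FileNotFoundError inside info();
    # it is deliberately not caught here so that it propagates to the caller.
    info = fs.info(path)
    size = info.get("size")
    if size is None:
        # Fail loudly rather than hand None to downstream arithmetic.
        raise ValueError(f"no size reported for {path!r}")
    return {
        "size": int(size),
        # Assumption: mtime is optional and taken from info() when available.
        "mtime": info.get("mtime"),
        "protocol": normalize_protocol(fs.protocol),
    }


class FakeFS:
    """Hypothetical stand-in exposing only .protocol and .info()."""

    def __init__(self, entries: Dict[str, Dict[str, Any]], protocol: Any) -> None:
        self._entries = entries
        self.protocol = protocol

    def info(self, path: str) -> Dict[str, Any]:
        if path not in self._entries:
            raise FileNotFoundError(path)
        return self._entries[path]


if __name__ == "__main__":
    fs = FakeFS({"data.bin": {"size": 1024, "mtime": 1700000000.0}}, ["gs", "gcs"])
    print(get_file_stats_sketch(fs, "data.bin")["protocol"])  # gs
```

Passing the filesystem in as an argument keeps the sketch testable without any real backend. The actual utility may resolve the filesystem from the path itself.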
Relevant issue numbers
Part of ongoing improvements to internal I/O utilities.
Testing done
- Unit tests in `tests/test_util.py`.
- The `ValueError` case.
- `FileNotFoundError` propagates correctly.
- Ran `mypy` and `pre-commit` to ensure type safety and code quality.
Breaking changes
None. This is an internal utility addition.