error handling and user experience when connecting to remote storage systems #1249
Open
Aditya-vegi wants to merge 2 commits into malariagen:master
Enhance error handling for storage connection issues, including authentication errors with Google Cloud.
Improve storage initialization robustness and enhance authentication error handling
- Add default handling for storage_options
- Strengthen exception handling for filesystem initialization
- Provide clearer, context-aware error messages for authentication and permission failures
- Improve developer experience by guiding users toward actionable fixes
📌 Description
This PR improves error handling and user experience when connecting to remote storage systems, especially for Google Cloud Storage (GCS).
It addresses issues where users encounter failures such as:
credential propagation was unsuccessful
storage connection initialization errors
🚀 Changes Made
Added better exception handling around _init_filesystem
Improved error messaging to clearly indicate:
authentication failures
permission issues
Ensured storage_options defaults to an empty dictionary when not provided
Provided clearer guidance for users when access to restricted datasets is denied
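As an illustration of how authentication and permission failures might be told apart, here is a minimal sketch. It is not part of this PR, which raises a single combined message. The function name `connection_hint` and the mapping from exception types to hints are assumptions made for this example; real storage backends may raise different exception types.

```python
def connection_hint(exc: BaseException) -> str:
    """Map a low-level exception to a short, actionable hint (illustrative only)."""
    if isinstance(exc, ImportError):
        # A storage backend package is missing or broken.
        return "A required storage package could not be imported; check your installation."
    if isinstance(exc, PermissionError):
        # PermissionError is a subclass of OSError, so it must be checked first.
        return "Access was denied; check that your account has been granted access to the dataset."
    if isinstance(exc, OSError):
        # Any other OS-level failure: most likely a credential or connectivity problem.
        return "The connection failed; check that you are authenticated and that the URL is correct."
    return "An unexpected error occurred while connecting to storage."
```

A hint like this could be appended to the raised message so that users see which fix to try first.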
🛠️ Code Changes
    if storage_options is None:
        storage_options = {}
    try:
        # Initialise the filesystem and base path for the configured URL.
        self._fs, self._base_path = _init_filesystem(self._url, **storage_options)
    except (OSError, ImportError) as exc:
        # Chain the original exception so the underlying cause stays visible.
        raise IOError(
            "An error occurred establishing a connection to the storage system. "
            "This may be due to authentication failure or insufficient permissions."
        ) from exc
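To see the behaviour of this pattern in isolation, the following self-contained sketch may help. The `StorageClient` class, the `describe_failure` helper, and the stub `_init_filesystem` (which always fails) are hypothetical stand-ins written for this example; they are not the project's real code.

```python
from typing import Any, Optional, Tuple


def _init_filesystem(url: str, **storage_options: Any) -> Tuple[object, str]:
    """Hypothetical stand-in for the real helper; always fails like a bad login."""
    raise OSError(f"could not authenticate for {url}")


class StorageClient:
    """Hypothetical class mirroring the pattern shown in the change above."""

    def __init__(self, url: str, storage_options: Optional[dict] = None) -> None:
        self._url = url
        # Default to an empty dict so that ** unpacking never receives None.
        if storage_options is None:
            storage_options = {}
        try:
            self._fs, self._base_path = _init_filesystem(self._url, **storage_options)
        except (OSError, ImportError) as exc:
            # "from exc" keeps the low-level error available as __cause__.
            raise IOError(
                "An error occurred establishing a connection to the storage system. "
                "This may be due to authentication failure or insufficient permissions."
            ) from exc


def describe_failure(url: str) -> str:
    """Return the user-facing message plus the chained cause for a failed connection."""
    try:
        StorageClient(url)
    except IOError as err:
        return f"{err} (cause: {err.__cause__})"
    return "connected"
```

Calling `describe_failure("gs://example-bucket/data")` returns the friendly message followed by the original `OSError` text, which shows that exception chaining preserves the root cause for debugging.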
🧪 Testing
Tested in Google Colab environment
Verified behavior when:
user is not authenticated
user lacks dataset permissions
Confirmed appropriate error messages are raised
⚠️ Notes
Some datasets (e.g., MalariaGEN) require prior access approval
In Google Colab, users must authenticate using:
    from google.colab import auth
    auth.authenticate_user()
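Because `google.colab` is only importable inside Colab, a notebook meant to run in other environments as well could guard the call. The sketch below is one possible approach, not something this PR adds; the function name `authenticate_if_colab` and its `import_module` parameter (injected so the behaviour can be checked outside Colab) are assumptions for this example.

```python
import importlib
from typing import Any, Callable


def authenticate_if_colab(
    import_module: Callable[[str], Any] = importlib.import_module,
) -> bool:
    """Run Colab authentication when available; return False outside Colab."""
    try:
        # Equivalent to "from google.colab import auth" when running in Colab.
        colab_auth = import_module("google.colab.auth")
    except ImportError:
        # Not running in Colab: the caller must authenticate some other way.
        return False
    colab_auth.authenticate_user()
    return True
```

Outside Colab the function returns `False` instead of raising, so the caller can fall back to whatever credential setup suits the local environment.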
🎯 Impact
Improves debugging experience for users
Reduces confusion around authentication vs permission errors
Makes the system more robust and user-friendly