-
Notifications
You must be signed in to change notification settings - Fork 6
Improve URL building and display S3 upload mode per file #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Updated directupload and dvuploader modules to construct ticket request URLs using query parameters instead of the build_url helper. Also added API token to request headers for direct upload support check.
Set 'base_url' in httpx.AsyncClient for direct uploads to ensure correct request routing. Enhanced progress bar descriptions to indicate singlepart or multipart uploads. Added 'X-Dataverse-key' header in ticket request for authentication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request refactors HTTP request construction for Dataverse direct upload operations by eliminating the build_url utility function in favor of explicit parameter and header passing. The changes improve code readability and move API token authentication from query parameters to headers (a security best practice). Additionally, user experience is enhanced with progress bar descriptions that distinguish between singlepart and multipart S3 uploads.
Key changes:
- Replaced
build_url()calls with directparamsandheadersarguments in httpx requests - Added
base_urlparameter to httpx session initialization for proper scoping to Dataverse instances - Enhanced progress bars to display upload mode (Singlepart/Multipart) for each file
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| dvuploader/dvuploader.py | Refactored _has_direct_upload() to use direct parameter/header passing instead of build_url |
| dvuploader/directupload.py | Removed build_url import, added base_url to session configs, updated progress bar descriptions, and refactored _request_ticket() to use explicit parameters/headers |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
dvuploader/directupload.py
Outdated
| url = urljoin(dataverse_url, TICKET_ENDPOINT) | ||
|
|
||
| response = await session.get( | ||
| url, |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The session is initialized with base_url set to dataverse_url (line 72), so using urljoin(dataverse_url, TICKET_ENDPOINT) creates an absolute URL that bypasses the base_url. This is inconsistent with other functions like check_dataset_lock and _get_dataset_id which use relative URLs.
Consider using just TICKET_ENDPOINT directly:
response = await session.get(
TICKET_ENDPOINT,
timeout=None,
params={
"size": file_size,
"persistentId": persistent_id,
},
headers={
"X-Dataverse-key": api_token,
},
)| url = urljoin(dataverse_url, TICKET_ENDPOINT) | |
| response = await session.get( | |
| url, | |
| response = await session.get( | |
| TICKET_ENDPOINT, |
Replaces the full URL with the relative TICKET_ENDPOINT in the _request_ticket function when making the GET request. This change may be intended to rely on the session's base URL configuration for cleaner endpoint management.
This pull request refactors how HTTP requests are constructed for direct upload operations, simplifying the codebase and improving clarity. The most important changes include removing the
build_urlutility function in favor of directly passing parameters and headers to HTTP requests, and updating progress bar descriptions to clearly indicate singlepart and multipart uploads.Refactoring HTTP request construction:
Replaced usage of the
build_urlutility function with direct parameter and header passing in HTTP requests within bothdirectupload.pyanddvuploader.py, which simplifies the code and makes it more readable. [1] [2] [3] [4]Updated HTTP client session initialization to include
base_urlfor requests indirectupload.py, ensuring requests are correctly scoped to the Dataverse instance. [1] [2]User experience improvements: