Skip to content

Conversation

@JR-1991
Copy link
Member

@JR-1991 JR-1991 commented Nov 27, 2025

This pull request refactors how HTTP requests are constructed for direct upload operations, simplifying the codebase and improving clarity. The most important changes include removing the build_url utility function in favor of directly passing parameters and headers to HTTP requests, and updating progress bar descriptions to clearly indicate singlepart and multipart uploads.

Refactoring HTTP request construction:

  • Replaced usage of the build_url utility function with direct parameter and header passing in HTTP requests within both directupload.py and dvuploader.py, which simplifies the code and makes it more readable. [1] [2] [3] [4]

  • Updated HTTP client session initialization to include base_url for requests in directupload.py, ensuring requests are correctly scoped to the Dataverse instance. [1] [2]

User experience improvements:

  • Enhanced progress bar descriptions to indicate whether a file is being uploaded as singlepart or multipart, providing clearer feedback to users during uploads. [1] [2]

Updated directupload and dvuploader modules to construct ticket request URLs using query parameters instead of the build_url helper. Also added API token to request headers for direct upload support check.
Set 'base_url' in httpx.AsyncClient for direct uploads to ensure correct request routing. Enhanced progress bar descriptions to indicate singlepart or multipart uploads. Added 'X-Dataverse-key' header in ticket request for authentication.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors HTTP request construction for Dataverse direct upload operations by eliminating the build_url utility function in favor of explicit parameter and header passing. The changes improve code readability and move API token authentication from query parameters to headers (a security best practice). Additionally, user experience is enhanced with progress bar descriptions that distinguish between singlepart and multipart S3 uploads.

Key changes:

  • Replaced build_url() calls with direct params and headers arguments in httpx requests
  • Added base_url parameter to httpx session initialization for proper scoping to Dataverse instances
  • Enhanced progress bars to display upload mode (Singlepart/Multipart) for each file

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
dvuploader/dvuploader.py Refactored _has_direct_upload() to use direct parameter/header passing instead of build_url
dvuploader/directupload.py Removed build_url import, added base_url to session configs, updated progress bar descriptions, and refactored _request_ticket() to use explicit parameters/headers

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 218 to 221
url = urljoin(dataverse_url, TICKET_ENDPOINT)

response = await session.get(
url,
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The session is initialized with base_url set to dataverse_url (line 72), so using urljoin(dataverse_url, TICKET_ENDPOINT) creates an absolute URL that bypasses the base_url. This is inconsistent with other functions like check_dataset_lock and _get_dataset_id which use relative URLs.

Consider using just TICKET_ENDPOINT directly:

response = await session.get(
    TICKET_ENDPOINT,
    timeout=None,
    params={
        "size": file_size,
        "persistentId": persistent_id,
    },
    headers={
        "X-Dataverse-key": api_token,
    },
)
Suggested change
url = urljoin(dataverse_url, TICKET_ENDPOINT)
response = await session.get(
url,
response = await session.get(
TICKET_ENDPOINT,

Copilot uses AI. Check for mistakes.
Replaces the full URL with the relative TICKET_ENDPOINT in the _request_ticket function when making the GET request. This change may be intended to rely on the session's base URL configuration for cleaner endpoint management.
@JR-1991 JR-1991 merged commit 43d31ad into main Nov 27, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

Development

Successfully merging this pull request may close these issues.

2 participants