Skip to content

Conversation

@candleindark
Copy link
Member

@candleindark candleindark commented Dec 5, 2025

This PR reimplement the helper function dandi.utils.is_url so that it can determine whether a given string is a standard HTTP, HTTPS, FTP URL, or DANDI URL more precisely. Additionally, this solution supports DANDI URLs of different DANDI archive instances.

Remaining TODOs:

  • Add tests

@codecov
Copy link

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 85.71429% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.07%. Comparing base (7389791) to head (b8d1379).
⚠️ Report is 3 commits behind head on remove-hardcode.

Files with missing lines Patch % Lines
dandi/utils.py 78.57% 3 Missing ⚠️
Additional details and impacted files
@@                 Coverage Diff                 @@
##           remove-hardcode    #1767      +/-   ##
===================================================
+ Coverage            75.04%   75.07%   +0.02%     
===================================================
  Files                   84       84              
  Lines                11887    11905      +18     
===================================================
+ Hits                  8921     8938      +17     
- Misses                2966     2967       +1     
Flag Coverage Δ
unittests 75.07% <85.71%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@yarikoptic
Copy link
Member

What prompted this?

@candleindark
Copy link
Member Author

What prompted this?

It is target to merge into #1760. The current implementation only handles hardcoded DANDI urls started with "dandi:".

I will have different PRs merged into #1760 since each one of them warrant a separate judgement.

@candleindark candleindark force-pushed the remove-hardcode-modify-is_url branch 2 times, most recently from 9764a26 to 492e74e Compare December 8, 2025 21:56
@candleindark candleindark requested a review from Copilot December 8, 2025 21:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reimplements the is_url helper function to provide more precise URL validation for HTTP, HTTPS, FTP, and DANDI URLs, with support for different DANDI archive instances. The previous implementation used simple string prefix matching, while the new version uses pydantic's URL validators and DANDI's URL parser.

Key changes:

  • Replaced string prefix checking with pydantic TypeAdapter for standard URL validation
  • Integrated DANDI URL validation through parse_dandi_url for custom DANDI URL formats
  • Added comprehensive test coverage with 41 test cases covering valid and invalid URLs

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
dandi/utils.py Reimplements is_url() function using pydantic validators and parse_dandi_url() for more precise URL detection; adds module-level _url_adapter for performance
dandi/tests/test_utils.py Adds comprehensive test suite with TestIsUrl class covering HTTP/HTTPS/FTP URLs, DANDI identifiers, and invalid URL formats

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

So that it can determine whether a given
string is a standard HTTP, HTTPS, FTP URL,
or DANDI URL more precisely. Additionally, this
solution supports DANDI URL of different DANDI
archive instances.
@candleindark candleindark force-pushed the remove-hardcode-modify-is_url branch from 492e74e to b8d1379 Compare December 8, 2025 23:10
@candleindark candleindark marked this pull request as ready for review December 9, 2025 00:32
@candleindark
Copy link
Member Author

@yarikoptic This one is ready for review. I don't know what label is appropriate since it is to be merged to #1760.

@yarikoptic yarikoptic added the patch Increment the patch version when merged label Dec 11, 2025
@yarikoptic yarikoptic merged commit 1df204c into remove-hardcode Dec 11, 2025
35 of 36 checks passed
@yarikoptic yarikoptic deleted the remove-hardcode-modify-is_url branch December 11, 2025 12:06
@github-project-automation github-project-automation bot moved this from In Progress to Done in Multi-DANDI instance support Dec 11, 2025
yarikoptic pushed a commit that referenced this pull request Dec 15, 2025
* test: update `test_update_dandiset_from_doi`

So that the checks of `id` and `identifier`
in metadata allow different instance names

* test: update `test_download_dandiset_yaml`

So that the check of `id` in metadata allows
different instance names

* feat: remove block that fetch dandiset identifier that is irrelevant

See #1760 (comment)
for details

* doc: replace hardcoded "DANDI:" comment

To reflect that instance names are no longer
restricted to "DANDI"

* fix: strip instance name prefix from dandiset ID in `update_dandiset_from_doi`

This is only a replacement for the code that strips
the "DANDI:" prefix. The replacement works for dandisets
of different DANDI instances. However, `update_dandiset_from_doi`
is far from robust. This replacement doesn't fix underlying
weakness stems from assumptions. Further improvements of
`update_dandiset_from_doi` are needed.

* Reimplement `dandi.utils.is_url` (#1767)

* feat: reimplement `dandi.utils.is_url`

So that it can determine whether a given
string is a standard HTTP, HTTPS, FTP URL,
or DANDI URL more precisely. Additionally, this
solution supports DANDI URL of different DANDI
archive instances.

* test: Add tests for `dandi.utils.is_url`

* ci: test against vendored dandiapi other than default (#1771)

Configuer CI to test against a vendor specific DANDI API
other than the default instance of DANDI API
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

patch Increment the patch version when merged

Projects

Development

Successfully merging this pull request may close these issues.

3 participants