Skip to content

Add hypothesis strategy for SGRID dataset generation#2634

Merged
VeckoTheGecko merged 8 commits into
Parcels-code:mainfrom
VeckoTheGecko:push-lwkskpmnkxtu
May 18, 2026
Merged

Add hypothesis strategy for SGRID dataset generation#2634
VeckoTheGecko merged 8 commits into
Parcels-code:mainfrom
VeckoTheGecko:push-lwkskpmnkxtu

Conversation

@VeckoTheGecko
Copy link
Copy Markdown
Contributor

Description

This PR:

  • Adds a hypothesis strategy for SGRID dataset generation under parcels/_datasets/structured/strategies.py
  • Moves the strategies such that they are part of the parcels module (instead of the tests module) - tests.strategies -> parcels._strategies
    • Since the _datasets folder is in the Parcels package, I think this structure makes the most sense
    • Informatively errors out if someone tries to import strategies without having hypothesis installed

This in future will allow us to have more property based testing extending to full simulations - which would be really helpful for both improving the robustness and maintainability of the Parcels test suite.

Checklist

  • Closes None
  • Tests added
  • This PR targets the correct branch (main for normal development, v3-support for v3 support)

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.
    • Describe how you used it (e.g., by pasting your prompt): For writing tests/datasets/test_strategies.py

Copy link
Copy Markdown
Member

@erikvansebille erikvansebille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; just two questions below

Comment on lines +3 to +9
try:
import hypothesis # noqa: F401
except ImportError as err:
err.add_note(
"To use strategies you must have hypothesis installed. Install it from PyPI, Conda, or using your preffered package manager."
)
raise err
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? Wouldn't hypothesis simply be part of our Pixi install?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its part of our Pixi install, but its not a Parcels run dependency (i.e., its not in pixi.toml::run-dependencies or the recipe.yaml/pyproject.toml.

This is our first "optional dependency" for Parcels (i.e., a part of the codebase that needs a specific package in order to fulfill a function, but where it doesn't make sense to include it for everyone since most people wont use the specific function).

People doing conda install parcels then import parcels._strategies will encounter this more informative error message.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see. But then (in a next PR?) fix the type-o preffered to preferred?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is just good practice, but I'm surprised that the strategies are moved out of the tests directory. Would they not more logically belong there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - good question.

Most of our dataset generation code we have shipping with Parcels via parcels._datasets. This means that users can have easy access to Parcels-compatible example datasets and dataset generation for posting issues etc. . When adding strategies for creating example datasets, I needed to use existing strategies that were in the tests module. Since I cant put code in parcels that depends on the tests module - the options I saw were:

  1. Leave the strategies where they are in tests and carve out a part in tests/... for housing these dataset strategies. This would mean that some datasets are used via parcels/_datasets/... and some via tests/... which I felt was confusing for devs
  2. Move the dataset generation to tests - foregoing the benefit of users having easy access to datasets
  3. Move the strategies to parcels

I thought (3) was the best. It might seem weird to be putting "test" code in the parcels release, but its actually not that weird - some projects even put their whole test suites in the release itself.

@VeckoTheGecko VeckoTheGecko merged commit d05741d into Parcels-code:main May 18, 2026
16 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Parcels development May 18, 2026
@VeckoTheGecko VeckoTheGecko deleted the push-lwkskpmnkxtu branch May 18, 2026 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants