Root attributes #5

Merged
Claptar merged 7 commits into dev from root_attributes
Mar 23, 2026

@Claptar (Contributor) commented Mar 23, 2026

PR: Enforce AnnData root attrs and make subset outputs concat-friendly

Title

Enforce AnnData root attrs and make subset outputs concat-friendly

Summary

This PR updates subset/store behavior to improve AnnData schema compliance and downstream compatibility with disk-based concat workflows.

  • Root AnnData metadata is now enforced on writable store opens and copy operations.
  • Read mode emits a warning (not an error) when root attrs are missing.
  • Subset outputs now always include optional empty AnnData groups (layers, obsm, obsp, varm, varp) so generated files are structurally closer to full AnnData outputs.
  • Regression coverage added for root-attr handling and optional-group creation.
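
The root-attribute behavior in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in src/h5ad/storage/__init__.py: the function name `check_root_attrs` is hypothetical, a plain dict stands in for the root group's attrs, and it assumes that "enforced" means writing the attributes when they are missing or invalid. The values `encoding-type="anndata"` and `encoding-version="0.1.0"` are the ones the AnnData on-disk format uses for the root group.

```python
import warnings

# Root attributes the AnnData on-disk format expects on the top-level group.
ROOT_ATTRS = {"encoding-type": "anndata", "encoding-version": "0.1.0"}


def check_root_attrs(attrs: dict, writable: bool) -> bool:
    """Return True if the root attrs were already valid.

    `attrs` is a stand-in for the root group's attribute mapping
    (hypothetical; with h5py or zarr this would be `root.attrs`).
    """
    valid = all(attrs.get(key) == value for key, value in ROOT_ATTRS.items())
    if valid:
        return True
    if writable:
        # Writable open or copy: repair the metadata in place (assumed behavior).
        attrs.update(ROOT_ATTRS)
    else:
        # Read-only open: stay tolerant, but tell the caller.
        warnings.warn(
            "store is missing AnnData root attributes",
            UserWarning,
            stacklevel=2,
        )
    return False
```

With this shape, a writable open of a bare store ends up with both attributes set, while a read-only open leaves the store untouched and only emits a `UserWarning`.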

Files Changed

  • src/h5ad/storage/__init__.py
  • src/h5ad/core/subset.py
  • tests/test_storage_root_attrs.py
  • tests/test_subset.py
  • uv.lock

Validation

Targeted tests run:

/lustre/scratch124/cellgen/cellgeni/aljes/h5ad/.venv/bin/python -m pytest tests/test_storage_root_attrs.py tests/test_zarr.py -q
/lustre/scratch124/cellgen/cellgeni/aljes/h5ad/.venv/bin/python -m pytest tests/test_subset.py tests/test_zarr.py -q

Both passed.

Commits in this PR (dev..root_attributes)

  • 2bce366 Add function to ensure optional anndata groups in subset operations
  • 8553664 Add functions to ensure and validate AnnData root attributes in store operations
  • 796e74c Add tests for AnnData root encoding attributes enforcement and warnings
  • 4d44e46 Add test for optional empty groups in subset_h5ad function
  • 1911cdc Refactor test_subset_h5ad to improve readability and ensure optional empty groups are included in the subset output
  • a62f6e0 Bump h5ad package version to 0.3.1

@Claptar Claptar self-assigned this Mar 23, 2026
Copilot AI review requested due to automatic review settings March 23, 2026 16:14

Copilot AI (Contributor) left a comment

Pull request overview

This PR strengthens AnnData schema compliance by enforcing required root-level encoding attributes on writable store opens/copies, while keeping read-mode tolerant via warnings, and makes subset outputs more compatible with downstream concat workflows by always materializing optional empty AnnData groups.

Changes:

  • Enforce encoding-type / encoding-version root attributes on writable open_store(...) and in copy_store_contents(...); emit a warning (not an error) on read when missing/invalid.
  • Ensure subset outputs always contain optional empty AnnData groups: layers, obsm, obsp, varm, varp.
  • Add regression tests for root-attr enforcement/warnings and optional-group creation in subset output; bump locked package version.
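
A regression test for the read-mode warning could look something like the sketch below. It is not the content of tests/test_storage_root_attrs.py: the project runs pytest, while this version uses the standard library's `unittest` so it has no dependencies, and `open_root_attrs_read_only` is a hypothetical stand-in for a read-mode `open_store(...)` call operating on a plain dict of attrs.

```python
import unittest
import warnings


def open_root_attrs_read_only(attrs: dict) -> dict:
    """Hypothetical stand-in for a read-mode open: warn, never modify."""
    if attrs.get("encoding-type") != "anndata" or "encoding-version" not in attrs:
        warnings.warn("missing or invalid AnnData root attrs", UserWarning)
    return attrs


class RootAttrReadModeTest(unittest.TestCase):
    def test_missing_attrs_warn_but_do_not_raise(self):
        attrs = {}
        with self.assertWarns(UserWarning):
            result = open_root_attrs_read_only(attrs)
        # Read mode must leave the store untouched.
        self.assertEqual(result, {})

    def test_valid_attrs_are_silent(self):
        attrs = {"encoding-type": "anndata", "encoding-version": "0.1.0"}
        with warnings.catch_warnings():
            # Turn any warning into an exception so a stray warning fails the test.
            warnings.simplefilter("error")
            open_root_attrs_read_only(attrs)
```

The second test matters as much as the first: without it, an implementation that warns on every open would still pass.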

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/h5ad/storage/__init__.py | Adds root AnnData attr validation/enforcement and warning behavior; ensures copied stores include enforced root attrs. |
| src/h5ad/core/subset.py | Creates optional empty AnnData groups in subset outputs to be more concat-friendly. |
| tests/test_storage_root_attrs.py | New tests covering read-mode warning and writable-mode enforcement for root attrs. |
| tests/test_subset.py | Adds integration test asserting optional empty groups exist in subset outputs; minor formatting tweaks. |
| uv.lock | Updates locked editable project version to 0.3.1. |

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a62f6e0d43

Comment on lines +47 to +49
def _ensure_optional_anndata_groups(dst: Any) -> None:
    for key in ("layers", "obsm", "obsp", "varm", "varp"):
        _ensure_group(dst, key)

[P2] Set mapping attrs when creating optional AnnData groups

subset_h5ad now always creates layers/obsm/obsp/varm/varp when absent, but it creates these groups without any encoding-type/encoding-version metadata. According to this codebase's AnnData element docs, those members are mappings, and mappings must carry encoding-type="dict" and encoding-version="0.1.0". Creating the groups without attrs turns previously valid "absent optional member" outputs into malformed mapping groups, which can break schema-aware downstream readers and concat workflows.
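
One way to address this suggestion is sketched below, under stated assumptions. The `Group` dataclass is a tiny in-memory stand-in for an HDF5/Zarr group, and `ensure_mapping_group` is a hypothetical replacement for `_ensure_group`, whose real signature and behavior are not shown in this PR. The attr values are the ones quoted in the comment above.

```python
from dataclasses import dataclass, field

OPTIONAL_MAPPING_GROUPS = ("layers", "obsm", "obsp", "varm", "varp")
MAPPING_ATTRS = {"encoding-type": "dict", "encoding-version": "0.1.0"}


@dataclass
class Group:
    """In-memory stand-in for an HDF5/Zarr group (hypothetical)."""

    attrs: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)


def ensure_mapping_group(dst: Group, key: str) -> Group:
    """Create `key` under `dst` if absent and make sure it carries mapping attrs."""
    group = dst.children.setdefault(key, Group())
    for name, value in MAPPING_ATTRS.items():
        # setdefault keeps any attrs an existing group already carries.
        group.attrs.setdefault(name, value)
    return group


def ensure_optional_anndata_groups(dst: Group) -> None:
    """Materialize every optional mapping group, each tagged as a mapping."""
    for key in OPTIONAL_MAPPING_GROUPS:
        ensure_mapping_group(dst, key)
```

Because existing groups and their attrs are left alone, calling `ensure_optional_anndata_groups` on an output that already has a populated `obsm` only fills in the missing siblings.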


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Claptar Claptar merged commit 315c06f into dev Mar 23, 2026
6 of 12 checks passed
@Claptar Claptar deleted the root_attributes branch March 23, 2026 16:52