…empty groups are included in the subset output
Pull request overview
This PR strengthens AnnData schema compliance by enforcing required root-level encoding attributes on writable store opens/copies, while keeping read-mode tolerant via warnings, and makes subset outputs more compatible with downstream concat workflows by always materializing optional empty AnnData groups.
Changes:
- Enforce `encoding-type`/`encoding-version` root attributes on writable `open_store(...)` and in `copy_store_contents(...)`; emit a warning (not an error) on read when missing/invalid.
- Ensure subset outputs always contain optional empty AnnData groups: `layers`, `obsm`, `obsp`, `varm`, `varp`.
- Add regression tests for root-attr enforcement/warnings and optional-group creation in subset output; bump locked package version.
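The read/write asymmetry described in the first change can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: `check_root_attrs` and `ROOT_ATTRS` are hypothetical names, the expected values (`"anndata"` / `"0.1.0"`) are taken from the AnnData on-disk format, and the choice to repair attrs in writable mode (rather than raise) is an assumption. The function accepts any mutable mapping, so a plain dict stands in for a store's root attrs here.

```python
import warnings

# Assumed expected root attrs, per the AnnData on-disk format.
ROOT_ATTRS = {"encoding-type": "anndata", "encoding-version": "0.1.0"}


def check_root_attrs(attrs, writable):
    """Hypothetical helper: repair root attrs when writable, warn otherwise."""
    bad = [key for key, value in ROOT_ATTRS.items() if attrs.get(key) != value]
    if not bad:
        return
    if writable:
        # Writable open/copy: enforce the required attrs in place.
        for key in bad:
            attrs[key] = ROOT_ATTRS[key]
    else:
        # Read mode stays tolerant: warn, do not raise, do not modify.
        warnings.warn(
            "root group has missing or invalid attrs: " + ", ".join(bad),
            UserWarning,
            stacklevel=2,
        )


# Usage: a writable store gets its attrs filled in.
rw_attrs = {}
check_root_attrs(rw_attrs, writable=True)

# Usage: a read-only store is left untouched and only triggers a warning.
ro_attrs = {"encoding-type": "anndata"}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_root_attrs(ro_attrs, writable=False)
```

Keeping read mode non-fatal means older or hand-built files remain readable, while anything the tool writes is brought up to the expected root attrs.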
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/h5ad/storage/__init__.py` | Adds root AnnData attr validation/enforcement and warning behavior; ensures copied stores include enforced root attrs. |
| `src/h5ad/core/subset.py` | Creates optional empty AnnData groups in subset outputs to be more concat-friendly. |
| `tests/test_storage_root_attrs.py` | New tests covering read-mode warning and writable-mode enforcement for root attrs. |
| `tests/test_subset.py` | Adds integration test asserting optional empty groups exist in subset outputs; minor formatting tweaks. |
| `uv.lock` | Updates locked editable project version to 0.3.1. |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a62f6e0d43

    def _ensure_optional_anndata_groups(dst: Any) -> None:
        for key in ("layers", "obsm", "obsp", "varm", "varp"):
            _ensure_group(dst, key)

Set mapping attrs when creating optional AnnData groups
`subset_h5ad` now always creates `layers`/`obsm`/`obsp`/`varm`/`varp` when absent, but these groups are created without any `encoding-type`/`encoding-version` metadata. In this codebase’s AnnData element docs, those members are mappings, and mappings must carry `encoding-type="dict"` and `encoding-version="0.1.0"`. Creating the groups without attrs turns previously valid “absent optional member” outputs into malformed mapping groups, which can break schema-aware downstream readers and concat workflows.
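One way to address this suggestion is to tag each optional group with the mapping attrs at creation time. The sketch below is illustrative only: `ensure_optional_mapping_groups` is a hypothetical name (the PR's own helper is `_ensure_group`, whose body is not shown here), and `FakeGroup` is an in-memory stand-in that exposes just `require_group` and `attrs` so the example runs without a real store backend.

```python
OPTIONAL_MAPPINGS = ("layers", "obsm", "obsp", "varm", "varp")
# Mapping attrs as quoted in the review comment above.
MAPPING_ATTRS = {"encoding-type": "dict", "encoding-version": "0.1.0"}


class FakeGroup:
    """In-memory stand-in for a store group (assumption, for illustration)."""

    def __init__(self):
        self.attrs = {}
        self.children = {}

    def require_group(self, name):
        # Return the existing child group, or create an empty one.
        if name not in self.children:
            self.children[name] = FakeGroup()
        return self.children[name]


def ensure_optional_mapping_groups(dst):
    """Create any missing optional groups and tag them as AnnData mappings."""
    for key in OPTIONAL_MAPPINGS:
        group = dst.require_group(key)
        for attr, value in MAPPING_ATTRS.items():
            # Only fill in attrs that are absent; existing values are kept.
            if attr not in group.attrs:
                group.attrs[attr] = value


# Usage: "obsm" already exists with partial attrs; the rest are created.
root = FakeGroup()
root.require_group("obsm").attrs["encoding-type"] = "dict"
ensure_optional_mapping_groups(root)
```

Filling in only missing attrs keeps the helper idempotent, so calling it on a group that was copied from the source with valid metadata leaves that metadata unchanged.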
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
PR: Enforce AnnData root attrs and make subset outputs concat-friendly
Title
Enforce AnnData root attrs and make subset outputs concat-friendly
Summary
This PR updates subset/store behavior to improve AnnData schema compliance and downstream compatibility with disk-based concat workflows.
- … (`layers`, `obsm`, `obsp`, `varm`, `varp`) so generated files are structurally closer to full AnnData outputs.
Files Changed
- `src/h5ad/storage/__init__.py`
- `src/h5ad/core/subset.py`
- `tests/test_storage_root_attrs.py`
- `tests/test_subset.py`
- `uv.lock`
Validation
Targeted tests run:
Both passed.
Commits in this PR (`dev..root_attributes`)