perf(score_genes): avoid copy-heavy sparse nan mean path by SID-6921 · Pull Request #4159 · scverse/scanpy

SID-6921 · 2026-06-14T07:44:20Z

Problem: _sparse_nanmean currently copies the sparse matrix and calls eliminate_zeros.
Change: aggregate sums and NaN counts directly via the compressed index pointers (indptr) of CSR/CSC matrices, without copying the matrix.
Correctness: preserves np.nanmean-equivalent behavior for sparse matrices.
Tests: expanded test_sparse_nanmean to run on both csr and csc formats.
Validation command: ANNDATA_ZARR_WRITE_FORMAT=3 python -m pytest tests/test_score_genes.py -k sparse_nanmean -q
Closes #1894 (_sparse_nanmean is inefficient)
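For readers without the diff at hand, the approach described above can be sketched roughly as follows. This is a minimal illustration assembled from the description and the fragments quoted in the review, not the code in this PR: the function name `sparse_nanmean_sketch` is hypothetical, and the branch that uses `mat.indices` when the reduced axis is the major axis is an assumption about how both formats could be handled without conversion.

```python
import numpy as np
from scipy import sparse


def sparse_nanmean_sketch(mat, axis):
    """NaN-ignoring mean of a CSR/CSC matrix along `axis` (hypothetical sketch)."""
    if axis not in (0, 1):
        msg = "axis must be 0 or 1"
        raise ValueError(msg)
    if mat.format not in ("csr", "csc"):
        msg = "expected a CSR or CSC matrix"
        raise TypeError(msg)

    out_axis = 1 - axis           # the axis that survives the reduction
    out_size = mat.shape[out_axis]
    n_reduced = mat.shape[axis]   # entries per output slot, implicit zeros included

    major_axis = 0 if mat.format == "csr" else 1
    if major_axis == out_axis:
        # Each indptr segment corresponds to exactly one output slot.
        segment_lengths = np.diff(mat.indptr)
        segment_ids = np.repeat(np.arange(out_size), segment_lengths)
    else:
        # Assumption: the stored minor-axis indices already name the output slot.
        segment_ids = mat.indices

    isnan = np.isnan(mat.data)
    sums = np.bincount(
        segment_ids[~isnan], weights=mat.data[~isnan], minlength=out_size
    )
    nan_counts = np.bincount(segment_ids[isnan], minlength=out_size)
    counts = n_reduced - nan_counts  # implicit zeros count as valid values

    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


# Compare against np.nanmean on the dense equivalent, for both formats and axes.
dense = np.array([[1.0, np.nan, 0.0], [0.0, 2.0, np.nan], [3.0, 0.0, 4.0]])
for fmt in ("csr", "csc"):
    mat = sparse.csr_matrix(dense).asformat(fmt)
    for axis in (0, 1):
        expected = np.nanmean(dense, axis=axis)
        assert np.allclose(sparse_nanmean_sketch(mat, axis), expected)
```

The sketch only reads `data`, `indices`, and `indptr`, so the input matrix is never copied or modified. It assumes canonical input without duplicate entries.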

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR optimizes _sparse_nanmean() for compressed sparse matrices and expands unit tests to validate behavior across sparse storage formats.

Changes:

Reworked _sparse_nanmean() to avoid sparse matrix copies/eliminate_zeros() and compute reductions via indptr-based aggregation.
Added explicit runtime validation for axis values.
Extended tests to run _sparse_nanmean() against both CSR and CSC inputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
tests/test_score_genes.py	Parameterizes the test to exercise both CSR and CSC matrix formats.
src/scanpy/tools/_score_genes.py	Replaces copy-heavy NaN-mean computation with a pointer-based aggregation approach and adds axis validation.


+    with np.errstate(invalid="ignore", divide="ignore"):
+        return sums / counts
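As a small illustration of why the division is wrapped in `np.errstate`, the sketch below uses made-up `sums` and `counts` arrays (not values from this PR). A slot whose entries are all NaN ends up with a count of zero, and dividing `0.0` by `0` yields NaN, which is also the value `np.nanmean` returns for an all-NaN slice.

```python
import numpy as np

# Hypothetical per-slot aggregates: the second slot had only NaN entries,
# so its non-NaN sum is 0.0 and its valid-entry count is 0.
sums = np.array([6.0, 0.0])
counts = np.array([3, 0])

# Suppress the floating-point warnings NumPy would emit for 0.0 / 0.
with np.errstate(invalid="ignore", divide="ignore"):
    means = sums / counts

print(means)  # 2.0 for the first slot, NaN for the second
```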


+    segment_ids = np.repeat(np.arange(out_size), segment_lengths)
+    isnan = np.isnan(mat.data)
+
+    sums = np.bincount(
+        segment_ids[~isnan],
+        weights=mat.data[~isnan],
+        minlength=out_size,
+    ).astype(np.float64, copy=False)
+    nan_counts = np.bincount(segment_ids[isnan], minlength=out_size)
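To make the fragment above concrete, here is a tiny worked example on hand-written buffers. The `data` and `indptr` arrays are hypothetical, and deriving `segment_lengths` with `np.diff(indptr)` is an assumption, since that line is not part of the quoted diff.

```python
import numpy as np

# Hypothetical compressed buffers for a matrix with 3 major-axis slots:
# slot 0 stores [1.0, nan], slot 1 stores nothing, slot 2 stores [nan, 4.0, 5.0].
data = np.array([1.0, np.nan, np.nan, 4.0, 5.0])
indptr = np.array([0, 2, 2, 5])
out_size = len(indptr) - 1

segment_lengths = np.diff(indptr)  # [2, 0, 3]
# Label every stored value with the slot it belongs to: [0, 0, 2, 2, 2]
segment_ids = np.repeat(np.arange(out_size), segment_lengths)
isnan = np.isnan(data)

# Sum the non-NaN values per slot and count the NaN values per slot.
sums = np.bincount(segment_ids[~isnan], weights=data[~isnan], minlength=out_size)
nan_counts = np.bincount(segment_ids[isnan], minlength=out_size)

print(sums)        # [1. 0. 9.]
print(nan_counts)  # [1 0 1]
```

Passing `minlength=out_size` keeps the output length fixed even when trailing slots store no values, so the result always lines up with the reduced axis.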


+    if axis not in (0, 1):
+        msg = "axis must be 0 or 1"
+        raise ValueError(msg)


SID-6921 · 2026-06-14T07:46:53Z

Maintainers: this PR is failing the metadata gate because I cannot apply labels from a fork. Could you please add the `no milestone` label (or assign a milestone)? The code checks pass locally.

codecov · 2026-06-14T07:52:11Z

Codecov Report

❌ Patch coverage is 85.71429% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.60%. Comparing base (2ae768e) to head (0a84825).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/scanpy/tools/_score_genes.py	85.71%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4159      +/-   ##
==========================================
- Coverage   79.61%   79.60%   -0.02%     
==========================================
  Files         120      120              
  Lines       12786    12790       +4     
==========================================
+ Hits        10180    10181       +1     
- Misses       2606     2609       +3

Flag	Coverage Δ
hatch-test.low-vers	`78.85% <85.71%> (+<0.01%)`	⬆️
hatch-test.pre	`79.46% <85.71%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
src/scanpy/tools/_score_genes.py	`86.20% <85.71%> (-1.30%)`	⬇️

... and 2 files with indirect coverage changes

perf(score_genes): avoid sparse copies in _sparse_nanmean

0a84825

Copilot AI review requested due to automatic review settings June 14, 2026 07:44

Copilot AI reviewed Jun 14, 2026

SID-6921 wants to merge 1 commit into scverse:main from SID-6921:perf/sparse-nanmean-no-copy

2 participants
