Skip to content

tests: remove_zero_variance_vars and remove_contaminants#25

Open
leventetn wants to merge 2 commits intomainfrom
tests/pp.remove_contaminants
Open

tests: remove_zero_variance_vars and remove_contaminants#25
leventetn wants to merge 2 commits intomainfrom
tests/pp.remove_contaminants

Conversation

@leventetn
Copy link
Copy Markdown
Collaborator

No description provided.

@read-the-docs-community
Copy link
Copy Markdown

read-the-docs-community bot commented Mar 24, 2026

Documentation build overview

📚 proteopy | 🛠️ Build #32144651 | 📁 Comparing f5c0775 against latest (6962ed7)

  🔍 Preview build  

Show files changed (39 files in total): 📝 33 modified | ➕ 3 added | ➖ 3 deleted
File Status
changelog.html 📝 modified
genindex.html 📝 modified
index.html 📝 modified
references.html 📝 modified
api/datasets.html 📝 modified
api/pl.html 📝 modified
api/pp.html 📝 modified
api/read.html 📝 modified
api/tl.html 📝 modified
api/utils.html 📝 modified
news/2026-03-31_preprint.html ➖ deleted
news/index.html 📝 modified
tutorials/bludau-2021_tissue-specific-proteoform-inference-across-five-mouse-organs.html ➖ deleted
tutorials/index.html 📝 modified
tutorials/karayel-2020_proteome-remodeling-during-human-erythropoiesis.html ➖ deleted
tutorials/proteoform-inference_bludau-2021_reproduction.html ➕ added
tutorials/workflow_protein-analysis_karayel-2020.html ➕ added
api/generated/proteopy.pl.completeness.html ➕ added
api/generated/proteopy.pl.completeness_per_sample.html 📝 modified
api/generated/proteopy.pl.completeness_per_var.html 📝 modified
api/generated/proteopy.pl.cv_by_group.html 📝 modified
api/generated/proteopy.pl.n_cat1_per_cat2_hist.html 📝 modified
api/generated/proteopy.pl.n_peptides_per_protein.html 📝 modified
api/generated/proteopy.pl.n_peptides_per_sample.html 📝 modified
api/generated/proteopy.pl.n_proteins_per_sample.html 📝 modified
api/generated/proteopy.pl.n_proteoforms_per_protein.html 📝 modified
api/generated/proteopy.pl.proteoform_scores.html 📝 modified
api/generated/proteopy.pl.sample_correlation_matrix.html 📝 modified
api/generated/proteopy.pp.impute_downshift.html 📝 modified
api/generated/proteopy.read.diann.html 📝 modified
api/generated/proteopy.utils.is_log_transformed.html 📝 modified
api/manual/maxquant.html 📝 modified
_modules/proteopy/pl/copf.html 📝 modified
_modules/proteopy/pl/stat_tests.html 📝 modified
_modules/proteopy/pl/stats.html 📝 modified
_modules/proteopy/pp/imputation.html 📝 modified
_modules/proteopy/read/diann.html 📝 modified
_modules/proteopy/utils/array.html 📝 modified
_modules/proteopy/utils/stat_tests.html 📝 modified

@idf-io idf-io force-pushed the tests/pp.remove_contaminants branch from 4c32aa6 to f5c0775 Compare April 7, 2026 06:40
@idf-io
Copy link
Copy Markdown
Collaborator

idf-io commented Apr 10, 2026

General

  • Put new file paths into .github/workflows/format-code_perform-tests_on_push-pr.yaml for linting

remove_zero_variance_vars

@leventetn please review

Missing things

  1. CRITICAL:
  • Include tests for single sample anndata
  • Test atol for var >, == and < situations
  • Readability: Include comment about what the var in relevant vars is when non-obvious if it is >, == or < than atol.
  1. Recommended:
  • Assert result is not None when inplace == False (e.g. line 1283)
  • Add adata.var.empty() == True assertion when expected dataframe has no data (e.g. 1303)
  1. Optional:
  • Avoid _make_XXX helpers place in beginning of file when only called once. Place before being used.

with pytest.warns(UserWarning, match=r"at least one group"):
filtered = remove_zero_variance_vars(
adata, group_by="group", inplace=False,
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check for remaining vars

adata, group_by="group", inplace=False,
)
assert "peptide_id" in filtered.var.columns
assert "protein_id" in filtered.var.columns
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: PEP8

Image


# ── G. Data type variants ──────────────────────────────────────

def test_float32_matrix(self):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the reason behind this test? @leventetn
Remember, you are responsible for every line of code you write.

protein_key="uniprot_id",
inplace=False,
)
assert list(filtered.var_names) == ["protein_0", "protein_3", "protein_4"]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a more robust test you need to use "uniprot_id"s that don't match the adata.var_names in both name and order. In the current version of the test could have the bug that if the function remove_contaminants uses the adata.var_names even if an alternative protein_key is supplied, the test would still pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants