Adding unit_testing examples from presentation by Arun-bioinformatics · Pull Request #19 · eastgenomics/code_school

Arun-bioinformatics · 2026-02-23T17:07:02Z

Updated the main README.md to specify the inclusion of the new content
Added a second README.md inside the Unit_testing_python_2026 with some explanation on how to run pytest.

This change is

Summary by CodeRabbit

Documentation
- Added a "2026 additions" section and expanded learning-session README with a unit testing presentation, usage instructions, prerequisites and test commands.
New Features
- Added utilities for ClinVar significance lookup, DNA GC-content calculation and BED file length computation.
Tests
- Added comprehensive unit tests covering normal cases, edge cases and error handling for all new utilities.

- Updated the main README.md to specify the inclusion of the new content - Added a second README.md inside the Unit_testing_python_2026 with some explanation on how to run pytest.

coderabbitai · 2026-02-23T17:07:24Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Jira integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 41a6b0b and 1af58e4.

📒 Files selected for processing (1)

learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py

Walkthrough

Adds a Unit Testing learning session: three new utility modules (GC content, BED length, ClinVar lookup), READMEs updated with instructions, and pytest suites covering normal, edge and error cases including mocks and file fixtures.

Changes

Cohort / File(s)	Summary
Documentation `README.md`, `learning-sessions/Unit_testing_python_2026/example_code/README.md`	Inserted "2026 additions" and a Codeschool presentation README describing purpose, prerequisites, and commands to run `pytest` and `pytest-cov`.
Utility modules `learning-sessions/Unit_testing_python_2026/example_code/bin/script.py`, `learning-sessions/Unit_testing_python_2026/example_code/bin/script_bed.py`, `learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py`	Added `calculate_gc_content(sequence)` with validation and rounding, `calculate_total_bed_length(filepath)` reading BED with strict validation and explicit errors, and `get_clinvar_significance(variation_id)` calling ClinVar API, handling HTTP and JSON errors.
Tests `learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py`, `learning-sessions/Unit_testing_python_2026/example_code/tests/test_script_bed.py`, `learning-sessions/Unit_testing_python_2026/example_code/tests/test_clinvar_utils.py`	Added pytest modules covering expected results, edge cases, and exception paths; includes fixtures, file-based BED fixtures, and mocking for external API responses.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the primary objective: adding unit testing examples from a presentation, which is clearly demonstrated by the new learning-sessions directory and comprehensive test files.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch Unit_testing_codeschool_2026

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (1)

learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py (1)

39-41: Consider moving return to the else block for clarity.

Having return significance inside the try body subtly conflates the success path with the error path. Moving it to an else block makes the intent explicit — which is a useful teaching point for the audience.

♻️ Proposed refactor

     try:
         significance = data['clinical_significance']['description']
-        return significance
     except KeyError as exc:
         raise ValueError(
             f"Could not find significance data for ID {variation_id}") from exc
+    else:
+        return significance

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py`
around lines 39 - 41, The try block in the function that reads clinical
significance (the code setting significance =
data['clinical_significance']['description']) currently returns inside the try;
change this to return in an else block after the try/except so the success path
is explicit: keep the same try to extract significance, keep the except handling
as-is, and add an else: return significance after the try/except (refer to the
significance variable and the extraction expression
data['clinical_significance']['description'] to locate the code).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@learning-sessions/Unit_testing_python_2026/example_code/README.md`:
- Line 5: The README instructs users to run pytest in `unit_test_example_code`
which doesn't exist; either update the text to reference the actual folder
`example_code` or rename the directory to `unit_test_example_code`. Edit the
README.md text (the sentence that mentions `unit_test_example_code`) to use
`example_code` (or change the directory name to match the README) so the command
`pytest` runs from the correct folder; ensure any other README references are
consistent with the `example_code` directory name.
- Around line 1-9: The README is missing the pytest-mock dependency needed by
tests/test_clinvar_utils.py which uses the mocker fixture; update the docs to
declare pytest-mock as a test/dev dependency and show how to install it (e.g.,
pip install pytest-mock) or add it to your project's
requirements/dev-requirements, and mention that pytest should be run after
installing this dependency (the pytest and pytest-cov instructions should remain
but note pytest-mock is required for the mocker fixture to work).

In
`@learning-sessions/Unit_testing_python_2026/example_code/tests/test_clinvar_utils.py`:
- Around line 24-31: The test parameter name is wrong: test_significance_logic
currently requests a nonexistent fixture `_clinvar_mock` so pytest will error;
rename the test parameter to the registered fixture name `clinvar_mock` so the
fixture (declared with `@pytest.fixture`(name="clinvar_mock")) is applied and the
call to get_clinvar_significance('67890') runs as intended.
- Line 1: The module docstring in tests/test_clinvar_utils.py incorrectly
references "calculate_gc_content"; update the test file docstring to reference
the actual function under test by replacing "calculate_gc_content function" with
"get_clinvar_significance function" so the docstring correctly describes the
tests for get_clinvar_significance in bin/clinvar_utils.py.

In
`@learning-sessions/Unit_testing_python_2026/example_code/tests/test_script_bed.py`:
- Line 1: The module docstring in the test file incorrectly references
calculate_gc_content; update the top-line docstring to mention
calculate_total_bed_length instead. Edit the string literal at the top of
tests/test_script_bed.py to read something like "Unit tests for the
calculate_total_bed_length function in bin/script_bed.py" so the docstring
accurately reflects the tested function (calculate_total_bed_length).
- Around line 6-11: The fixture docstring for valid_bed_factory claims "3-line"
but the written file has only 2 lines; update the docstring in the pytest
fixture valid_bed (function valid_bed_factory) to accurately state "2-line" (or
"two-line") to match the content written to f ("valid.bed"), or alternatively
add a third BED line to f.write_text if you intended three lines—ensure the
docstring and the actual data produced by valid_bed_factory stay consistent.

In
`@learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py`:
- Around line 42-44: The regex in the pytest.raises match contains unescaped
dots which act as wildcards; update the test in the with pytest.raises(...)
block that calls calculate_gc_content("ATGCX") so the expected message uses an
escaped-dot pattern (e.g., use a raw string with \. for each period) or simplify
to a partial, literal-safe substring (e.g., "Sequence contains invalid
characters" or "Only A, T, G, C are allowed") to avoid unintended matches;
ensure the change targets the pytest.raises(..., match=...) argument in
tests/test_script.py.

---

Nitpick comments:
In
`@learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py`:
- Around line 39-41: The try block in the function that reads clinical
significance (the code setting significance =
data['clinical_significance']['description']) currently returns inside the try;
change this to return in an else block after the try/except so the success path
is explicit: keep the same try to extract significance, keep the except handling
as-is, and add an else: return significance after the try/except (refer to the
significance variable and the extraction expression
data['clinical_significance']['description'] to locate the code).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Jira integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5fb742a and 564fcaa.

📒 Files selected for processing (10)

README.md
learning-sessions/Unit_testing_python_2026/example_code/README.md
learning-sessions/Unit_testing_python_2026/example_code/bin/__init__.py
learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py
learning-sessions/Unit_testing_python_2026/example_code/bin/script.py
learning-sessions/Unit_testing_python_2026/example_code/bin/script_bed.py
learning-sessions/Unit_testing_python_2026/example_code/tests/__init__.py
learning-sessions/Unit_testing_python_2026/example_code/tests/test_clinvar_utils.py
learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py
learning-sessions/Unit_testing_python_2026/example_code/tests/test_script_bed.py

coderabbitai · 2026-02-23T17:15:12Z

learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py

+    url = ("https://api.ncbi.nlm.nih.gov/variation/v0/beta/"
+           f"clinical-significance/variation/{variation_id}")


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

NCBI Variation Services API v0 beta clinical-significance endpoint status 2025

💡 Result:

In 2025, NCBI’s Variation Services API v0 was still publicly available (e.g., called out alongside dbSNP Build 157 release products) and still described publicly as “beta testing mode.” [1], [2]

For the “v0 beta” RefSNP route, NCBI appears to have been moving away from the /beta/ path: requests to …/variation/v0/beta/refsnp/{rsid} redirect to the non-beta route …/variation/v0/refsnp/{rsid}/. [3]

Also, there isn’t a distinct, standalone “clinical-significance endpoint” in v0 that you call directly; rather, clinical significance is provided as fields inside RefSNP/allele-style responses (i.e., you fetch a RefSNP record and read its clinical-significance-related fields). [4]

Sources:
[1] dbSNP Build 157 Release (NCBI Insights, 2025-03-18) (ncbiinsights.ncbi.nlm.nih.gov)
[2] SPDI Variation Service listing (catalog.data.gov; metadata updated 2025-06-19; notes “still in beta testing mode”) (catalog.data.gov)
[3] Live API behavior: /variation/v0/beta/refsnp/8127500 redirects to /variation/v0/refsnp/8127500/ (api.ncbi.nlm.nih.gov)
[4] Example of calling the RefSNP API (/variation/v0/refsnp/{rsid}) rather than a clinical-significance-specific endpoint (github-wiki-see.page)

Citations:

1: https://ncbiinsights.ncbi.nlm.nih.gov/2025/03/18/dbsnp-release-157/?utm_source=openai

2: https://catalog.data.gov/dataset/spdi-variation-service?utm_source=openai

3: https://api.ncbi.nlm.nih.gov/variation/v0/beta/refsnp/8127500

4: https://github-wiki-see.page/m/USF-HII/snptk/wiki/dbSNPJson?utm_source=openai

Update the endpoint structure; clinical-significance data is embedded in RefSNP responses, not accessed via a standalone endpoint.

The /beta/ path is deprecated—NCBI redirects /variation/v0/beta/refsnp/ requests to /variation/v0/refsnp/. More fundamentally, there's no standalone clinical-significance/variation/ endpoint in v0. Clinical significance is returned as a field within RefSNP or allele responses. The code should fetch the relevant RefSNP or allele record and extract clinical-significance data from that response instead.

It doesn't really matter whether the function works or not. This is just supposed to be an example for unit testing code.

@Arun-bioinformatics, understood—since this is example code for demonstrating unit testing principles, the API endpoint accuracy isn't critical. Thanks for clarifying!

✏️ Learnings added

Learnt from: Arun-bioinformatics Repo: eastgenomics/code_school PR: 19 File: learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py:27-28 Timestamp: 2026-02-26T09:59:34.602Z Learning: In the eastgenomics/code_school repository, code within the `learning-sessions/` directory is educational material intended to demonstrate concepts (e.g., unit testing), not production-ready code. Functional correctness of external API calls or similar implementation details may not be the primary focus.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

learning-sessions/Unit_testing_python_2026/example_code/README.md

learning-sessions/Unit_testing_python_2026/example_code/tests/test_clinvar_utils.py

learning-sessions/Unit_testing_python_2026/example_code/tests/test_script_bed.py

learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py

Arun-bioinformatics

@Arun-bioinformatics made 2 comments.
Reviewable status: 0 of 10 files reviewed, 2 unresolved discussions (waiting on RSWilson1).

Arun-bioinformatics · 2026-02-26T09:59:02Z

learning-sessions/Unit_testing_python_2026/example_code/bin/clinvar_utils.py

+    url = ("https://api.ncbi.nlm.nih.gov/variation/v0/beta/"
+           f"clinical-significance/variation/{variation_id}")


It doesn't really matter whether the function works or not. This is just supposed to be an example for unit testing code.

learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py

Arun-bioinformatics

Nitpick comment addressed

@Arun-bioinformatics made 1 comment.
Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on RSWilson1).

Arun-bioinformatics

@Arun-bioinformatics made 1 comment.
Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on RSWilson1).

learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py

Adding unit_testing examples from presentation

564fcaa

- Updated the main README.md to specify the inclusion of the new content - Added a second README.md inside the Unit_testing_python_2026 with some explanation on how to run pytest.

Arun-bioinformatics requested a review from RSWilson1 February 23, 2026 17:07

coderabbitai bot reviewed Feb 23, 2026

View reviewed changes

Address PEP8 and coderabbit issues.

41a6b0b

Arun-bioinformatics commented Feb 26, 2026

View reviewed changes

One final coderabbit comment to address

1af58e4

Arun-bioinformatics commented Feb 26, 2026

View reviewed changes

learning-sessions/Unit_testing_python_2026/example_code/tests/test_script.py Outdated Show resolved Hide resolved

		url = ("https://api.ncbi.nlm.nih.gov/variation/v0/beta/"
		f"clinical-significance/variation/{variation_id}")

Conversation

Arun-bioinformatics commented Feb 23, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Arun-bioinformatics Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Arun-bioinformatics left a comment

Choose a reason for hiding this comment

Uh oh!

Arun-bioinformatics Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Arun-bioinformatics left a comment

Choose a reason for hiding this comment

Uh oh!

Arun-bioinformatics left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Arun-bioinformatics commented Feb 23, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 23, 2026 •

edited

Loading

coderabbitai bot Feb 23, 2026 •

edited

Loading