Commit 4751dcd

Merge pull request #28 from pythonhealthdatascience/dev

2 parents: 070ff44 + 198511d

18 files changed: +509 additions, −171 deletions

README.md

Lines changed: 4 additions & 2 deletions

@@ -1,6 +1,8 @@
-# HDR UK Futures RSE: Testing in Research Workflows
+# STARS: Testing in Research Workflows
 
-This repository contains code behind our website with materials for the 'Testing in Research Workflows' module on HDR UK's RSE001 <a href='https://hdruklearn.org/courses/course-v1:HDRUK+RSE001+2024' target ='_blank'>Research Software Engineering training course</a>. It was developed as part of the <a href='https://pythonhealthdatascience.github.io/stars/' target='_blank'>STARS project</a>.
+This repository contains the code behind the website on **Testing in Research Workflows**, developed as part of the <a href='https://pythonhealthdatascience.github.io/stars/' target='_blank'>STARS project</a> (Sharing Tools and Artefacts for Reproducible Simulation). The materials help researchers integrate testing practices into their workflows, with guidance on testing in Python and R.
+
+These resources are used in HDR UK's RSE001 <a href='https://hdruklearn.org/courses/course-v1:HDRUK+RSE001+2024' target ='_blank'>Research Software Engineering training course</a> ('Testing in Research Workflows' module), but are also intended as a reusable, open educational resource for the wider research software community.
 
 You can view the website here: <https://pythonhealthdatascience.github.io/stars-testing-intro/>
 

_quarto.yml

Lines changed: 2 additions & 1 deletion

@@ -28,6 +28,7 @@ website:
         - pages/parametrising_tests.qmd
     - section: Types of test
       contents:
+        - pages/smoke_tests.qmd
         - pages/unit_tests.qmd
         - pages/functional_tests.qmd
         - pages/back_tests.qmd
@@ -49,4 +50,4 @@ format:
     css:
       - assets/styles.css
       - assets/language-selector.css
-    include-after-body: assets/language-selector.js
\ No newline at end of file
+    include-after-body: assets/language-selector.js
Lines changed: 35 additions & 0 deletions

@@ -0,0 +1,35 @@
+"""
+Smoke test.
+"""
+
+import pandas as pd
+
+from waitingtimes.patient_analysis import (
+    import_patient_data, calculate_wait_times, summary_stats
+)
+
+
+def test_smoke(tmp_path):
+    """Smoke: end-to-end workflow produces the expected final output shape."""
+    # Create test data
+    test_data = pd.DataFrame(
+        {
+            "PATIENT_ID": ["p1", "p2", "p3"],
+            "ARRIVAL_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
+            "ARRIVAL_TIME": ["0800", "0930", "1015"],
+            "SERVICE_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
+            "SERVICE_TIME": ["0830", "1000", "1045"],
+        }
+    )
+
+    # Write test CSV
+    csv_path = tmp_path / "patients.csv"
+    test_data.to_csv(csv_path, index=False)
+
+    # Run complete workflow
+    df = import_patient_data(csv_path)
+    df = calculate_wait_times(df)
+    stats = summary_stats(df["waittime"])
+
+    # Final check
+    assert stats is not None
Lines changed: 25 additions & 0 deletions

@@ -0,0 +1,25 @@
+# Smoke test
+
+test_that("smoke: end-to-end workflow produces some output", {
+  # Create small, fast test data
+  test_data <- data.frame(
+    PATIENT_ID = c("p1", "p2", "p3"),
+    ARRIVAL_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    ARRIVAL_TIME = c("0800", "0930", "1015"),
+    SERVICE_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    SERVICE_TIME = c("0830", "1000", "1045"),
+    stringsAsFactors = FALSE
+  )
+
+  # Write test CSV to a temporary file
+  csv_path <- tempfile(fileext = ".csv")
+  utils::write.csv(test_data, csv_path, row.names = FALSE)
+
+  # Run complete workflow
+  df <- import_patient_data(csv_path)
+  df <- calculate_wait_times(df)
+  stats <- summary_stats(df$waittime)
+
+  # Final smoke-test check: did we get *any* result?
+  expect_false(is.null(stats))
+})

images/website_screenshot.png

242 KB

pages/back_tests.qmd

Lines changed: 2 additions & 2 deletions

@@ -26,7 +26,7 @@ We will run back tests using the [dataset we introduced for our waiting times ca
 
 ::: {.python-content}
 
-On the test page, we need to import:
+In our test script, we need to import:
 
 ```{python}
 #| eval: false
@@ -92,4 +92,4 @@ testthat::test_file(
 
 **Errors:** If you identify an error in your pipeline, you first fix the code and then deliberately update the back test in isolation, so you know the only change in behaviour is the error fix and not something unintended elsewhere.
 
-**Changes over time:** As your research evolves, you may update the workflow (e.g., improve the wait time calculation method) or use more recent datasets. You can keep the old back test running alongside new ones - this verifies that changes to the workflow don't accidentally alter results on historical data, while new back tests validate that updated methods work correctly on current data.
\ No newline at end of file
+**Changes over time:** As your research evolves, you may update the workflow (e.g., improve the wait time calculation method) or use more recent datasets. You can keep the old back test running alongside new ones - this verifies that changes to the workflow don't accidentally alter results on historical data, while new back tests validate that updated methods work correctly on current data.
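The page changed above describes back tests in prose. As a concrete illustration, here is a minimal, self-contained Python sketch of the idea, using a hypothetical `run_pipeline` stand-in (not the module's real functions) and results "frozen" when the test was first written:

```python
import pandas as pd


def run_pipeline(df):
    """Hypothetical stand-in for the import -> calculate -> summarise workflow."""
    arrival = pd.to_datetime(
        df["ARRIVAL_DATE"] + " " + df["ARRIVAL_TIME"], format="%Y-%m-%d %H%M"
    )
    service = pd.to_datetime(
        df["SERVICE_DATE"] + " " + df["SERVICE_TIME"], format="%Y-%m-%d %H%M"
    )
    out = df.copy()
    out["waittime"] = (service - arrival).dt.total_seconds() / 60  # minutes
    return out


# Historical dataset, and the results it produced when the test was written
historical = pd.DataFrame({
    "ARRIVAL_DATE": ["2024-01-01", "2024-01-01"],
    "ARRIVAL_TIME": ["0800", "0930"],
    "SERVICE_DATE": ["2024-01-01", "2024-01-01"],
    "SERVICE_TIME": ["0830", "1000"],
})
frozen_waits = [30.0, 30.0]


def test_back():
    """Back test: the current code must reproduce the frozen historical results."""
    result = run_pipeline(historical)
    assert list(result["waittime"]) == frozen_waits
```

If the wait-time calculation were later changed (for example, to report hours instead of minutes), this test would fail, flagging that the change altered results on historical data.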

pages/break.qmd

Lines changed: 14 additions & 2 deletions

@@ -66,7 +66,7 @@ Instead, they now see an average of 0.5 in the output.
 
 * They might not immediately realise that 0.5 means 0.5 hours.
 
-* They might start looking for some other explanation: "Has demand changed?" "Is the data different?"" "Did we filter a different subset?""
+* They might start looking for some other explanation: "Has demand changed?" "Is the data different?" "Did we filter a different subset?"
 
 * They might even present the wrong numbers.
 
@@ -76,6 +76,18 @@ The code is still "working" in the sense that it runs successfully and produces
 
 This is where tests come in handy! After the change to hours, the functional and back tests will both fail.
 
+::: {.python-content}
+
+{{< video /videos/python_break_test.mp4 >}}
+
+:::
+
+::: {.r-content}
+
+{{< video /videos/r_break_test.mp4 >}}
+
+:::
+
 These failures tell the team something important: the problem is in the code, not in the new data or some hidden change in demand.
 
-Without these tests, this kind of "innocent" change could easily slip through and only be discovered much later.
\ No newline at end of file
+Without these tests, this kind of "innocent" change could easily slip through and only be discovered much later.
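The "innocent" change this page discusses boils down to a single divisor. A tiny illustrative snippet (hypothetical, since the real `calculate_wait_times` is not shown in this diff) of how a 30-minute wait silently becomes 0.5 in the output:

```python
import pandas as pd

# One patient: arrives 08:00, seen 08:30
arrival = pd.to_datetime(["2024-01-01 08:00"])
service = pd.to_datetime(["2024-01-01 08:30"])
elapsed = (service - arrival).total_seconds()

wait_minutes = elapsed / 60    # original behaviour: 30.0
wait_hours = elapsed / 3600    # after the change: 0.5, same column, different units
```

Nothing errors, nothing warns; only a test pinned to the expected values catches the shift.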

pages/code/test_intro_parametrised__test_summary_stats.py

Lines changed: 9 additions & 0 deletions

@@ -1,3 +1,12 @@
+@pytest.mark.parametrize(
+    "data, expected_mean, expected_std, expected_ci_lower, expected_ci_upper",
+    [
+        # Five-value sample with known summary statistics
+        ([1.0, 2.0, 3.0, 4.0, 5.0], 3.0, 1.58, 1.04, 4.96),
+        # No variation: CI collapses to the mean
+        ([5, 5, 5], 5, 0, 5, 5),
+    ],
+)
 def test_summary_stats(
     data, expected_mean, expected_std, expected_ci_lower, expected_ci_upper
 ):

pages/code/test_smoke__imports.py

Lines changed: 4 additions & 0 deletions

@@ -0,0 +1,4 @@
+import pandas as pd
+from waitingtimes.patient_analysis import (
+    import_patient_data, calculate_wait_times, summary_stats
+)
Lines changed: 23 additions & 0 deletions

@@ -0,0 +1,23 @@
+test_that("smoke: end-to-end workflow produces some output", {
+  # Create small, fast test data
+  test_data <- data.frame(
+    PATIENT_ID = c("p1", "p2", "p3"),
+    ARRIVAL_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    ARRIVAL_TIME = c("0800", "0930", "1015"),
+    SERVICE_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    SERVICE_TIME = c("0830", "1000", "1045"),
+    stringsAsFactors = FALSE
+  )
+
+  # Write test CSV to a temporary file
+  csv_path <- tempfile(fileext = ".csv")
+  utils::write.csv(test_data, csv_path, row.names = FALSE)
+
+  # Run complete workflow
+  df <- import_patient_data(csv_path)
+  df <- calculate_wait_times(df)
+  stats <- summary_stats(df$waittime)
+
+  # Final smoke-test check: did we get *any* result?
+  expect_false(is.null(stats))
+})

0 commit comments