Commit 4751dcd

Merge pull request #28 from pythonhealthdatascience/dev

2 parents: 070ff44 + 198511d

18 files changed: +509 additions, −171 deletions

README.md

Lines changed: 4 additions & 2 deletions

@@ -1,6 +1,8 @@
-# HDR UK Futures RSE: Testing in Research Workflows
+# STARS: Testing in Research Workflows
 
-This repository contains code behind our website with materials for the 'Testing in Research Workflows' module on HDR UK's RSE001 <a href='https://hdruklearn.org/courses/course-v1:HDRUK+RSE001+2024' target ='_blank'>Research Software Engineering training course</a>. It was developed as part of the <a href='https://pythonhealthdatascience.github.io/stars/' target='_blank'>STARS project</a>.
+This repository contains the code behind the website on **Testing in Research Workflows**, developed as part of the <a href='https://pythonhealthdatascience.github.io/stars/' target='_blank'>STARS project</a> (Sharing Tools and Artefacts for Reproducible Simulation). The materials help researchers integrate testing practices into their workflows, with guidance on testing in Python and R.
+
+These resources are used in HDR UK's RSE001 <a href='https://hdruklearn.org/courses/course-v1:HDRUK+RSE001+2024' target ='_blank'>Research Software Engineering training course</a> ('Testing in Research Workflows' module), but are also intended as a reusable, open educational resource for the wider research software community.
 
 You can view the website here: <https://pythonhealthdatascience.github.io/stars-testing-intro/>
 

_quarto.yml

Lines changed: 2 additions & 1 deletion

@@ -28,6 +28,7 @@ website:
         - pages/parametrising_tests.qmd
     - section: Types of test
       contents:
+        - pages/smoke_tests.qmd
         - pages/unit_tests.qmd
         - pages/functional_tests.qmd
         - pages/back_tests.qmd
@@ -49,4 +50,4 @@ format:
     css:
       - assets/styles.css
       - assets/language-selector.css
-    include-after-body: assets/language-selector.js
\ No newline at end of file
+    include-after-body: assets/language-selector.js
Lines changed: 35 additions & 0 deletions

@@ -0,0 +1,35 @@
+"""
+Smoke test.
+"""
+
+import pandas as pd
+
+from waitingtimes.patient_analysis import (
+    import_patient_data, calculate_wait_times, summary_stats
+)
+
+
+def test_smoke(tmp_path):
+    """Smoke: end-to-end workflow produces the expected final output shape."""
+    # Create test data
+    test_data = pd.DataFrame(
+        {
+            "PATIENT_ID": ["p1", "p2", "p3"],
+            "ARRIVAL_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
+            "ARRIVAL_TIME": ["0800", "0930", "1015"],
+            "SERVICE_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
+            "SERVICE_TIME": ["0830", "1000", "1045"],
+        }
+    )
+
+    # Write test CSV
+    csv_path = tmp_path / "patients.csv"
+    test_data.to_csv(csv_path, index=False)
+
+    # Run complete workflow
+    df = import_patient_data(csv_path)
+    df = calculate_wait_times(df)
+    stats = summary_stats(df["waittime"])
+
+    # Final check
+    assert stats is not None
Lines changed: 25 additions & 0 deletions

@@ -0,0 +1,25 @@
+# Smoke test
+
+test_that("smoke: end-to-end workflow produces some output", {
+  # Create small, fast test data
+  test_data <- data.frame(
+    PATIENT_ID = c("p1", "p2", "p3"),
+    ARRIVAL_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    ARRIVAL_TIME = c("0800", "0930", "1015"),
+    SERVICE_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    SERVICE_TIME = c("0830", "1000", "1045"),
+    stringsAsFactors = FALSE
+  )
+
+  # Write test CSV to a temporary file
+  csv_path <- tempfile(fileext = ".csv")
+  utils::write.csv(test_data, csv_path, row.names = FALSE)
+
+  # Run complete workflow
+  df <- import_patient_data(csv_path)
+  df <- calculate_wait_times(df)
+  stats <- summary_stats(df$waittime)
+
+  # Final smoke-test check: did we get *any* result?
+  expect_false(is.null(stats))
+})

images/website_screenshot.png

242 KB

pages/back_tests.qmd

Lines changed: 2 additions & 2 deletions

@@ -26,7 +26,7 @@ We will run back tests using the [dataset we introduced for our waiting times ca
 
 ::: {.python-content}
 
-On the test page, we need to import:
+In our test script, we need to import:
 
 ```{python}
 #| eval: false
@@ -92,4 +92,4 @@ testthat::test_file(
 
 **Errors:** If you identify an error in your pipeline, you first fix the code and then deliberately update the back test in isolation, so you know the only change in behaviour is the error fix and not something unintended elsewhere.
 
-**Changes over time:** As your research evolves, you may update the workflow (e.g., improve the wait time calculation method) or use more recent datasets. You can keep the old back test running alongside new ones - this verifies that changes to the workflow don't accidentally alter results on historical data, while new back tests validate that updated methods work correctly on current data.
\ No newline at end of file
+**Changes over time:** As your research evolves, you may update the workflow (e.g., improve the wait time calculation method) or use more recent datasets. You can keep the old back test running alongside new ones - this verifies that changes to the workflow don't accidentally alter results on historical data, while new back tests validate that updated methods work correctly on current data.
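The page changed above describes back tests in prose. As a concrete illustration, here is a minimal, self-contained Python sketch of the idea, using a hypothetical `run_pipeline` stand-in (not the module's real functions) and results "frozen" when the test was first written:

```python
import pandas as pd


def run_pipeline(df):
    """Hypothetical stand-in for the import -> calculate -> summarise workflow."""
    arrival = pd.to_datetime(
        df["ARRIVAL_DATE"] + " " + df["ARRIVAL_TIME"], format="%Y-%m-%d %H%M"
    )
    service = pd.to_datetime(
        df["SERVICE_DATE"] + " " + df["SERVICE_TIME"], format="%Y-%m-%d %H%M"
    )
    out = df.copy()
    out["waittime"] = (service - arrival).dt.total_seconds() / 60  # minutes
    return out


# Historical dataset, and the results it produced when the test was written
historical = pd.DataFrame({
    "ARRIVAL_DATE": ["2024-01-01", "2024-01-01"],
    "ARRIVAL_TIME": ["0800", "0930"],
    "SERVICE_DATE": ["2024-01-01", "2024-01-01"],
    "SERVICE_TIME": ["0830", "1000"],
})
frozen_waits = [30.0, 30.0]


def test_back():
    """Back test: the current code must reproduce the frozen historical results."""
    result = run_pipeline(historical)
    assert list(result["waittime"]) == frozen_waits
```

If the wait-time calculation were later changed (for example, to report hours instead of minutes), this test would fail, flagging that the change altered results on historical data.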

pages/break.qmd

Lines changed: 14 additions & 2 deletions

@@ -66,7 +66,7 @@ Instead, they now see an average of 0.5 in the output.
 
 * They might not immediately realise that 0.5 means 0.5 hours.
 
-* They might start looking for some other explanation: "Has demand changed?" "Is the data different?"" "Did we filter a different subset?""
+* They might start looking for some other explanation: "Has demand changed?" "Is the data different?" "Did we filter a different subset?"
 
 * They might even present the wrong numbers.
 
@@ -76,6 +76,18 @@ The code is still "working" in the sense that it runs successfully and produces
 
 This is where tests come in handy! After the change to hours, the functional and back tests will both fail.
 
+::: {.python-content}
+
+{{< video /videos/python_break_test.mp4 >}}
+
+:::
+
+::: {.r-content}
+
+{{< video /videos/r_break_test.mp4 >}}
+
+:::
+
 These failures tell the team something important: the problem is in the code, not in the new data or some hidden change in demand.
 
-Without these tests, this kind of "innocent" change could easily slip through and only be discovered much later.
\ No newline at end of file
+Without these tests, this kind of "innocent" change could easily slip through and only be discovered much later.
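The "innocent" change this page discusses boils down to a single divisor. A tiny illustrative snippet (hypothetical, since the real `calculate_wait_times` is not shown in this diff) of how a 30-minute wait silently becomes 0.5 in the output:

```python
import pandas as pd

# One patient: arrives 08:00, seen 08:30
arrival = pd.to_datetime(["2024-01-01 08:00"])
service = pd.to_datetime(["2024-01-01 08:30"])
elapsed = (service - arrival).total_seconds()

wait_minutes = elapsed / 60    # original behaviour: 30.0
wait_hours = elapsed / 3600    # after the change: 0.5, same column, different units
```

Nothing errors, nothing warns; only a test pinned to the expected values catches the shift.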

pages/code/test_intro_parametrised__test_summary_stats.py

Lines changed: 9 additions & 0 deletions

@@ -1,3 +1,12 @@
+@pytest.mark.parametrize(
+    "data, expected_mean, expected_std, expected_ci_lower, expected_ci_upper",
+    [
+        # Five-value sample with known summary statistics
+        ([1.0, 2.0, 3.0, 4.0, 5.0], 3.0, 1.58, 1.04, 4.96),
+        # No variation: CI collapses to the mean
+        ([5, 5, 5], 5, 0, 5, 5),
+    ],
+)
 def test_summary_stats(
     data, expected_mean, expected_std, expected_ci_lower, expected_ci_upper
 ):

pages/code/test_smoke__imports.py

Lines changed: 4 additions & 0 deletions

@@ -0,0 +1,4 @@
+import pandas as pd
+from waitingtimes.patient_analysis import (
+    import_patient_data, calculate_wait_times, summary_stats
+)
Lines changed: 23 additions & 0 deletions

@@ -0,0 +1,23 @@
+test_that("smoke: end-to-end workflow produces some output", {
+  # Create small, fast test data
+  test_data <- data.frame(
+    PATIENT_ID = c("p1", "p2", "p3"),
+    ARRIVAL_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    ARRIVAL_TIME = c("0800", "0930", "1015"),
+    SERVICE_DATE = c("2024-01-01", "2024-01-01", "2024-01-02"),
+    SERVICE_TIME = c("0830", "1000", "1045"),
+    stringsAsFactors = FALSE
+  )
+
+  # Write test CSV to a temporary file
+  csv_path <- tempfile(fileext = ".csv")
+  utils::write.csv(test_data, csv_path, row.names = FALSE)
+
+  # Run complete workflow
+  df <- import_patient_data(csv_path)
+  df <- calculate_wait_times(df)
+  stats <- summary_stats(df$waittime)
+
+  # Final smoke-test check: did we get *any* result?
+  expect_false(is.null(stats))
+})

0 commit comments