Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
6b44d2f
Initial plan
Copilot Jan 9, 2026
368fb4d
Add initial Stan implementation with model files and R functions
Copilot Jan 9, 2026
5d88e45
Add examples, fix linting issues, and update WORDLIST for Stan functions
Copilot Jan 9, 2026
530fc7f
Update README and copilot instructions to document Stan support
Copilot Jan 9, 2026
e973b69
Add *.knit.* to .gitignore and install cmdstanr in copilot-setup-step…
Copilot Jan 21, 2026
d5ad3ce
Wrap run_mod_stan examples in \dontrun{} to fix R CMD check failure
Copilot Jan 27, 2026
fc3a564
Merge branch 'main' into copilot/convert-jags-model-to-stan
Copilot May 7, 2026
1a91c47
Merge branch 'main' into copilot/convert-jags-model-to-stan
d-morrison May 7, 2026
d71bb31
Refactor to extract shared code between JAGS and Stan functions
Copilot May 7, 2026
0fe0ef5
Fix review comments: draws indexing, stan_fit storage, NA handling, n…
Copilot May 7, 2026
28e7a9d
Fix review comments: README list formatting, tibble::as_tibble, facto…
Copilot May 7, 2026
deb065a
Fix review comments: NA checking, prior validation, example format, b…
Copilot May 7, 2026
cf5d7f6
Fix review comments: add badger to Suggests, handle NA strat levels, …
Copilot May 8, 2026
d8599f8
Fix open review comments: update install URL, fix factor indexing, cl…
Copilot May 8, 2026
c1f61ad
Add comprehensive tests for Stan functions (prep_data_stan, prep_prio…
Copilot May 8, 2026
18b91e6
Fix CI failures: lint issues, test warnings, and pkgdown reference index
Copilot May 8, 2026
6bfa6ba
Add explicit pre-review validation requirements to copilot-instructio…
Copilot May 8, 2026
a31a95c
Fix review comments: remove nburn/nmc from tests, fix add_newperson d…
Copilot May 8, 2026
e8ceafb
Remove add_newperson parameter from prep_data_stan, add sample_predic…
Copilot May 9, 2026
951468c
Fix review comments: stratification validation, calc_fit_mod per-stra…
Copilot May 9, 2026
b1e4701
Fix review comments: rewrite sample_predictive_stan, fix docs, remove…
Copilot May 11, 2026
2c9e7c3
Address code review feedback: clarify Subnum indexing and rename popu…
Copilot May 11, 2026
2798684
Fix linting errors: line length issues in prep_priors.R and test-stan…
Copilot May 11, 2026
ff8441d
Fix review comments: remove duplicate n_params, use population-level …
Copilot May 12, 2026
3db4454
Fix mu_par indexing, add empty data validation, add multi-antigen test
Copilot May 12, 2026
cd36771
Fix sample_predictive_stan: transform mu_par from log scale, compute …
Copilot May 12, 2026
01af5c0
Changes before error encountered
Copilot May 12, 2026
3730197
Use markdown syntax instead of Rd syntax in roxygen2 documentation
Copilot May 12, 2026
b40d15a
Address review comments: NA validation, antigen ordering, markdown sy…
Copilot May 12, 2026
4360bad
Fix antigen attribute access: use plain vector, not $Iso_type
Copilot May 12, 2026
f0320a5
Add NA stratification validation and relax nchain constraint
Copilot May 13, 2026
b5c4af4
Fix misleading error message for empty stratification list
Copilot May 13, 2026
a28ba17
Add validation for max_antigens and CmdStan installation check
Copilot May 13, 2026
6a60ad4
Update documentation for with_post parameter and sample_predictive_stan
Copilot May 13, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 74 additions & 11 deletions .github/copilot-instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,17 @@

## Repository Overview

**serodynamics** is an R package for modeling longitudinal antibody responses to infection. It implements Bayesian MCMC methods using JAGS (Just Another Gibbs Sampler) to estimate antibody dynamic curve parameters including baseline concentration, peak concentration, time to peak, shape parameter, and decay rate.
**serodynamics** is an R package for modeling longitudinal antibody responses to infection. It implements Bayesian MCMC methods to estimate antibody dynamic curve parameters including baseline concentration, peak concentration, time to peak, shape parameter, and decay rate.

The package supports two Bayesian modeling backends:
- **JAGS** (Just Another Gibbs Sampler) via `runjags` - the original implementation
- **Stan** via `cmdstanr` - a modern, efficient alternative (optional)

- **Type**: R package (statistical modeling)
- **Size**: ~121MB, ~209 files, ~76 R source files, ~1,664 lines of R code
- **Language**: R (>= 4.1.0)
- **Key Dependencies**: runjags, rjags, JAGS 4.3.1, serocalculator, ggmcmc, dplyr, ggplot2
- **Optional Dependencies**: cmdstanr (for Stan support)
- **Lifecycle**: Experimental (under active development)

## Critical Setup Requirements
Expand Down Expand Up @@ -213,9 +218,9 @@ devtools::check_man()

- **Windows**: Install Rtools from https://cran.r-project.org/bin/windows/Rtools/ (choose version matching your R version)

### JAGS Installation (REQUIRED)
### JAGS Installation (REQUIRED for JAGS models)

**ALWAYS install JAGS before attempting to build, test, or run this package.** The package will fail without it.
**Install JAGS if you plan to use `run_mod()` with JAGS models.** JAGS is required for the original JAGS-based modeling functions but is not needed if you only use Stan models via `run_mod_stan()`.

#### Installing JAGS in Docker (if using rocker/verse)

Expand Down Expand Up @@ -250,6 +255,25 @@ runjags::findJAGS()
runjags::testjags()
```

### Stan Installation (OPTIONAL for Stan models)

**Install cmdstanr and CmdStan if you plan to use `run_mod_stan()`.** Stan support is optional and provides a modern alternative to JAGS.

```r
# Install cmdstanr from r-universe
install.packages("cmdstanr",
Comment thread
d-morrison marked this conversation as resolved.
repos = c("https://stan-dev.r-universe.dev",
getOption("repos")))

# Then install CmdStan
cmdstanr::install_cmdstan()

# Verify installation
cmdstanr::cmdstan_version()
```

See the [cmdstanr documentation](https://mc-stan.org/cmdstanr/) for more details and troubleshooting.

## Build and Development Workflow

### Initial Setup
Expand Down Expand Up @@ -374,10 +398,14 @@ Team members can trigger actions by commenting on PRs:

### Key Directories

- **R/**: Package source code (30 R files)
- **R/**: Package source code (30+ R files)
- `Run_Mod.R`: Main function to run JAGS Bayesian models
- `run_mod_stan.R`: Main function to run Stan Bayesian models (new)
- `as_case_data.R`: Convert data to case_data class
- `prep_data.r`, `prep_priors.R`: Data preparation for JAGS
- `prep_data.r`: Data preparation for JAGS
- `prep_data_stan.R`: Data preparation for Stan (new)
- `prep_priors.R`: Prior preparation for JAGS
- `prep_priors_stan.R`: Prior preparation for Stan (new)
- `sim_case_data.R`: Simulate case data for testing
- `post_summ.R`, `postprocess_jags_output.R`: Post-processing JAGS results
- `plot_*.R`: Diagnostic plotting functions (trace, density, Rhat, effective sample size)
Expand All @@ -397,8 +425,13 @@ Team members can trigger actions by commenting on PRs:
- **data-raw/**: Raw data processing scripts (not included in package build)

- **inst/**: Installed files
- `inst/extdata/`: JAGS model files (`model.jags`, `model.dobson.jags`), example CSV data
- `inst/extdata/`: Model files
- `model.jags`, `model.dobson.jags`: JAGS model files
- `model.stan`, `model.dobson.stan`: Stan model files (new)
- Example CSV data files
- `inst/examples/`: Example R scripts for documentation
- JAGS examples: `run_mod-examples.R`, `examples-prep_priors.R`
- Stan examples: `run_mod_stan-examples.R`, `examples-prep_priors_stan.R`, `examples-prep_data_stan.R` (new)
- `inst/WORDLIST`: Custom spelling dictionary

- **vignettes/**: Package vignettes
Expand Down Expand Up @@ -527,6 +560,7 @@ expect_false(has_missing_values(complete_data))
- **Messaging**: Use `cli::cli_*()` functions for all user-facing messages
- **No `library()` in package code**: Use `::` or DESCRIPTION Imports
- **Document all exports**: Use roxygen2 (@title, @description, @param, @returns, @examples)
- **Use markdown syntax in roxygen2**: Use markdown syntax (numbered lists: `1.`, `2.`, etc.; bullet lists: `-` or `*`) instead of Rd syntax (`\enumerate{}`, `\itemize{}`, `\item`) in roxygen2 documentation comments
- **Test snapshot changes**: Use `testthat::announce_snapshot_file()` for CSV snapshots
- **Seed tests**: Use `withr::local_seed()` for reproducible tests
- **Prefer data-first pipelines**: Design and call functions so the primary data object flows through `|>` naturally
Expand Down Expand Up @@ -581,10 +615,39 @@ These instructions have been validated against the actual repository structure,
9. **ALWAYS** run tests before committing (`devtools::test()`)
10. **ALWAYS** check and fix lintr issues in changed files in PRs before committing
11. **ALWAYS** run `devtools::document()` before requesting PR review
12. **ALWAYS** make sure `devtools::check()` passes before requesting PR review
13. **ALWAYS** make sure `devtools::spell_check()` passes before requesting PR review
14. **ALWAYS** run `pkgdown::build_site()` before requesting PR review to ensure the pkgdown site builds successfully
15. **ALWAYS** verify Quarto documents render successfully locally - don't rely on CI workflows. For vignettes and articles, test rendering with `quarto render path/to/file.qmd` or by building the full site with `pkgdown::build_site()`
16. When `pkgdown::build_site()` has errors related to Quarto, use `quarto::quarto_render(input = "path/to/file.qmd", quiet = FALSE)` to debug and see detailed error messages
12. **ALWAYS** run `lintr::lint_package()` before requesting PR review and fix all linting issues
13. **ALWAYS** run `devtools::check()` before requesting PR review and ensure it passes with 0 errors, 0 warnings, 0 notes
14. **ALWAYS** make sure `spelling::spell_check_package()` passes before requesting PR review
15. **ALWAYS** run `pkgdown::build_site()` before requesting PR review to ensure the pkgdown site builds successfully
16. **ALWAYS** verify Quarto documents render successfully locally - don't rely on CI workflows. For vignettes and articles, test rendering with `quarto render path/to/file.qmd` or by building the full site with `pkgdown::build_site()`
17. When `pkgdown::build_site()` has errors related to Quarto, use `quarto::quarto_render(input = "path/to/file.qmd", quiet = FALSE)` to debug and see detailed error messages

**CRITICAL PRE-REVIEW VALIDATION**: Before requesting PR review, you MUST run the following validation commands locally and ensure they all pass:
```r
# 1. Lint the package - must have 0 linting issues
lintr::lint_package()

# 2. Run R CMD check - must pass with 0 errors, 0 warnings, 0 notes
devtools::check()

# 3. Check spelling - must have no spelling errors
spelling::spell_check_package()

# 4. Build pkgdown site - must build successfully
pkgdown::build_site()
```

Do NOT request review if any of these checks fail. Fix all issues first, then re-run the checks to verify.

**AUTOMATED CODE REVIEW ITERATION**: After completing all changes and passing the validation checks above, you MUST iterate with the automated code review (parallel_validation tool) until it provides no further valid feedback:

1. Run `parallel_validation` with appropriate PR title, description, and triviality assessment
2. Carefully review all feedback from both Code Review and CodeQL Security Scan
3. Address all valid issues identified by the automated review
4. Re-run `parallel_validation` after making changes
5. Repeat steps 2-4 until the automated review provides no further valid feedback
6. Only then request human review

This iterative process catches issues early and ensures high code quality before human review.

Only search for additional information if these instructions are incomplete or incorrect for your specific task.
15 changes: 14 additions & 1 deletion .github/workflows/copilot-setup-steps.yml
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,19 @@ jobs:
runjags::testjags()
shell: Rscript {0}

# Install cmdstanr for Stan modeling support (optional)
# See .github/copilot-instructions.md "Stan Installation" section
- name: Install cmdstanr
run: |
install.packages("cmdstanr",
repos = c("https://stan-dev.r-universe.dev",
"https://cloud.r-project.org"),
verbose = TRUE)
cmdstanr::install_cmdstan(cores = 2)
cat("cmdstanr version:", as.character(packageVersion("cmdstanr")), "\n")
cat("CmdStan version:", cmdstanr::cmdstan_version(), "\n")
shell: Rscript {0}

# Verify JAGS installation
# See .github/copilot-instructions.md "JAGS Installation" section for details
- name: Verify JAGS installation
Expand Down Expand Up @@ -144,7 +157,7 @@ jobs:

# Display key installed packages
cat("Key installed packages:\n")
key_packages <- c("devtools", "rjags", "runjags", "rcmdcheck", "lintr", "spelling", "testthat")
key_packages <- c("devtools", "rjags", "runjags", "cmdstanr", "rcmdcheck", "lintr", "spelling", "testthat")
for (pkg in key_packages) {
if (requireNamespace(pkg, quietly = TRUE)) {
cat(" -", pkg, ":", as.character(packageVersion(pkg)), "\n")
Expand Down
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -28,3 +28,5 @@ docs/

**/*.quarto_ipynb
README.html
*.knit.*
..Rcheck/
9 changes: 6 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: serodynamics
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9050
Version: 0.0.0.9051
Authors@R: c(
person("Peter", "Teunis", , "p.teunis@emory.edu", role = c("aut", "cph"),
comment = "Author of the method and original code."),
Expand All @@ -13,8 +13,8 @@ Description: Modeling longitudinal immune seroresponses to pathogens.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
URL: https://github.com/UCD-SERG/serodynamics, https://ucd-serg.github.io/serodynamics/
BugReports: https://github.com/UCD-SERG/serodynamics/issues
URL: https://github.com/ucdavis/serodynamics, https://ucdavis.github.io/serodynamics/
BugReports: https://github.com/ucdavis/serodynamics/issues
Imports:
cli,
coda,
Expand All @@ -32,6 +32,8 @@ Imports:
tidyselect,
utils
Suggests:
badger,
cmdstanr,
Hmisc,
Comment thread
d-morrison marked this conversation as resolved.
knitr,
readr,
Expand Down Expand Up @@ -62,6 +64,7 @@ Config/needs/test:
rlist
Config/needs/check: commonmark, xml2
Remotes:
stan-dev/cmdstanr,
ucd-serg/serocalculator
Depends:
R (>= 4.1.0)
Expand Down
6 changes: 6 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,12 @@ export(plot_predicted_curve)
export(post_summ)
export(postprocess_jags_output)
export(prep_data)
export(prep_data_stan)
export(prep_priors)
export(prep_priors_stan)
export(run_mod)
export(run_mod_stan)
export(sample_predictive_stan)
export(serodynamics_example)
export(sim_case_data)
export(sim_n_obs)
Expand Down Expand Up @@ -45,6 +49,8 @@ importFrom(serocalculator,get_values)
importFrom(serocalculator,ids)
importFrom(serocalculator,ids_varname)
importFrom(stats,complete.cases)
importFrom(stats,median)
importFrom(stats,quantile)
importFrom(stats,rnorm)
importFrom(tidyr,pivot_wider)
importFrom(utils,read.csv)
9 changes: 9 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,14 @@
# serodynamics (development version)

## New features

* Added Stan support as an alternative to JAGS for Bayesian modeling
* New `run_mod_stan()` function for fitting models with Stan/cmdstanr
* New `prep_data_stan()` function to prepare data in Stan format
* New `prep_priors_stan()` function to prepare priors for Stan models
* New Stan model files: `inst/extdata/model.stan` and `inst/extdata/model.dobson.stan`
* Stan support is optional (cmdstanr in Suggests) and can be used alongside JAGS

* Expanded `.github/copilot-instructions.md` with additional guidance on evidence-based claims, Quarto markdown/cross-reference conventions, R style practices, and phrase-level line-break formatting for source text.
* Fixed `dplyr::as_tibble()` references to `tibble::as_tibble()` in `post_summ()` and `run_mod()`, since `as_tibble()` is exported from the `tibble` package, not `dplyr`.
* Added R 4.5+ snapshot variants to handle the changed attribute ordering in
Expand Down
143 changes: 143 additions & 0 deletions R/prep_data_stan.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@
#' Prepare data for Stan
#'
#' @param dataframe a [data.frame] containing case data
#' @param biomarker_column [character] string indicating
#' which column contains antigen-isotype names
#' @param verbose whether to produce verbose messaging
#'
#' @returns a `prepped_stan_data` object (a [list] with Stan-formatted data)
#' @export
#'
#' @examples
#' set.seed(1)
#' raw_data <-
#' serocalculator::typhoid_curves_nostrat_100 |>
#' sim_case_data(n = 5)
#' prepped_data <- prep_data_stan(raw_data)
#'
#' @seealso [sample_predictive_stan()] for posterior predictive
#' sampling with Stan models
prep_data_stan <- function(
dataframe,
biomarker_column = get_biomarker_names_var(dataframe),
verbose = FALSE) {

# First use existing prep_data function to get the base structure
jags_data <- prep_data(
dataframe = dataframe,
biomarker_column = biomarker_column,
verbose = verbose,
add_newperson = FALSE # Force FALSE for Stan
)

Comment thread
d-morrison marked this conversation as resolved.
# Check for NA values in the original input data (Stan cannot handle NA)
# Note: jags_data arrays are padded with NA, so check original dataframe
value_var <- serocalculator::get_values_var(dataframe)
timeindays_var <- get_timeindays_var(dataframe)

if (any(is.na(dataframe[[value_var]])) ||
any(is.na(dataframe[[timeindays_var]]))) {
cli::cli_abort(
Comment thread
d-morrison marked this conversation as resolved.
c(
"Stan data cannot contain NA values.",
"i" = paste(
"The input data contains missing antibody measurements",
"or time points."
),
"i" = "Stan requires complete data for all observations.",
"i" = paste(
"Consider removing subjects/visits with missing data",
"or imputing values."
)
)
)
}

# Convert to Stan format
# Stan requires explicit max dimensions
# Validate that we have at least one subject with observations
if (length(jags_data$nsmpl) == 0 || all(jags_data$nsmpl == 0)) {
cli::cli_abort(
c(
"No observations found in input data.",
"i" = "Stan models require at least one subject with observations.",
"i" = "Check that your input data is not empty."
)
)
}

max_nsmpl <- max(jags_data$nsmpl)

# Create padded arrays (Stan doesn't handle ragged arrays like JAGS)
# We need to pad smpl.t and logy to max_nsmpl
nsubj <- jags_data$nsubj
n_antigen_isos <- jags_data$n_antigen_isos

# Initialize with zeros (will be ignored in model for obs > nsmpl[subj])
smpl_t_padded <- array(0, dim = c(nsubj, max_nsmpl))
logy_padded <- array(0, dim = c(nsubj, max_nsmpl, n_antigen_isos))

Comment thread
d-morrison marked this conversation as resolved.
# Fill in actual data and validate no NA values in the arrays
for (subj in 1:nsubj) {
n_obs <- jags_data$nsmpl[subj]
if (n_obs > 0) {
# Validate smpl.t has no NA values for this subject's observations
subj_times <- jags_data$smpl.t[subj, 1:n_obs]
if (any(is.na(subj_times))) {
cli::cli_abort(
c(
"Stan data cannot contain NA values in time points.",
"i" = "Subject {subj} has NA values in observation times.",
"i" = "Stan requires complete data for all observations."
)
)
}
smpl_t_padded[subj, 1:n_obs] <- subj_times

# Validate and copy logy values for each antigen
for (k in 1:n_antigen_isos) {
subj_logy <- jags_data$logy[subj, 1:n_obs, k]
if (any(is.na(subj_logy))) {
cli::cli_abort(
c(
"Stan data cannot contain NA values in antibody measurements.",
"i" = paste(
"Subject {subj}, antigen {k} has NA values in",
"log(antibody)."
),
"i" = paste(
"This can occur when a subject/visit exists but a particular",
"antigen-isotype measurement is missing."
),
"i" = "Stan requires complete data for all observations."
)
)
}
logy_padded[subj, 1:n_obs, k] <- subj_logy
}
}
}

stan_data <- list(
nsubj = nsubj,
n_antigen_isos = n_antigen_isos,
n_params = 5, # y0, y1, t1, alpha, shape
nsmpl = as.integer(jags_data$nsmpl),
max_nsmpl = as.integer(max_nsmpl),
smpl_t = smpl_t_padded,
logy = logy_padded
)

# Add attributes from JAGS data
# Store antigens in a consistent order for use in predictions
antigens_attr <- attributes(jags_data)$antigens
stan_data <- stan_data |>
structure(
class = c("prepped_stan_data", "list"),
antigens = antigens_attr,
n_antigens = attributes(jags_data)$n_antigens,
ids = attributes(jags_data)$ids
)

return(stan_data)
}
Loading
Loading