Add Stan support as alternative Bayesian modeling backend #175
Open
Copilot wants to merge 34 commits into main from copilot/convert-jags-model-to-stan
Commits
6b44d2f Initial plan (Copilot)
368fb4d Add initial Stan implementation with model files and R functions (Copilot)
5d88e45 Add examples, fix linting issues, and update WORDLIST for Stan functions (Copilot)
530fc7f Update README and copilot instructions to document Stan support (Copilot)
e973b69 Add *.knit.* to .gitignore and install cmdstanr in copilot-setup-step… (Copilot)
d5ad3ce Wrap run_mod_stan examples in \dontrun{} to fix R CMD check failure (Copilot)
fc3a564 Merge branch 'main' into copilot/convert-jags-model-to-stan (Copilot)
1a91c47 Merge branch 'main' into copilot/convert-jags-model-to-stan (d-morrison)
d71bb31 Refactor to extract shared code between JAGS and Stan functions (Copilot)
0fe0ef5 Fix review comments: draws indexing, stan_fit storage, NA handling, n… (Copilot)
28e7a9d Fix review comments: README list formatting, tibble::as_tibble, facto… (Copilot)
deb065a Fix review comments: NA checking, prior validation, example format, b… (Copilot)
cf5d7f6 Fix review comments: add badger to Suggests, handle NA strat levels, … (Copilot)
d8599f8 Fix open review comments: update install URL, fix factor indexing, cl… (Copilot)
c1f61ad Add comprehensive tests for Stan functions (prep_data_stan, prep_prio… (Copilot)
18b91e6 Fix CI failures: lint issues, test warnings, and pkgdown reference index (Copilot)
6bfa6ba Add explicit pre-review validation requirements to copilot-instructio… (Copilot)
a31a95c Fix review comments: remove nburn/nmc from tests, fix add_newperson d… (Copilot)
e8ceafb Remove add_newperson parameter from prep_data_stan, add sample_predic… (Copilot)
951468c Fix review comments: stratification validation, calc_fit_mod per-stra… (Copilot)
b1e4701 Fix review comments: rewrite sample_predictive_stan, fix docs, remove… (Copilot)
2c9e7c3 Address code review feedback: clarify Subnum indexing and rename popu… (Copilot)
2798684 Fix linting errors: line length issues in prep_priors.R and test-stan… (Copilot)
ff8441d Fix review comments: remove duplicate n_params, use population-level … (Copilot)
3db4454 Fix mu_par indexing, add empty data validation, add multi-antigen test (Copilot)
cd36771 Fix sample_predictive_stan: transform mu_par from log scale, compute … (Copilot)
01af5c0 Changes before error encountered (Copilot)
3730197 Use markdown syntax instead of Rd syntax in roxygen2 documentation (Copilot)
b40d15a Address review comments: NA validation, antigen ordering, markdown sy… (Copilot)
4360bad Fix antigen attribute access: use plain vector, not $Iso_type (Copilot)
f0320a5 Add NA stratification validation and relax nchain constraint (Copilot)
b5c4af4 Fix misleading error message for empty stratification list (Copilot)
a28ba17 Add validation for max_antigens and CmdStan installation check (Copilot)
6a60ad4 Update documentation for with_post parameter and sample_predictive_stan (Copilot)
@@ -28,3 +28,5 @@ docs/

**/*.quarto_ipynb
README.html
*.knit.*
..Rcheck/
@@ -0,0 +1,143 @@
#' Prepare data for Stan
#'
#' @param dataframe a [data.frame] containing case data
#' @param biomarker_column [character] string indicating
#' which column contains antigen-isotype names
#' @param verbose whether to produce verbose messaging
#'
#' @returns a `prepped_stan_data` object (a [list] with Stan-formatted data)
#' @export
#'
#' @examples
#' set.seed(1)
#' raw_data <-
#'   serocalculator::typhoid_curves_nostrat_100 |>
#'   sim_case_data(n = 5)
#' prepped_data <- prep_data_stan(raw_data)
#'
#' @seealso [sample_predictive_stan()] for posterior predictive
#' sampling with Stan models
prep_data_stan <- function(
    dataframe,
    biomarker_column = get_biomarker_names_var(dataframe),
    verbose = FALSE) {

  # First use existing prep_data function to get the base structure
  jags_data <- prep_data(
    dataframe = dataframe,
    biomarker_column = biomarker_column,
    verbose = verbose,
    add_newperson = FALSE # Force FALSE for Stan
  )

  # Check for NA values in the original input data (Stan cannot handle NA)
  # Note: jags_data arrays are padded with NA, so check original dataframe
  value_var <- serocalculator::get_values_var(dataframe)
  timeindays_var <- get_timeindays_var(dataframe)

  if (any(is.na(dataframe[[value_var]])) ||
        any(is.na(dataframe[[timeindays_var]]))) {
    cli::cli_abort(
      c(
        "Stan data cannot contain NA values.",
        "i" = paste(
          "The input data contains missing antibody measurements",
          "or time points."
        ),
        "i" = "Stan requires complete data for all observations.",
        "i" = paste(
          "Consider removing subjects/visits with missing data",
          "or imputing values."
        )
      )
    )
  }

  # Convert to Stan format
  # Stan requires explicit max dimensions
  # Validate that we have at least one subject with observations
  if (length(jags_data$nsmpl) == 0 || all(jags_data$nsmpl == 0)) {
    cli::cli_abort(
      c(
        "No observations found in input data.",
        "i" = "Stan models require at least one subject with observations.",
        "i" = "Check that your input data is not empty."
      )
    )
  }

  max_nsmpl <- max(jags_data$nsmpl)

  # Create padded arrays (Stan doesn't handle ragged arrays like JAGS)
  # We need to pad smpl.t and logy to max_nsmpl
  nsubj <- jags_data$nsubj
  n_antigen_isos <- jags_data$n_antigen_isos

  # Initialize with zeros (will be ignored in model for obs > nsmpl[subj])
  smpl_t_padded <- array(0, dim = c(nsubj, max_nsmpl))
  logy_padded <- array(0, dim = c(nsubj, max_nsmpl, n_antigen_isos))

  # Fill in actual data and validate no NA values in the arrays
  for (subj in 1:nsubj) {
    n_obs <- jags_data$nsmpl[subj]
    if (n_obs > 0) {
      # Validate smpl.t has no NA values for this subject's observations
      subj_times <- jags_data$smpl.t[subj, 1:n_obs]
      if (any(is.na(subj_times))) {
        cli::cli_abort(
          c(
            "Stan data cannot contain NA values in time points.",
            "i" = "Subject {subj} has NA values in observation times.",
            "i" = "Stan requires complete data for all observations."
          )
        )
      }
      smpl_t_padded[subj, 1:n_obs] <- subj_times

      # Validate and copy logy values for each antigen
      for (k in 1:n_antigen_isos) {
        subj_logy <- jags_data$logy[subj, 1:n_obs, k]
        if (any(is.na(subj_logy))) {
          cli::cli_abort(
            c(
              "Stan data cannot contain NA values in antibody measurements.",
              "i" = paste(
                "Subject {subj}, antigen {k} has NA values in",
                "log(antibody)."
              ),
              "i" = paste(
                "This can occur when a subject/visit exists but a particular",
                "antigen-isotype measurement is missing."
              ),
              "i" = "Stan requires complete data for all observations."
            )
          )
        }
        logy_padded[subj, 1:n_obs, k] <- subj_logy
      }
    }
  }

  stan_data <- list(
    nsubj = nsubj,
    n_antigen_isos = n_antigen_isos,
    n_params = 5, # y0, y1, t1, alpha, shape
    nsmpl = as.integer(jags_data$nsmpl),
    max_nsmpl = as.integer(max_nsmpl),
    smpl_t = smpl_t_padded,
    logy = logy_padded
  )

  # Add attributes from JAGS data
  # Store antigens in a consistent order for use in predictions
  antigens_attr <- attributes(jags_data)$antigens
  stan_data <- stan_data |>
    structure(
      class = c("prepped_stan_data", "list"),
      antigens = antigens_attr,
      n_antigens = attributes(jags_data)$n_antigens,
      ids = attributes(jags_data)$ids
    )

  return(stan_data)
}
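
For readers who want to see the padding step in isolation, here is a minimal base-R sketch of the same idea: validate that the data are non-empty and NA-free, then copy each subject's ragged observations into a zero-filled rectangular matrix alongside a per-subject count. The helper name `pad_ragged` and the toy input are hypothetical and not part of this PR. The sketch covers observation times only, not the three-dimensional `logy` array, and uses `stop()` where the PR uses `cli::cli_abort()`.

```r
# Hypothetical helper (not part of the PR): pad a ragged list of per-subject
# observation times into a rectangular matrix, mirroring the smpl_t padding
# that prep_data_stan() performs.
pad_ragged <- function(times_by_subject) {
  nsmpl <- lengths(times_by_subject)

  # Require at least one subject with at least one observation
  if (length(nsmpl) == 0 || all(nsmpl == 0)) {
    stop("No observations found in input data.")
  }

  # Reject missing values up front, since Stan data cannot contain NA
  if (anyNA(unlist(times_by_subject))) {
    stop("Stan data cannot contain NA values.")
  }

  max_nsmpl <- max(nsmpl)

  # Zero-filled matrix; entries beyond nsmpl[subj] are padding only
  padded <- matrix(0, nrow = length(nsmpl), ncol = max_nsmpl)
  for (subj in seq_along(times_by_subject)) {
    n_obs <- nsmpl[subj]
    if (n_obs > 0) {
      padded[subj, seq_len(n_obs)] <- times_by_subject[[subj]]
    }
  }

  list(
    nsubj = length(nsmpl),
    nsmpl = as.integer(nsmpl),
    max_nsmpl = as.integer(max_nsmpl),
    smpl_t = padded
  )
}

# Toy data (assumed for illustration): three subjects with 3, 2, and 0 visits
toy <- list(c(0, 30, 90), c(0, 14), numeric(0))
stan_like <- pad_ragged(toy)
stan_like$smpl_t
```

The padding zeros are placeholders rather than data, so any model consuming such a structure has to loop only up to `nsmpl[subj]` for each subject, as the comment in `prep_data_stan()` notes.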