feat: create r script from settings file#835
Conversation
There was a problem hiding this comment.
Pull request overview
This pull request introduces functionality to generate reproducible R scripts from Shiny app sessions and settings files, enabling users to replicate NCA analysis outputs outside the app. Additionally, it includes enhancements to the pivoting functionality and bug fixes for ratio calculations.
Key Changes
- Added
get_session_code()andget_settings_code()functions with supportingclean_deparse()methods to generate R scripts from session data - Enhanced
pivot_wider_pknca_results()to accept flag rules and extra variables, with flagging logic moved into the function - Fixed
calculate_table_ratios_app()to handle NULL ratio_table by wrapping withas.data.frame()
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| R/get_session_code.R | New file implementing script generation from session data with clean_deparse methods for various R objects |
| R/ratio_calculations.R | Moved ratio calculation functions from inst/shiny/functions and added as.data.frame() wrapping for NULL handling |
| R/pivot_wider_pknca_results.R | Enhanced to accept flag_rules and extra_vars_to_keep parameters, moved flagging logic into function |
| inst/shiny/www/templates/script_template.R | New template file for generating executable R scripts from session data |
| inst/shiny/modules/tab_nca/*.R | Updated to store session data (settings, ratio_table, slope_rules, final_units) for script generation |
| inst/shiny/modules/tab_data/*.R | Updated to store data_path, mapping, and applied_filters in session$userData |
| tests/testthat/test-get_session_code.R | New tests for clean_deparse() helper function |
| tests/testthat/test-pivot_wider_pknca_results.R | Added tests for flag_rules and extra_vars_to_keep functionality |
| man/*.Rd | Documentation updates for new and modified exported functions |
| NAMESPACE | Added exports for get_session_code and calculate_table_ratios_app |
| NEWS.md | Updated to document the new R script export feature |
| inst/WORDLIST | Added new technical terms used in documentation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…atio_calculations.R >> accept incoming) remote-tracking branch 'origin/467-enhancement/r-script' into 826-enhancement-create-r-script-from-settings-file
…e-r-script-from-settings-file
…e-r-script-from-settings-file
…ith-session-settings-r-script' into 826-enhancement-create-r-script-from-settings-file
…ge template placeholder (session$userData > yaml_setts)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #' installed from your aNCA package version. | ||
| #' @param output_path Path to write the resulting script file. | ||
| #' | ||
| #' @importFrom yaml read_yaml |
There was a problem hiding this comment.
The function-level roxygen includes @importFrom yaml read_yaml, but get_settings_code() does not call read_yaml() directly (it calls read_settings()). Consider removing this import to avoid unused-import lintr/R CMD check issues, or move the import to the function that actually uses it.
| #' @importFrom yaml read_yaml |
|
|
||
| describe("get_session_code: ", { | ||
| setts_file <- testthat::test_path("data/test-settings.yaml") | ||
| data_file <- testthat::test_path("data/test-multispec-ADNCA.csv") |
There was a problem hiding this comment.
output_file is created inside the describe("get_settings_code") block but then reused in the later describe("get_session_code") block without being defined there. describe() evaluates in its own environment, so output_file may be out of scope and cause the later tests to error. Define output_file in each describe() (or use withr::local_tempfile() inside each it()).
| data_file <- testthat::test_path("data/test-multispec-ADNCA.csv") | |
| data_file <- testthat::test_path("data/test-multispec-ADNCA.csv") | |
| output_file <- tempfile(fileext = ".R") |
| \arguments{ | ||
| \item{settings_file_path}{Path to the RDS file containing the settings list.} | ||
|
|
||
| \item{data_path}{Path to the data file to be referenced in the script.} |
There was a problem hiding this comment.
The docs say settings_file_path points to an RDS file, but get_settings_code() calls read_settings() which reads YAML. Please update the argument documentation to reference a YAML settings file (and keep it consistent with tests using test-settings.yaml).
| \item{template_path}{Path to the R script template file. By default, uses the one | ||
| installed from your aNCA package version.} | ||
|
|
||
| \item{mapping}{Named list mapping variable names (default: \code{default_mapping}).} |
There was a problem hiding this comment.
ratio_table appears in the usage signature but is missing from the \arguments section. Please document ratio_table (and any other arguments that users can pass) so the generated Rd is complete.
| \item{mapping}{Named list mapping variable names (default: \code{default_mapping}).} | |
| \item{mapping}{Named list mapping variable names (default: \code{default_mapping}).} | |
| \item{ratio_table}{Optional data frame specifying ratio information used when | |
| generating the session script.} |
| it("writes a script R file output", { | ||
| get_settings_code( | ||
| settings_file_path = setts_file, | ||
| data_path = data_file, | ||
| output_path = output_file | ||
| ) | ||
| # Check if the file was created | ||
| expect_true(file.exists(output_file)) | ||
| }) |
There was a problem hiding this comment.
These tests only assert that an output file exists, but not that placeholders were substituted correctly (e.g., that mapping/settings were written instead of NULL). Adding an assertion that the generated script contains expected lines/values would prevent regressions like NULL substitution bugs.
| get_settings_code <- function( | ||
| settings_file_path, | ||
| data_path, | ||
| output_path = "settings_code.R", | ||
| template_path = system.file("shiny/www/templates/script_template.R", package = "aNCA"), | ||
| # TODO: mapping & ratio_table should be included in the settings file as well | ||
| # so they keep working as expected also from the settings file | ||
| mapping = default_mapping, | ||
| ratio_table = data.frame() | ||
| ) { | ||
| settings <- read_settings(settings_file_path) | ||
| session <- list(yaml_setts = list( | ||
| settings = settings[["settings"]], | ||
| slope_rules = settings[["slope_rules"]], | ||
| data_path = data_path, | ||
| mapping = mapping, | ||
| ratio_table = ratio_table | ||
| )) |
There was a problem hiding this comment.
get_settings_code takes settings from a YAML file (read_settings(settings_file_path)) and feeds them into get_code, which serializes them into an R script using clean_deparse. The clean_deparse.character implementation builds string literals via sprintf('"%s"', obj) without escaping embedded quotes or other metacharacters, so an attacker controlling any string in the YAML (e.g., a mapping value) can inject R code into the generated script by including a " and additional statements (for example a value like "; system('id'); #). When someone later sources the generated script, that injected code will execute; to fix this, ensure all character values are converted into valid R string literals (e.g., via a safe escaping routine or deparse) before substitution, or restrict get_settings_code to trusted, non-user-controlled inputs.
js3110
left a comment
There was a problem hiding this comment.
Im still reviewing, so won't approve or request changes yet, but just a quick comment regarding style: setts is not an appropriate shortening of settings imo- it's not a standard abbreviation, and can be misinterpreted as sets is also a word. settings is more clear and only 4 characters longer, so I would suggest to rename all instances of setts to settings, eg yaml_settings, as this is much more clear.
js3110
left a comment
There was a problem hiding this comment.
I ran the app and created the two code versions, but for both I get the same error because mapping is being initialised as NULL
Console output:
> mapping <- NULL
> names(mapping) <- gsub("select_", "", names(mapping))
Error in names(mapping) <- gsub("select_", "", names(mapping)) :
attempt to set an attribute on NULL
> applied_filters <- NULL
> preprocessed_adnca <- adnca_data %>%
+
+ # Filter the data
+ apply_filters(applied_filters) %>%
+
+ # Map columns to their standards
+ apply_mapping(
+ mapping = mapping,
+ desired_order = c(
+ "STUDYID", "USUBJID", "PARAM", "PCSPEC", "ATPTREF",
+ "AVAL", "AVALU", "AFRLT", "ARRLT", "NRRLT", "NFRLT",
+ "RRLTU", "ROUTE", "DOSETRT", "DOSEA", "DOSEU", "ADOSEDUR",
+ "VOLUME", "VOLUMEU", "TRTRINT", "METABFL"
+ ),
+ silent = FALSE
+ ) %>%
+
+ # Derive METABFL column using PARAM metabolites
+ create_metabfl(mapping$Metabolites) %>%
+
+ # Make sure all variables are in its correct class
+ adjust_class_and_length(metadata_nca_variables)
* ->
Error in if (mapping$DOSETRT == "") { : argument is of length zero
In addition: Warning message:
In apply_mapping(., mapping = mapping, desired_order = c("STUDYID", :
Dose duration is assumed to be 0 for all records (ADOSEDUR = 0)
I used a preclinical dataset.
|
should be fixed now @js3110 |
js3110
left a comment
There was a problem hiding this comment.
Works well for me!
I will approve, but I think it would be good to add an example or a template for the mapping in the documentation, so its easy for the user to know what they should add. Currently it just says "default: default_mapping) but its not clear what that is or what it looks like.
I had to search the code to find the default_mapping, but a simple user won't necessarily have access to that.
m-kolomanski
left a comment
There was a problem hiding this comment.
Hey, I have found some non-blocking issues, otherwise code looks fine.
I am gonna attach my complimentary warning about about coupling the package code with the application even more (with the default_mapping list). It is the application that should parse its own session object and provide the get_code() function with a list that is already in a common format - the package should not care how we named the inputs in the Shiny application. But this seems to be a design choice lately so.
suggestion/nitpick: By naming the object yaml_settings we are coupling this usage to the settings implementation (by name only, ofc). I know that we are probably not gonna migrate the settings to a different format, but if we ever wanted to move it to JSON or TOML or whatever, then this yaml_settings name would stop making sense and need refactoring. There is no reason to do that - neither the template script, not get_code() care what format the settings were parsed from - they receive a list.
| script_template_path <- "shiny/www/templates/script_template.R" | ||
| get_session_code( | ||
| template_path = script_template_path, | ||
| template_path = system.file(script_template_path, package = "aNCA"), |
There was a problem hiding this comment.
suggestion: I would consider moving the template to just /inst/ folder as opposed to nesting it within /shiny/
There was a problem hiding this comment.
good catch. I will open a new issue to also move template.pptx from there (#965)
|
|
||
| #' Generate a session script from settings and mapping files | ||
| #' | ||
| #' This function reads a settings RDS file and data path, and generates an R script |
There was a problem hiding this comment.
issue(non-blocking): The docs say RDS, but in the code the settings file is referenced as yaml.
| #' | ||
| #' @param settings_file_path Path to the RDS file containing the settings list. | ||
| #' @param data_path Path to the data file to be referenced in the script. | ||
| #' @param template_path Path to the R script template file. By default, uses the one |
There was a problem hiding this comment.
issue(non-blocking): Docs say the function uses template file by default, but the default is NULL.
| if (is.null(template_path)) { | ||
| template_path = system.file("shiny/www/templates/script_template.R", package = "aNCA") | ||
| } |
There was a problem hiding this comment.
issue(non-blocking): If we want this to be the default, why not make it the default argument? We don't need this NULL check.
There was a problem hiding this comment.
I thought for the user would be weird to read system.file... in the man but you are right
| #' @importFrom yaml read_yaml | ||
| #' @return Invisibly returns the output_path. | ||
| #' @export | ||
| get_settings_code <- function( |
There was a problem hiding this comment.
question: The function name seems misleading to me. It seems like we should receive something related to the settings? But from the code it looks like we are receiving the template / target analysis script, not just settings. Am I missing something?
There was a problem hiding this comment.
hmm you are right. the logic here of course is that the settings file provides the key elements to produce the specific code that can replicate the App actions.
However, I understand your point that it might be confusing because indeed the true code essence is in the template. So in the end what this function does is populate the relevant inputs/key-variables on it. I have to keep thinking about this to give it a better name in the future, because I am not sure
@m-kolomanski Would it be sufficient if I just move the function |
maybe I can rename it again to something different like |
|
@Gero1999 Not quite, because it seems the default mapping is referencing raw input names ( mapping <- yaml_settings$mapping
names(mapping) <- gsub("select_", "", names(mapping))which means it is coupled with the Then |
…zip-utils.R remote-tracking branch 'origin/main' into 826-enhancement-create-r-script-from-settings-file
…te.R) so slope rules logic is adapted to the new simplified logic remote-tracking branch 'origin/main' into 826-enhancement-create-r-script-from-settings-file
Issue
Closes #826
Description
This PR refactors and extends the code generation utilities for replicating app sessions, introducing a new
get_settings_code()function to generate scripts from settings files, and updating the template and related helpers to use a more generalyaml_settsstructure instead of directsession$userDatareferences. It also improves robustness when handlingtibbleobjects and updates documentation and tests accordingly.Major improvements and changes:
Added
get_settings_code()to generate a reproducible R script from a settings RDS file and data path, using a mapping and template, making it possible to script session replication without a running Shiny session. Objects that are still not part of the settings but are needed for the r-script (ratios_table,mapping) are for now set by default to standard values and need to be manually declared by the user.Refactored code and template references from
session$userDatatoyaml_setts, making code generation more flexible and decoupled from Shiny internals. This includes changes in the script template and all code substitution logic.Added new tests for
get_settings_code()andget_session_code()to verify script file output.Related to other PR to be merged
Enhanced
clean_deparse()to treattbl_df(tibble) objects as data frames for serialization, ensuring consistent output regardless of input data frame type. With its corresponding tests.read_settings()now ensurestypes_dfis always a data frame, improving downstream robustness.Definition of Done
How to test
r-scriptsettings.yaml)get_settings_code()in your R console on the settings file. Indicate all inputs apart from the settings file as they were in the App (ratio_table,mapping).r-scriptmatches the one downloaded in the AppNote:
ratio_tableandmappingare aspects still not covered by the settings file. Therefore, they need for now to be explicitly declared in the functionget_settings_code()Contributor checklist
Notes to reviewer
Better to review/merge first #789 ✔️