pmsims: Simulation-based Sample Size Tools for Prediction Models

pmsims is an R package for estimating how much data are needed to develop reliable and generalisable prediction models. It uses a simulation-based learning curve approach to quantify how model performance improves with increasing sample size, supporting principled study planning and feasibility assessment.
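The learning-curve idea can be illustrated in base R alone. The sketch below is not how pmsims is implemented: the data generator `simulate_data()`, the helper `calibration_slope()`, the sample sizes, and the number of repetitions are all hypothetical choices made for illustration. It simulates training sets of increasing size, fits a logistic regression to each, and records the mean calibration slope on a large validation set.

```r
# Illustrative learning curve in base R (an assumption-laden sketch, not pmsims internals).
set.seed(1)

# Hypothetical data generator: binary outcome driven by p standard-normal predictors.
simulate_data <- function(n, p = 5, beta = 0.5, intercept = -1.5) {
  x <- matrix(rnorm(n * p), nrow = n)
  lp <- intercept + drop(x %*% rep(beta, p))
  data.frame(y = rbinom(n, 1, plogis(lp)), x)
}

# Calibration slope: coefficient from a logistic regression of the validation
# outcome on the fitted model's linear predictor.
calibration_slope <- function(fit, newdata) {
  lp <- predict(fit, newdata = newdata, type = "link")
  unname(coef(glm(newdata$y ~ lp, family = binomial))[2])
}

validation <- simulate_data(10000)  # large set standing in for the target population
sizes <- c(100, 200, 400, 800)      # candidate development sample sizes

# For each sample size, average the calibration slope over repeated simulations.
curve <- sapply(sizes, function(n) {
  slopes <- replicate(30, {
    train <- simulate_data(n)
    fit <- glm(y ~ ., data = train, family = binomial)
    calibration_slope(fit, validation)
  })
  mean(slopes)
})

learning_curve <- data.frame(n = sizes, mean_calibration_slope = curve)
learning_curve
```

In this toy setting the mean calibration slope should move towards 1 as the training sample grows, which is the pattern a learning-curve approach uses to locate a sufficient sample size.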

The package is fully model-agnostic: users can define how data are generated, how models are fitted, and how predictive performance is measured. It currently supports regression-based prediction models with continuous, binary, and time-to-event outcomes.

pmsims also offers machine-learning model options: regularised regression, random forest, and XGBoost. These have not yet been assessed in the package’s main validation study and should be treated as experimental in version 0.5.0.

pmsims is developed at King’s College London (Department of Biostatistics & Health Informatics) with input from researchers, clinicians, and patient partners. See the pmsims project site for further details.

Installation

Install version 0.5.0 from GitHub:

# install.packages("remotes")
remotes::install_github("pmsims-package/pmsims", ref = "v0.5.0")

Minimal example

library(pmsims)
set.seed(123)

binary_example <- simulate_binary(
  signal_parameters = 15,
  noise_parameters  = 0,
  predictor_type = "continuous",
  binary_predictor_prevalence = NULL,
  outcome_prevalence = 0.20,
  maximum_achievable_cstatistic = 0.80,
  model = "glm",
  metric = "calibration_slope",
  target_performance = 0.90,
  n_reps_total = 1000,
  mean_or_assurance = "assurance"
)

binary_example

maximum_achievable_cstatistic and target_performance have different roles:

maximum_achievable_cstatistic is the best plausible C-statistic the model could reach with effectively unlimited data; it is used to calibrate the data generator.
target_performance is the minimum acceptable metric value used to determine the required sample size.
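To make the first role concrete, the base-R sketch below shows one way a data generator could be tuned so that its large-sample C-statistic matches a chosen ceiling. This is an assumption about the general idea, not a description of what pmsims does internally: the single-predictor generator and the helpers `c_statistic()` and `large_sample_c()` are hypothetical.

```r
# Sketch: tune a signal strength so the large-sample C-statistic hits a target.
# Hypothetical illustration only; pmsims may calibrate its generator differently.
set.seed(2)

# Rank-based C-statistic (AUC) for a binary outcome y and a numeric score.
c_statistic <- function(y, score) {
  r <- rank(score)
  n1 <- as.numeric(sum(y == 1))
  n0 <- as.numeric(sum(y == 0))
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Fixed random draws, reused for every candidate value so that the
# objective is deterministic in `signal`.
n_large <- 50000
z <- rnorm(n_large)
u <- runif(n_large)

# C-statistic of the true linear predictor in a large sample.
# The intercept is re-solved each time to hold outcome prevalence fixed.
large_sample_c <- function(signal, prevalence = 0.20) {
  intercept <- uniroot(
    function(a) mean(plogis(a + signal * z)) - prevalence,
    interval = c(-20, 20)
  )$root
  y <- as.integer(u < plogis(intercept + signal * z))
  c_statistic(y, z)
}

# Solve for the signal strength that gives a C-statistic of 0.80.
target_c <- 0.80
signal <- uniroot(
  function(s) large_sample_c(s) - target_c,
  interval = c(0.1, 5)
)$root

c(signal = signal, c_statistic = large_sample_c(signal))
```

The second role is separate: once the generator is fixed, target_performance is the threshold that simulated models must reach, so it determines the sample size rather than the data.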

Citing pmsims

If you use pmsims, please cite the package and one or both of the accompanying preprints.

Current preprints:

Shamsutdinova D, Zimmer F, Olaniran OR, Markham S, Stahl D, Forbes G, Carr E (2026). Sample Size Calculations for Developing Clinical Prediction Models: Overview and pmsims R package. arXiv. https://arxiv.org/abs/2602.23507
Olaniran OR, Shamsutdinova D, Markham S, Zimmer F, Stahl D, Forbes G, Carr E (2026). Adaptive Gaussian Process Search for Simulation-Based Sample Size Estimation in Clinical Prediction Models: Validation of the pmsims R Package. arXiv. https://arxiv.org/abs/2603.23688

Once peer-reviewed articles are available, please cite the published versions instead. In R, you can retrieve the package citation with:

citation("pmsims")

Get in touch

We welcome questions, suggestions, and collaboration enquiries.

Email: pmsims@kcl.ac.uk
Feedback or bugs: please open a GitHub issue

Funding

This work is supported by the National Institute for Health and Care Research (NIHR) under the Research for Patient Benefit (RfPB) Programme (NIHR206858).

The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.
