FastSurvival

FastSurvival provides fast alternatives to the standard survival analysis functions in the survival package. Every function is designed for repeated evaluation inside large simulation loops (adaptive sample-size re-estimation, probability-of-success calculations, regional consistency evaluation in multi-regional trials), where the iterative fitting and object-building overhead of survfit(), survdiff(), and coxph() becomes a computational bottleneck. Core computations are implemented in C++ via Rcpp.

Functions

| Function | Replaces | Approximate speed gain |
|---|---|---|
| `survfit_fast()` | `survfit()` + `summary()` at a single time point | ~50x |
| `survdiff_fast()` | `survdiff()` | ~40x |
| `coxph_fast()` | `coxph()` (point estimate + Wald CI) | ~30x |
| `simdata_fast()` | Custom simulation scripts | n/a |

Speed gains are based on median times from 1,000 microbenchmark replicates on a typical phase-3 trial dataset (n = 600, event rate 80%) with `presorted = TRUE`. Results vary by hardware and sample size.

Installation

# Install from GitHub
# install.packages("remotes")
remotes::install_github("gosukehommaEX/FastSurvival")

Quick start

library(FastSurvival)
library(survival)

# ----------------------------------------------------------------
# survfit_fast(): Kaplan-Meier at a single time point
# ----------------------------------------------------------------
ord <- order(ovarian$futime)
t_s <- ovarian$futime[ord]
e_s <- ovarian$fustat[ord]

survfit_fast(t_s, e_s, t_eval = 500, conf.type = "log")

# ----------------------------------------------------------------
# survdiff_fast(): log-rank test
# ----------------------------------------------------------------
survdiff_fast(ovarian$futime, ovarian$fustat, ovarian$rx,
              control = 1, side = 2)

# ----------------------------------------------------------------
# coxph_fast(): hazard ratio via the Pike-Halley Estimator (closed-form)
# ----------------------------------------------------------------
coxph_fast(ovarian$futime, ovarian$fustat, ovarian$rx, control = 1)

# ----------------------------------------------------------------
# simdata_fast(): clinical trial simulation
# ----------------------------------------------------------------

# Two-group trial, simple exponential, no dropout
df <- simdata_fast(
  nsim     = 1000,
  n        = c(100, 100),
  a.time   = c(0, 12),
  a.rate   = 200 / 12,
  e.median = list(18, 24),
  seed     = 1
)
head(df)

# Two-group trial, piecewise exponential (delayed treatment effect)
df2 <- simdata_fast(
  nsim     = 1000,
  n        = c(100, 100),
  a.time   = c(0, 12),
  a.rate   = 200 / 12,
  e.hazard = list(c(0.08, 0.08), c(0.08, 0.04)),
  e.time   = c(0, 6, Inf),
  seed     = 2
)

The three analysis functions return S3-class objects (survfit_fast, survdiff_fast, coxph_fast) with print() methods that display the results in a format similar to the corresponding survival package functions. Each object is internally a numeric vector, so it can be used directly in arithmetic, subsetting, and aggregation after stripping the class (see the simulation example below).

Design principles

survfit_fast evaluates the Kaplan-Meier estimator at a single specified time point. A C++ backend locates the evaluation cutoff via binary search, then accumulates the Kaplan-Meier product and the Greenwood variance sum in a single scan over event positions, without constructing intermediate vectors. This makes it substantially faster than survfit() plus summary() (about 50x in the benchmark above) when the same sorted data are evaluated repeatedly at a fixed landmark time inside a simulation loop.
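To make the scan concrete, here is a minimal base-R sketch of a Kaplan-Meier estimate with a Greenwood standard error at one time point. It is an illustration only: `km_at()` is a hypothetical helper, not the package's C++ code, and it assumes the input is already sorted by time with events coded as 1 and censored observations as 0.

```r
# km_at() is a hypothetical helper for illustration, not FastSurvival's code.
# Assumes `time` is sorted ascending and `event` is coded 1 = event, 0 = censored.
km_at <- function(time, event, t_eval) {
  n <- length(time)
  k <- findInterval(t_eval, time)  # binary search: count of times <= t_eval
  surv <- 1                        # running Kaplan-Meier product
  gw   <- 0                        # running Greenwood sum of d / (r * (r - d))
  i <- 1L
  while (i <= k) {
    # Gather all observations tied at time[i]
    j <- i
    d <- 0
    while (j <= k && time[j] == time[i]) {
      d <- d + event[j]
      j <- j + 1L
    }
    if (d > 0) {
      r    <- n - i + 1L           # number at risk just before time[i]
      surv <- surv * (1 - d / r)
      gw   <- gw + d / (r * (r - d))
    }
    i <- j
  }
  c(surv = surv, std.err = surv * sqrt(gw))
}

# Five subjects, events at t = 1, 3, 4; evaluate at t = 3.5
km_at(c(1, 2, 3, 4, 5), c(1, 0, 1, 1, 0), t_eval = 3.5)
```

Only observations up to the cutoff are visited, and nothing beyond two running scalars is stored. When every remaining subject has an event at the same time, the Greenwood term is undefined and this sketch returns NaN for the standard error.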

survdiff_fast computes the log-rank statistic using a two-pointer merge scan over the pooled sorted data, walking the time axis once while maintaining per-group at-risk counters and processing tied event times atomically. The C++ backend avoids the rank construction, tabulate(), and reverse cumulative sum operations of the standard R implementation, and the function itself bypasses the S3 object construction and formula parsing of survdiff(). It returns either a one-sided Z-score (side = 1) or a two-sided chi-square statistic (side = 2).
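The single-pass idea can be sketched in base R as follows. This is not the package's two-pointer C++ routine: `logrank_z()` is a hypothetical helper that walks the pooled sorted data once, keeps per-group at-risk counters, and handles each block of tied times in one step. The 0/1 group coding and the sign convention (positive z means more events than expected in group 1) are assumptions of this sketch and may differ from survdiff_fast.

```r
# logrank_z() is a hypothetical helper for illustration, not FastSurvival's code.
# group: 0 = control, 1 = treatment; event: 1 = event, 0 = censored.
logrank_z <- function(time, event, group) {
  ord   <- order(time)
  time  <- time[ord]; event <- event[ord]; group <- group[ord]
  n     <- length(time)
  n_all <- n                  # at risk overall
  n1    <- sum(group == 1)    # at risk in group 1
  o_minus_e <- 0              # observed minus expected events in group 1
  v <- 0                      # hypergeometric variance
  i <- 1L
  while (i <= n) {
    # Collect every observation tied at time[i]
    j <- i; d <- 0; d1 <- 0; m <- 0; m1 <- 0
    while (j <= n && time[j] == time[i]) {
      d  <- d  + event[j]
      d1 <- d1 + event[j] * (group[j] == 1)
      m  <- m  + 1
      m1 <- m1 + (group[j] == 1)
      j  <- j + 1L
    }
    if (d > 0) {
      o_minus_e <- o_minus_e + d1 - d * n1 / n_all
      if (n_all > 1) {
        v <- v + d * (n1 / n_all) * (1 - n1 / n_all) * (n_all - d) / (n_all - 1)
      }
    }
    # Everyone observed at this time leaves the risk set
    n_all <- n_all - m
    n1    <- n1 - m1
    i <- j
  }
  z <- o_minus_e / sqrt(v)
  c(z = z, chisq = z^2)
}

logrank_z(time = c(1, 2, 3, 4), event = c(1, 1, 1, 1), group = c(1, 0, 1, 0))
```

The one-sided statistic is z and the two-sided statistic is its square, which mirrors the side = 1 and side = 2 outputs described above.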

coxph_fast implements the Pike-Halley Estimator proposed by Homma (2025), a closed-form approximation to the Cox partial likelihood maximizer. The estimator anchors at the Pike closed-form estimate and applies a single analytic Halley correction to the Cox score, giving residual error of order O_p(n^{-3/2}) relative to the Cox maximum partial likelihood estimate. On the pharmacoSmoking dataset (tie rate 77.5%), the Pike-Halley Estimator agrees with the Breslow-based Cox estimate to within about 1e-08. The Wald confidence interval uses the observed information at the Pike anchor as the variance estimate. The C++ backend performs group splitting, at-risk counting, and per-distinct-event-time accumulation in a single pass.
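The manuscript is not public, so the following base-R sketch is only one plausible reading of "Pike anchor plus one Halley step", not the package's implementation. It assumes the Pike estimate is the ratio of observed to log-rank expected events in the two groups, uses the Breslow form of the two-group Cox score, and applies the textbook Halley update beta1 = beta0 - 2 U U' / (2 U'^2 - U U''). All function names are hypothetical, and a fully iterated Newton-Raphson fit is included purely as a reference value.

```r
# Hypothetical helpers for illustration; not FastSurvival's implementation.
# group: 0 = control, 1 = treatment; event: 1 = event, 0 = censored.

# Breslow score for the log hazard ratio and its first two derivatives
score_terms <- function(beta, d, d1, n0, n1) {
  p <- n1 * exp(beta) / (n0 + n1 * exp(beta))  # treatment share of weighted risk set
  w <- d * p * (1 - p)
  list(U = sum(d1 - d * p), dU = -sum(w), d2U = -sum(w * (1 - 2 * p)))
}

pike_halley_sketch <- function(time, event, group) {
  et <- sort(unique(time[event == 1]))                          # distinct event times
  d  <- sapply(et, function(tt) sum(time == tt & event == 1))   # events per time
  d1 <- sapply(et, function(tt) sum(time == tt & event == 1 & group == 1))
  n1 <- sapply(et, function(tt) sum(time >= tt & group == 1))   # at risk, treatment
  n0 <- sapply(et, function(tt) sum(time >= tt & group == 0))   # at risk, control

  # Pike anchor: observed / log-rank expected events, treatment relative to control
  e1    <- sum(d * n1 / (n0 + n1))
  e0    <- sum(d) - e1
  beta0 <- log((sum(d1) / e1) / ((sum(d) - sum(d1)) / e0))

  # Single Halley step toward the root of the score
  s     <- score_terms(beta0, d, d1, n0, n1)
  beta1 <- beta0 - 2 * s$U * s$dU / (2 * s$dU^2 - s$U * s$d2U)

  # Fully iterated Newton-Raphson, used here only as a reference value
  beta_nr <- beta0
  for (iter in 1:25) {
    r       <- score_terms(beta_nr, d, d1, n0, n1)
    beta_nr <- beta_nr - r$U / r$dU
  }
  c(pike = beta0, halley = beta1, newton = beta_nr,
    se = 1 / sqrt(-s$dU))   # Wald SE from the information at the anchor
}

pike_halley_sketch(time = 1:8, event = rep(1, 8),
                   group = c(0, 0, 1, 0, 1, 1, 0, 1))
```

On this eight-subject toy data set the Halley value lands much closer to the Newton-Raphson reference than the Pike anchor does, which is the behaviour the paragraph above describes. The sketch recomputes risk sets with repeated scans for readability and makes no attempt at the single-pass efficiency of the C++ backend.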

simdata_fast generates individual patient data for one- or two-group time-to-event trials. Accrual times follow a piecewise uniform distribution. Survival and dropout times follow either a simple or piecewise exponential distribution, selected automatically based on whether a scalar or vector hazard is supplied. C++ backends handle piecewise sampling and two-group interleaving, and random number generation uses dqrng for speed.
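Piecewise exponential sampling is commonly done by inverting the cumulative hazard, and the base-R sketch below shows that approach. Whether simdata_fast uses exactly this method is not stated here, so treat it as an assumption. `invert_cumhaz()` and `rpwexp_sketch()` are hypothetical helpers, and base `rexp()` stands in for the dqrng generator.

```r
# Hypothetical helpers for illustration; not FastSurvival's implementation.
# hazard[k] applies on [breaks[k], breaks[k + 1]); hazards must be positive
# and the last break must be Inf.
invert_cumhaz <- function(target, hazard, breaks) {
  cumhaz <- c(0, cumsum(hazard * diff(breaks)))  # cumulative hazard at each break
  k <- findInterval(target, cumhaz)              # interval where target is reached
  breaks[k] + (target - cumhaz[k]) / hazard[k]
}

rpwexp_sketch <- function(n, hazard, breaks) {
  # If H is the cumulative hazard and E ~ Exp(1), then H^{-1}(E) has hazard H'
  invert_cumhaz(rexp(n), hazard, breaks)
}

set.seed(2)
# Delayed effect: hazard 0.08 for the first 6 time units, 0.04 afterwards
e_time <- rpwexp_sketch(100, hazard = c(0.08, 0.04), breaks = c(0, 6, Inf))
d_time <- rexp(100, rate = 0.01)          # dropout times, simple exponential
tte    <- pmin(e_time, d_time)            # observed follow-up time
event  <- as.integer(e_time <= d_time)    # 1 = event observed, 0 = dropped out
```

A scalar hazard with breaks = c(0, Inf) reduces to the simple exponential case, so one code path covers both distributions.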

Using the functions together

A typical simulation workflow generates data with simdata_fast() and applies one or more of the analysis functions to each simulated trial; the example below uses coxph_fast(). Inside the loop, the returned objects can be stored as-is. When aggregating results across simulations, strip the S3 class with unclass() so that rbind() produces an ordinary numeric matrix:

library(FastSurvival)

df <- simdata_fast(
  nsim     = 1000,
  n        = c(100, 100),
  a.time   = c(0, 12),
  a.rate   = 200 / 12,
  e.hazard = list(0.08, 0.05),
  d.hazard = list(0.01, 0.01),
  seed     = 42
)

results <- vector("list", 1000L)
for (s in seq_len(1000L)) {
  d <- df[df$sim == s, ]
  results[[s]] <- coxph_fast(d$tte, d$event, d$group, control = 1)
}

hr_mat <- do.call(rbind, lapply(results, unclass))
colMeans(hr_mat)

References

Homma, G. (2025). One step from Pike to Cox: a closed-form hazard ratio estimator. Manuscript under review.

Collett, D. (2014). Modelling Survival Data in Medical Research (3rd ed.). Chapman and Hall/CRC.
