A code formatter for R, built on R's parser. Formatting logic is implemented in both base R and C++ (via Rcpp) — the C++ path runs automatically and is ~85x faster; the R implementation serves as a readable reference and fallback.
rformat uses parse() and getParseData() to make formatting decisions
from the token stream and expression structure, not from regex or
indentation heuristics. All transforms operate on an enriched token
vector (C++) or DataFrame (R).

    remotes::install_github("cornball-ai/rformat")

    library(rformat)

    # Format a string
    rformat("x<-1+2")
    #> x <- 1 + 2

    # Format a file (overwrites in place)
    rformat_file("script.R")

    # Format all R files in a directory
    rformat_dir("R/")

    # Dry run
    rformat_file("script.R", dry_run = TRUE)

Input:

    rformat("f=function(x,y){
    if(x>0)
    y=mean(x,na.rm=TRUE)
    else y=NA
    }")

Output:

    f <- function(x, y) {
        if (x > 0)
            y <- mean(x, na.rm = TRUE)
        else y <- NA
    }

- Normalizes spacing around operators, commas, and keywords
- Indents by syntactic nesting depth
- Converts `=` to `<-` for assignment (where the parser confirms `EQ_ASSIGN`, not `EQ_SUB`)
- Wraps long lines at logical operators and commas
- Wraps long function signatures with continuation indent
- Collapses short multi-line calls back to one line
- Preserves comments and strings exactly
- Removes trailing whitespace and excess blank lines
- Optionally adds braces to bare control-flow bodies
- Optionally expands inline if-else to multi-line
| Parameter | Default | Description |
|---|---|---|
| `indent` | `4L` | Spaces per level, or a string like `"\t"` |
| `line_limit` | `80L` | Line width before wrapping |
| `wrap` | `"paren"` | `"paren"` aligns to `(`, `"fixed"` uses 8-space continuation |
| `brace_style` | `"kr"` | `"kr"`: `){` same line. `"allman"`: `{` on its own line |
| `control_braces` | `FALSE` | Add braces to bare control-flow bodies |
| `expand_if` | `FALSE` | Expand all inline if-else to multi-line |
| `else_same_line` | `TRUE` | Join `}\nelse` to `} else` |
| `function_space` | `FALSE` | Space before `(` in `function(x)` |

Defaults are derived from analysis of the 30 packages that ship with R.
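
As a rough illustration of overriding those defaults, the sketch below collects a few options in a list and passes them along with the code string. It assumes the parameters in the table are named arguments of `rformat()`, which the table suggests but does not state outright; the input string is made up, and the call is guarded so the snippet still runs when the package is not installed.

```r
# Assumption: the options in the table are named arguments of rformat()
style <- list(
    indent = 2L,
    line_limit = 100L,
    brace_style = "allman",
    control_braces = TRUE
)

# Only call the formatter when the package is actually available
if (requireNamespace("rformat", quietly = TRUE)) {
    input <- "f=function(x){if(x>0) 1 else 0}"
    formatted <- do.call(rformat::rformat, c(list(input), style))
    cat(formatted, sep = "\n")
}
```

Keeping the options in one list makes it easy to reuse the same style across several calls.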
Parse preservation. If input parses, output parses. Token types and ordering are preserved. Strings and comments are never modified.
Semantic preservation. Only whitespace and style tokens change.
Assignment conversion and brace insertion are guided by parser token
types (EQ_ASSIGN vs EQ_SUB, structural body detection), so they
never change meaning.
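
A minimal sketch of why token types make assignment conversion safe: the hypothetical helper below (`convert_assign` is not part of rformat and is not how rformat implements it) rewrites only the `=` tokens the parser labels `EQ_ASSIGN`, leaving argument `=` (`EQ_SUB`) untouched. It assumes ASCII source without tabs, so that parse-data columns match character positions.

```r
# Hypothetical helper: replace assignment `=` with `<-` using parser token types
convert_assign <- function(code) {
    pd <- utils::getParseData(parse(text = code, keep.source = TRUE))
    eq <- pd[pd$token == "EQ_ASSIGN", ]
    lines <- strsplit(code, "\n", fixed = TRUE)[[1]]

    # Within each line, edit right-to-left so earlier column positions stay valid
    eq <- eq[order(eq$line1, -eq$col1), ]
    for (i in seq_len(nrow(eq))) {
        ln <- eq$line1[i]
        col <- eq$col1[i]
        lines[ln] <- paste0(
            substr(lines[ln], 1, col - 1),
            "<-",
            substr(lines[ln], col + 1, nchar(lines[ln]))
        )
    }
    paste(lines, collapse = "\n")
}

converted <- convert_assign("y = mean(x, na.rm = TRUE)")
print(converted)
```

Only the first `=` changes; the `=` in `na.rm = TRUE` is an `EQ_SUB` token and is never selected.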
Idempotency. rformat(rformat(x)) == rformat(x). Verified across
126 CRAN and base R packages with randomized parameter combinations
(indent, wrap, brace_style, control_braces, line_limit, etc.):
0 failures, 0 idempotency exceptions.
The stress test suite
formats every .R file from 126 packages (base, recommended, and
popular CRAN), checking that formatted code parses and that formatting
twice produces identical output. Tests run with randomized style
parameters to exercise all option combinations.
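
The two checks described above can be sketched in a few lines of base R. This is not the project's stress test suite: `check_formatter` and `strip_trailing` are hypothetical names, and a trivial stand-in formatter is used so the example runs without rformat installed.

```r
# Hypothetical checker: `fmt` is any function mapping a code string
# to a formatted code string
check_formatter <- function(fmt, code) {
    once <- fmt(code)
    twice <- fmt(once)
    # Parse preservation: the formatted output must still parse
    parses <- !inherits(try(parse(text = once), silent = TRUE), "try-error")
    # Idempotency: formatting a second time must change nothing
    list(parses = parses, idempotent = identical(once, twice))
}

# Stand-in formatter: only strips trailing whitespace from each line
strip_trailing <- function(code) {
    lines <- strsplit(code, "\n", fixed = TRUE)[[1]]
    paste(sub("[ \t]+$", "", lines), collapse = "\n")
}

result <- check_formatter(strip_trailing, "x <- 1   \ny <- 2\t")
print(result)
```

With the package installed, the same checker could presumably be given `rformat::rformat` as `fmt`, since the usage examples show it taking and returning a code string.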
The formatting pipeline has two implementations that produce identical output:
- R (`R/ast_*.R`): Pure base R reference implementation. No compilation needed; readable source for understanding the algorithms.
- C++ (`src/*.cpp`): Rcpp fast path. Same algorithms, ~85x faster on typical files. Used automatically.
Both operate on the same token stream from parse() + getParseData():
enrich terminals with nesting depth, run transforms (collapse, wrap,
braces, etc.), then serialize back to text.
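
As a rough sketch of the first stage only (enriching terminals with nesting depth), the base R snippet below counts open brackets across the terminal token stream. It is an assumption about how such enrichment could be done, not rformat's actual code; the `depth` column name and the sample input are made up.

```r
code <- "f(a, g(b))"
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Terminal tokens in source order
terms <- pd[pd$terminal, c("line1", "col1", "token", "text")]
terms <- terms[order(terms$line1, terms$col1), ]

# Openers add nesting, closers remove it; `[[` (LBB) counts twice
# because the parser closes it with two separate ']' tokens
open_w <- (terms$token %in% c("'('", "'{'", "'['")) + 2L * (terms$token == "LBB")
close_w <- as.integer(terms$token %in% c("')'", "'}'", "']'"))

# An opener sits at the depth before it; a closer at the depth after it
terms$depth <- cumsum(open_w - close_w) - open_w
print(terms[, c("text", "depth")], row.names = FALSE)
```

Here `b` ends up at depth 2 and both closing parentheses return to the depth of their matching openers, which is the information an indenter needs per token.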
GPL-3