fix: handle Stan tuple and complex types by avehtari · Pull Request #1174 · stan-dev/cmdstanr

avehtari · 2026-04-08T11:49:16Z

Closes #925

Stan examples by me, @spinkney and @WardBrian (from https://discourse.mc-stan.org/t/proof-of-concept-binary-output-format-for-cmdstan/40846/67). Careful iterative prompting by me. Code, tests, and PR description assisted by Claude.

Fix tuple and complex variable handling

Summary

CmdStan 2.38+ introduced tuple and complex types, which use new naming
conventions in CSV output that CmdStanR did not understand. This PR adds
full support for these types across metadata parsing, model methods, and
init value handling.

- Fix repair_variable_names() and variable_dims() to handle complex
  suffixes (.real/.imag) and tuple separators (:) in CSV column names
- Fix variable_skeleton() and unconstrain_draws() for models with
  tuple and complex parameters
- Add tuple init support: mod$sample(init = fit) now correctly passes
  tuple parameter values as nested JSON objects
- Add write_stan_json() support for tuple values as named lists

Problem

CmdStan CSV headers now use three different separators:

| Separator | Meaning | Example |
|---|---|---|
| `.` between digits | array/matrix index | `beta.1.2` → `beta[1,2]` |
| `:` | tuple element | `b_tuple:1:1.2` → `b_tuple:1:1[2]` |
| `.real` / `.imag` | complex part | `z.real` → `z[real]` |

These can combine: arr_pair.1:1 (array + tuple), z3D.1.1.1.real
(array + complex), nested:2:2.real (tuple + complex).

Before this PR, the following issues occurred:

- Warning: NAs introduced by coercion from variable_dims() on
  every model with complex or tuple types, because as.numeric("imag")
  returns NA.
- Wrong metadata: fit$metadata()$stan_variable_sizes contained NA
  for complex variables and incorrect dimensions.
- Broken variable_skeleton(): Returned NA names for tuple
  parameters because create_skeleton() couldn't match Stan-level names
  ("b_tuple") against C++ leaf names ("b_tuple.1.1").
- Crashing unconstrain_draws(): Tuple parameters were silently
  dropped from the draw subset (Stan-level name "b_tuple" not found in
  leaf names "b_tuple:1:1"), causing CmdStan to receive the wrong number
  of scalars.
- Missing init values: mod$sample(init = fit) dropped tuple
  parameters entirely, and write_stan_json() could not serialize tuple
  values (heterogeneous lists crashed list_to_array()).

Changes

`R/csv.R`

repair_variable_names() — Detects and strips .real/.imag suffixes
before the dot-to-bracket conversion, then re-attaches them in the correct
position. The : tuple separator passes through unchanged.
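
To make the naming rules concrete, here is a minimal sketch of the dot-to-bracket conversion. It is not the code in this PR: `repair_name_sketch()` is a hypothetical name, and it takes a simpler route (split on `:`, then treat everything after the first `.` in each segment as an index) rather than stripping and re-attaching the `.real`/`.imag` suffix. Only the three examples from the table in the Problem section are confirmed by the description; the other outputs are what this sketch produces, assuming the same pattern carries over.

```r
# Hypothetical helper for illustration; not cmdstanr's repair_variable_names().
repair_name_sketch <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE), function(segments) {
    # Within each tuple segment, the first token is the name (or tuple slot)
    # and any remaining dot-separated tokens are indices or real/imag.
    repaired <- vapply(strsplit(segments, ".", fixed = TRUE), function(tokens) {
      if (length(tokens) == 1) return(tokens)
      paste0(tokens[1], "[", paste(tokens[-1], collapse = ","), "]")
    }, character(1))
    # The ":" tuple separator is passed through unchanged.
    paste(repaired, collapse = ":")
  }, character(1), USE.NAMES = FALSE)
}

repair_name_sketch(c("beta.1.2", "b_tuple:1:1.2", "z.real", "nested:2:2.real"))
```

The sketch assumes Stan identifiers never contain `.` or `:`, so both characters can be treated purely as separators.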

variable_dims() — For non-numeric indices ("real", "imag", tuple
indices like "1:2"), counts unique values across all entries for that
dimension position instead of calling as.numeric().
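
The counting idea can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `dims_sketch()` is a hypothetical name, it only looks at the trailing `[...]` group of already-repaired names, and the sizes it returns are simply what the counting rule yields, not a claim about what `stan_variable_sizes` reports.

```r
# Hypothetical sketch; not cmdstanr's variable_dims().
dims_sketch <- function(var_names) {
  pattern <- "\\[([^\\[\\]]*)\\]$"  # trailing [..] index group
  has_index <- grepl(pattern, var_names, perl = TRUE)
  base <- sub(pattern, "", var_names, perl = TRUE)
  dims <- list()
  for (v in unique(base)) {
    indexed <- var_names[base == v & has_index]
    if (length(indexed) == 0) {
      dims[[v]] <- 1L  # scalar: no index group at all
      next
    }
    inner <- regmatches(indexed, regexpr(pattern, indexed, perl = TRUE))
    inner <- substr(inner, 2, nchar(inner) - 1)  # drop the brackets
    parts <- do.call(rbind, strsplit(inner, ",", fixed = TRUE))
    dims[[v]] <- apply(parts, 2, function(col) {
      if (all(grepl("^[0-9]+$", col))) {
        max(as.integer(col))   # numeric position: largest index
      } else {
        length(unique(col))    # e.g. "real"/"imag": count distinct labels
      }
    })
  }
  dims
}

dims_sketch(c("mu", "beta[1,1]", "beta[1,2]", "beta[2,1]", "beta[2,2]",
              "z[real]", "z[imag]"))
```

Calling `as.numeric()` on the `"real"`/`"imag"` column instead is what produced the coercion warning described in the Problem section.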

`R/utils.R`

New helper functions for bridging between Stan-level names and leaf names:

- stan_param_has_leaf(): Checks if Stan-level names have matching
  leaf names using ":" prefix matching.
- expand_stan_params_to_leaves(): Expands Stan-level names to
  their leaf equivalents for posterior::subset_draws().
- is_tuple_type(): Detects tuple parameters from model_variables
  type info ($type is a list for tuples).
- build_tuple_init_value(): Recursively reconstructs a nested
  named-list init value from flat leaf draws, also validating for NA/Inf.
- .extract_draw_value(): Shared helper for the draw extraction
  pipeline.

create_skeleton() — Expands tuple Stan-level names to leaf components
from param_metadata_ using "." prefix matching.
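
A rough sketch of the Stan-level-to-leaf bridging that the helpers above describe. The function names ending in `_sketch` are hypothetical and the signatures are assumptions; the real helpers may differ. The idea is to reduce each leaf name to its root (everything before the first `:` or `[`) and compare roots against the requested Stan-level names.

```r
# Hypothetical sketches; not the helpers added in R/utils.R.
leaf_root <- function(leaves) {
  # Strip everything from the first ":" (tuple) or "[" (index) onward.
  sub("(:|\\[).*$", "", leaves)
}

has_leaf_sketch <- function(params, leaves) {
  params %in% leaf_root(leaves)
}

expand_to_leaves_sketch <- function(params, leaves) {
  leaves[leaf_root(leaves) %in% params]
}

leaves <- c("mu", "theta[1]", "theta[2]",
            "b_tuple:1:1[1]", "b_tuple:1:1[2]", "b_tuple:2")

"b_tuple" %in% leaves                      # plain %in% misses the tuple
has_leaf_sketch(c("b_tuple", "missing"), leaves)
expand_to_leaves_sketch("b_tuple", leaves)
```

The plain `%in%` comparison on the first of the three calls is the kind of check that silently dropped tuple parameters before this PR.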

`R/fit.R`

unconstrain_draws() and unconstrain_variables() — Use
stan_param_has_leaf() for the zero-length parameter check and
expand_stan_params_to_leaves() for draw subsetting.

`R/args.R`

validate_fit_init() — Uses stan_param_has_leaf() instead of %in%.

process_init.draws() — Separates parameters into tuple and non-tuple.
For tuples, uses build_tuple_init_value() to reconstruct nested init
values. NA/Inf validation is done inline during extraction.
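
As an illustration of the reconstruction step, the sketch below rebuilds a nested named list from flat leaf values and rejects non-finite values along the way. It assumes each leaf's value has already been assembled (for example as a vector) and is keyed by its `:`-separated leaf name; `build_nested_sketch()` and `set_path()` are hypothetical names, not the PR's `build_tuple_init_value()`.

```r
# Hypothetical sketch; not the PR's build_tuple_init_value().
set_path <- function(node, path, value) {
  key <- path[1]
  if (length(path) == 1) {
    node[[key]] <- value
  } else {
    # Create the intermediate tuple level on first use, then recurse.
    child <- if (is.null(node[[key]])) list() else node[[key]]
    node[[key]] <- set_path(child, path[-1], value)
  }
  node
}

build_nested_sketch <- function(leaf_values, root) {
  prefix <- paste0(root, ":")
  out <- list()
  for (nm in names(leaf_values)[startsWith(names(leaf_values), prefix)]) {
    value <- leaf_values[[nm]]
    if (any(!is.finite(value))) {
      stop("Non-finite init value in '", nm, "'", call. = FALSE)
    }
    # "nested:2:1" -> path c("2", "1") below the root
    path <- strsplit(substring(nm, nchar(prefix) + 1), ":", fixed = TRUE)[[1]]
    out <- set_path(out, path, value)
  }
  out
}

leaf_values <- list(mu = 0, "nested:1" = 1, "nested:2:1" = 2, "nested:2:2" = c(3, 4))
str(build_nested_sketch(leaf_values, "nested"))
```

The result has the same shape as the nested JSON objects mentioned in the Summary: tuple slots become list elements named "1", "2", and so on.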

`R/data.R`

write_stan_json() — Detects tuple-style named lists (keys "1",
"2", ...) and processes them recursively instead of calling
list_to_array().

New helpers: is_tuple_list(), prepare_tuple_for_json().
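
The name-based detection could look roughly like the sketch below. This is an assumption about the heuristic based on the description above (keys "1", "2", ... in order), with a hypothetical function name; the actual `is_tuple_list()` may apply additional checks.

```r
# Hypothetical sketch of a name-based tuple check; not the PR's is_tuple_list().
is_tuple_list_sketch <- function(x) {
  is.list(x) && !is.data.frame(x) && length(x) > 0 &&
    !is.null(names(x)) &&
    identical(names(x), as.character(seq_along(x)))
}

is_tuple_list_sketch(list("1" = 1.5, "2" = c(1, 2)))      # tuple-style names
is_tuple_list_sketch(list(1:3, 4:6))                       # unnamed list
is_tuple_list_sketch(split(1:6, rep(1:2, each = 3)))       # split() output
```

Note that a purely name-based rule cannot tell a tuple from a list that merely happens to carry sequential names: `split()` names its result "1", "2" here, so the third call is classified the same way as the first.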

Tests

tests/testthat/resources/stan/tuple_complex.stan — Test model with
tuple and complex types as both parameters and generated quantities:
nested tuples, tuples with complex, arrays of tuples, complex vectors,
complex matrices, arrays of complex matrices.

tests/testthat/test-tuple-complex.R — 137 tests covering:

- Unit tests for repair_variable_names, unrepair_variable_names,
  variable_dims with all tuple/complex name patterns
- Helper function tests (stan_param_has_leaf, expand_stan_params_to_leaves)
- write_stan_json with tuple values
- build_tuple_init_value reconstruction
- End-to-end: sampling with no warnings, correct metadata, variable_skeleton(),
  unconstrain_draws(), init = fit round-trip, manual init lists

Test plan

- All 397 existing test-csv.R tests pass
- All 72 existing test-data.R tests pass
- All 15 existing test-fit-init.R tests pass
- 137 new test-tuple-complex.R tests pass
- Manual verification with three Stan models covering: scalar/array/matrix
  params, complex scalar/vector/matrix/array params, simple/nested/arrayed
  tuples, tuples as parameters and generated quantities

Copyright and Licensing

Please list the copyright holder for the work you are submitting
(this will be you or your assignee, such as a university or company):

Aki Vehtari

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

codecov-commenter · 2026-04-08T13:36:59Z

Codecov Report

❌ Patch coverage is 95.20548% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.23%. Comparing base (801b2b4) to head (db15c82).
⚠️ Report is 8 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| R/utils.R | 93.61% | 3 Missing ⚠️ |
| R/args.R | 95.00% | 2 Missing ⚠️ |
| R/data.R | 88.23% | 2 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff            @@
##           master    #1174    +/-   ##
========================================
  Coverage   91.23%   91.23%            
========================================
  Files          15       15            
  Lines        6070     6173   +103     
========================================
+ Hits         5538     5632    +94     
- Misses        532      541     +9


jgabry · 2026-04-08T21:13:13Z

Thanks Aki! This one might take me some time to review.

jgabry

I haven't reviewed the code yet but the failure on WSL is because

Error: Additional model methods are not currently available with WSL CmdStan and will not be compiled

Adding skip_if(os_is_wsl()) to those specific tests should fix it.

EDIT: oops I meant to "request changes" not "approve" yet

jgabry

I haven't reviewed the code yet but the failure on WSL is because a few of them error with

Error: Additional model methods are not currently available with WSL CmdStan and will not be compiled

Adding skip_if(os_is_wsl()) to those specific tests should fix it.

WardBrian · 2026-04-09T14:25:03Z

Just to add a note, it would be nice if some of these IO utilities were moved somewhere independent, so we could use them from e.g. bridgestan or similar (I know that having more packages is more CRAN pain...)
stan-dev/stanio#7

jgabry · 2026-04-14T18:06:20Z

I still haven't had a chance to go through this myself yet (I definitely will at some point), but I asked codex to take a look to see if it could find anything that's clearly an issue:

Comment on tuple detection in write_stan_json():

I don't think write_stan_json() can safely infer "tuple" from names alone here. This function has
historically treated lists as array-like containers, and it does not have model-type information to
tell whether a sequentially named list is actually meant to be a tuple.

For example, write_stan_json(list(x = list("1" = 1:3, "2" = 4:6)), ... ) used to serialize x as an
array, but with this change it becomes a JSON object with "1" and "2" keys. That looks like a
breaking change for existing code that happens to produce named lists via split(), setNames(), or
subsetting.

Other than backwards compatibility, I don't think those names add useful semantics for Stan arrays
anyway. If we want tuple serialization here, I think it needs to be driven by model metadata or an
explicit opt-in, not by a name heuristic.

Comment asking for a regression test around the ambiguity:

Can we add a regression test for the ambiguous non-tuple case too? Right now the new tests cover tuple
serialization, but they don't cover the case where a sequentially named list should still behave like
an array.

For example, I'd want write_stan_json(list(x = list("1" = 1:3, "2" = 4:6)), ...) to keep matching
the existing array behavior unless we have explicit type information saying x is a tuple. Without
that test, it's easy to lock in the new heuristic and accidentally break old callers.

Comment on tuple root variable filtering:

I think tuple support is still incomplete in the variable-filter path. The helpers used by
fit$draws(), fit$print(), and read_cmdstan_csv(..., variables = ...) still only expand name[
prefixes, so tuple roots like b_tuple, pair, or nested won't resolve to their name:... leaves.

For example, if the CSV contains variables like b_tuple:1:1[1], b_tuple:1:1[2], and
b_tuple:2[1,1], I'd expect fit$draws(variables = "b_tuple") to return those columns, just like
fit$draws(variables = "theta") expands to theta[1], theta[2], etc. But with the current matching
logic, "b_tuple" is treated as not found.

I'd update the same prefix-matching logic here to treat name: as another expansion case.
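
A minimal sketch of the suggested expansion rule, treating `name:` as a prefix alongside `name[`. The function name is hypothetical and this is not the matching code used by fit$draws() or read_cmdstan_csv(); it only illustrates the rule described in the comment above.

```r
# Hypothetical sketch of variable matching with a tuple-aware prefix rule.
match_variables_sketch <- function(requested, available) {
  unlist(lapply(requested, function(v) {
    hit <- available == v |
      startsWith(available, paste0(v, "[")) |  # array/matrix elements
      startsWith(available, paste0(v, ":"))    # tuple leaves
    available[hit]
  }))
}

available <- c("lp__", "theta[1]", "theta[2]",
               "b_tuple:1:1[1]", "b_tuple:1:1[2]", "b_tuple:2[1,1]",
               "b_tuple_extra")
match_variables_sketch("b_tuple", available)
```

Matching on `name:` and `name[` rather than on a bare `name` prefix keeps an unrelated variable such as the made-up `b_tuple_extra` from being picked up.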

I haven't verified yet that these are all actually issues. But if the first one (tuple detection in write_stan_json()) is true, then I guess we could break backwards compatibility here because we're going to do v1.0. We could say that any list with names will be interpreted as a tuple. It's not ideal, but it's doable if there's not another solution.

fix: handle Stan tuple and complex types

270970a

avehtari requested a review from jgabry April 8, 2026 11:49

Merge branch 'master' into fix-925-handle-tuple-and-complex-types

f826a11

jgabry mentioned this pull request Apr 8, 2026

NA dimensions for complex and tuple types #925

Open

jgabry approved these changes Apr 8, 2026

View reviewed changes

jgabry self-requested a review April 8, 2026 21:19

jgabry requested changes Apr 8, 2026

View reviewed changes

test: skip two model methods tests if os_is_wsl()

db15c82

avehtari wants to merge 3 commits into master from fix-925-handle-tuple-and-complex-types


4 participants
