STRAND is an R package designed for simulation and analysis of network data. The package can be used to simulate network data under stochastic block models, social relations models, and latent network models. These tools not only simulate realistic human and animal social networks, but also let researchers simulate the effects of potential biases (such as respondents falsely reporting ties or failing to recall real ties) on network-level properties. We also provide tools for Bayesian network-based diffusion analysis, downstream nodal regressions, and other network analysis protocols.
STRAND is focused mainly on users aiming to conduct network data analysis. Users can specify complex Bayesian social network models using a simple, lm-style syntax. Single-sampled self-report data can be modeled using stochastic block models or the Social Relations Model, with or without covariates. Double-sampled network data can be modeled using a latent network approach that accounts for inter-respondent disagreement. STRAND also provides methods for more rigorously dealing with missing data and measurement error. We have recently added support for multiplex network models, dimension reduction models, and longitudinal network models. Here we provide a brief overview of various STRAND workflows. For further details, see our full publications at:
- Psychological Methods (for latent network models)
- Journal of Animal Ecology (for simple network models)
- Methods in Ecology and Evolution (for censoring-robust methods in ecology)
- Royal Society Open Science (for multiplex network models)
- SocArXiv (for longitudinal network models)
STRAND is part of an ecosystem of tools for modern social network analysis. DieTryin is a companion package designed to facilitate the collection of RICH economic games, dyadic peer ratings, and roster-based network data in human communities. XLSFormulatoR is a package for automatically building name-generator network surveys for KoboToolbox.
Install by running the following in R:
# Install the latest release
library(devtools)
install_github('ctross/STRAND@beauty_in_the_dissonance')
library(STRAND)
You will need to have cmdstanr installed. We are also slowly building up NumPyro support for all of our models. STRAND will remain an R-based front-end, but the JAX back-end via NumPyro in Python allows our highly parameterized models to be fit with substantially shorter run-times (NumPyro models are typically 10x to 50x faster than Stan models).
Quickstart guides for Stan can be found here.
STRAND calls Stan models in the background, so you will need a C++ compiler in your toolchain. Windows users frequently rely on RTools. For NumPyro you will need Python, NumPyro, and JAX, which are called under the hood from R via reticulate.
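A quick, read-only way to see which of these prerequisites are visible from an R session is sketched below. It uses only base R and installs nothing. The compiler names it looks for (`g++`, `clang++`) are assumptions and may not match every toolchain; for example, an RTools compiler that is not on the PATH will be reported as missing even though Stan may still find it.

```r
# Report which back-end prerequisites are visible from this R session.
# The checks are read-only: nothing is installed or modified.
prereqs <- c(
  # Assumption: the C++ compiler is exposed on the PATH as g++ or clang++
  cpp_compiler = any(nzchar(Sys.which(c("g++", "clang++")))),
  # R packages needed for the Stan back-end and for the Python bridge
  cmdstanr     = requireNamespace("cmdstanr", quietly = TRUE),
  reticulate   = requireNamespace("reticulate", quietly = TRUE)
)
print(prereqs)
```

Any `FALSE` entry points at a component to install before fitting models with the corresponding back-end.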
Example models, with test data, can be found here. Note that analysis of large network data sets can be quite slow using MCMC. An old adage, however, is that we often spend months or years collecting our data, and so we should be happy waiting a few hours for our models to fit. Our tutorials below are designed to accomplish a few things:
- We teach users how to fit different kinds of network models using example data sets.
- We compare STRAND to other software packages for network analysis, and comment on similarities and differences.
- We try to clarify any misconceptions potential users might have about Bayesian network analysis models. For example, we show that Bayesian models with proper priors are always identifiable, that probit and logit models are essentially equivalent, and that STRAND's implementation of the SRM (Social Relations Model) is a measurement-error-robust generalization of the original SRM.
- Note that many of the examples below use short MCMC runs (500 warmup, 500 samples) of a single chain, just so that the tutorials run quickly. When using these models on real data, we recommend running several chains for at least 1000 warmup and 1000 sampling iterations each. Always check the fit slot of the STRAND fit object, and then use base Stan or JAX summary functions to check rhat and effective sample size.
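To make the rhat recommendation concrete, here is a minimal base R sketch of the split-chain potential scale reduction factor. It is an illustration of what the diagnostic measures, not STRAND's code, and in practice the summary functions shipped with Stan or JAX should be used instead. The function name `split_rhat` and the iterations-by-chains matrix layout are assumptions made for this example.

```r
# Split-chain Rhat for one parameter.
# `draws` is an iterations x chains matrix of posterior samples.
split_rhat <- function(draws) {
  draws <- as.matrix(draws)
  half <- floor(nrow(draws) / 2)
  # Split every chain in half so that within-chain drift is also detected
  halves <- cbind(
    draws[1:half, , drop = FALSE],
    draws[(nrow(draws) - half + 1):nrow(draws), , drop = FALSE]
  )
  n <- nrow(halves)
  W <- mean(apply(halves, 2, var))   # mean within-chain variance
  B <- n * var(colMeans(halves))     # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(1)
good <- matrix(rnorm(4000), nrow = 1000, ncol = 4)  # four chains, same target
bad <- good
bad[, 1] <- bad[, 1] + 3                            # one chain stuck elsewhere
print(c(good = split_rhat(good), bad = split_rhat(bad)))
```

Values close to 1 indicate that the chains agree; the shifted chain in `bad` pushes the statistic well above 1. A single chain can only reveal within-chain drift, which is one reason to run several chains on real data.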
The most basic network analysis models are generalizations of the SRM that account for dyadic reciprocity (i.e., correlation in dyadic connections) and generalized reciprocity (i.e., correlations in nodal degree) through correlated dyadic and node-level random effects. Users can further account for focal, target, dyadic, and block-level covariates using simple lm-style syntax. The tutorial code below can be adapted as needed for users' own projects:
- Binary/Bernoulli outcomes (using human friendship data from rural Colombia): Bernoulli models
- Binomial outcomes (using data from baboons): Binomial models
- Binomial outcomes with censoring bias (tested using simulated data): Binomial + ME models
- Poisson outcomes (using data from vampire bats): Poisson models
- Gaussian outcomes (using simulated data): Gaussian models
- Undirected networks (tested using simulated data): Undirected networks
- Deploy the SRM using NumPyro (tested using simulated data): How to fit models using NumPyro/JAX back-end
- Deploy the SRM using NumPyro (empirical data): NumPyro for the Colombian Friendship Data
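The two kinds of reciprocity described above can be illustrated with a small generative sketch in base R. This is not STRAND's simulation code: the variable names, parameter values, and the logit link are assumptions chosen for the example. Sender and receiver effects are drawn as a correlated pair per node (generalized reciprocity), and the two directed effects within each dyad are drawn as a correlated pair (dyadic reciprocity).

```r
set.seed(42)
N <- 60                # number of individuals
rho_gen <- 0.5         # correlation of sender and receiver effects (assumed)
rho_dyad <- 0.7        # correlation of d[i,j] and d[j,i] (assumed)
intercept <- -2

# Draw n pairs of standard normals with correlation rho
rbvnorm <- function(n, rho) {
  z1 <- rnorm(n)
  z2 <- rnorm(n)
  cbind(z1, rho * z1 + sqrt(1 - rho^2) * z2)
}

# Node-level random effects: one (sender, receiver) pair per individual
node <- rbvnorm(N, rho_gen)
sender <- 1.0 * node[, 1]
receiver <- 0.8 * node[, 2]

# Dyad-level random effects: one correlated pair per unordered dyad
pairs <- 1.2 * rbvnorm(N * (N - 1) / 2, rho_dyad)
D <- matrix(0, N, N)
D[upper.tri(D)] <- pairs[, 1]      # effect on the i -> j tie (i < j)
Dt <- matrix(0, N, N)
Dt[upper.tri(Dt)] <- pairs[, 2]
D <- D + t(Dt)                     # effect on the j -> i tie

# Linear predictor: eta[i, j] = intercept + sender[i] + receiver[j] + D[i, j]
eta <- intercept + outer(sender, receiver, "+") + D
Y <- matrix(rbinom(N * N, 1, plogis(eta)), N, N)
diag(Y) <- 0                       # no self-ties

# Both kinds of reciprocity leave a trace in the simulated network
print(c(
  degree_cor = cor(rowSums(Y), colSums(Y)),
  dyad_cor   = cor(Y[upper.tri(Y)], t(Y)[upper.tri(Y)])
))
```

Focal, target, and dyadic covariates would enter the same linear predictor as additional terms alongside the random effects.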
Networks typically influence each other. Users may wish to study if outgoing ties in one network layer are predictive of incoming ties in a different layer. To model such structure, we provide multiplex generalizations of the SRM for various types of outcomes. Users can still estimate the effects of focal, target, dyadic, and block predictors within each layer, while also estimating residual correlations in random effects within and across network layers at both a generalized and dyadic level (see our papers at Royal Society Open Science and Communications Psychology for a primer on these models).
- Multiplex Binary/Bernoulli outcomes (using RICH economic games): Multiplex Bernoulli models
- Multiplex Binomial outcomes (using baboon data): Multiplex Binomial models
- Multiplex Poisson outcomes (using simulated data): Multiplex Poisson models
- Multiplex Gaussian outcomes (using simulated data): Multiplex Gaussian models
- Multiplex models with undirected layers (tested using simulated data): Undirected multiplex networks
- Deploy the Multiplex SRM using NumPyro (tested using simulated data): How to fit multiplex models using NumPyro/JAX back-end
- Deploy the Multiplex SRM using NumPyro (empirical data): NumPyro for the RICH games data
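As a rough illustration of cross-layer structure, the base R sketch below simulates two layers whose node-level random effects are correlated across layers, so that sending in layer A predicts receiving in layer B. It is a simplified sketch rather than STRAND's generative model: dyadic effects are omitted for brevity, and the correlation values and names (`send_A`, `recv_B`, `simulate_layer`) are assumptions made for the example.

```r
set.seed(7)
N <- 80
labs <- c("send_A", "recv_A", "send_B", "recv_B")

# Correlation matrix for the four node-level random effects (assumed values)
R <- diag(4)
dimnames(R) <- list(labs, labs)
R["send_A", "recv_B"] <- R["recv_B", "send_A"] <- 0.6  # cross-layer link
R["send_A", "recv_A"] <- R["recv_A", "send_A"] <- 0.3  # within layer A
R["send_B", "recv_B"] <- R["recv_B", "send_B"] <- 0.3  # within layer B

# Correlated effects via the Cholesky factor: cov(Z %*% chol(R)) = R
effects <- matrix(rnorm(N * 4), N, 4) %*% chol(R)
colnames(effects) <- labs

# One Bernoulli layer from sender and receiver effects (no dyadic effects)
simulate_layer <- function(sender, receiver, intercept = -2) {
  eta <- intercept + outer(sender, receiver, "+")
  Y <- matrix(rbinom(length(eta), 1, plogis(eta)), nrow(eta), ncol(eta))
  diag(Y) <- 0
  Y
}

Y_A <- simulate_layer(effects[, "send_A"], effects[, "recv_A"])
Y_B <- simulate_layer(effects[, "send_B"], effects[, "recv_B"])

# Out-degree in layer A should track in-degree in layer B
cross_layer <- cor(rowSums(Y_A), colSums(Y_B))
print(cross_layer)
```

A multiplex model works in the other direction: it estimates a correlation matrix like `R` (plus its dyadic counterpart) from the observed layers.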
Sometimes we have multiple network layers of observations that are reflective of a single underlying social network. In these models, we use the SRM to estimate a single latent network, and then estimate loadings of the observation layers onto that latent network.
- Multiplex dimension-reduction in Stan (using simulated data): Multiplex dimension reduction models
- Deploy the Multiplex dimension-reduction model using NumPyro (tested using simulated data): How to fit multiplex dimension-reduction models using NumPyro/JAX back-end
STRAND now supports longitudinal network analysis. These models draw on a multiplex network analysis framework to study network evolution over time. Longitudinal network models can be thought of as multiplex models with additional symmetries that arise from assuming transportability of some effects across time-steps. See details in our SocArXiv preprint. These methods are currently passing the basic tests, but are considered experimental. We appreciate any feedback or test reports.
- Longitudinal Binary/Bernoulli outcomes (using Colombian friendship data): Longitudinal Bernoulli models
- Longitudinal Binomial outcomes (using baboon data): Longitudinal Binomial models
- Longitudinal simulation analysis (using simulated data): Longitudinal Generative Simulations
- Deploy the Longitudinal SRM using NumPyro (tested using simulated data): How to fit longitudinal models using NumPyro/JAX back-end
- Deploy the Longitudinal SRM using NumPyro (using the baboon data): NumPyro for the baboon data
Respondents often disagree on ties. Alice might report giving to Bob, but Bob might not report receiving from Alice. To make sense of network data with such potential discordance, we integrate a reporting model into the SRM. Users can explicitly model the predictors of false positive reports, the recall rate of true ties, and inter-question duplication rates, as well as all of the standard predictors using lm-style syntax.
- Double-sampled binary outcomes (self-reported food/money sharing from rural Colombia): Latent network models
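The reporting process described above can be sketched in a few lines of base R. The example simulates a true network, then lets givers and receivers each report on it with imperfect recall and occasional false positives. The parameter values and names (`recall`, `false_pos`, `report`) are assumptions for illustration, and question duplication is left out; this is not STRAND's reporting model.

```r
set.seed(11)
N <- 50
true_net <- matrix(rbinom(N * N, 1, 0.1), N, N)
diag(true_net) <- 0

recall <- 0.8       # probability that a true tie is reported (assumed)
false_pos <- 0.02   # probability that an absent tie is reported (assumed)

# One noisy set of reports about the true network
report <- function(truth) {
  p <- ifelse(truth == 1, recall, false_pos)
  r <- matrix(rbinom(length(truth), 1, p), nrow(truth), ncol(truth))
  diag(r) <- 0
  r
}

giver_report <- report(true_net)     # [i, j]: i says "I gave to j"
receiver_report <- report(true_net)  # [i, j]: j says "I received from i"

off <- row(true_net) != col(true_net)  # ignore the diagonal
print(c(
  true_density = mean(true_net[off]),
  union        = mean((giver_report | receiver_report)[off]),
  intersection = mean((giver_report & receiver_report)[off]),
  discordant   = mean((giver_report != receiver_report)[off])
))
```

Taking the union or the intersection of the two reports is a common shortcut, but each is a biased summary of the true network whenever recall is imperfect or false positives occur, which is the motivation for modeling the reporting process explicitly.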
STRAND now supports automatic Bayesian imputation for continuous predictor variables, and automatically slices missing outcomes out of the likelihood. Block predictors are currently imputed deterministically prior to model fitting via columnwise resampling (a more rigorous method may be integrated in the future). These methods are currently passing the basic unit tests, but are considered experimental. We appreciate any feedback or test reports. Once these models go through a decent burn-in period, we will push this functionality to the base functions.
- Binary/Bernoulli outcomes with missings: Single-layer models with imputation
- Binomial outcomes with missings: Binomial + censoring models with imputation
- Double-sampled latent network models with missings: Latent network models with imputation
- Multiplex Poisson outcomes with missings: Multiplex models with imputation
- Longitudinal Bernoulli outcomes with missings: Longitudinal models with imputation
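Two of the ideas above, columnwise resampling of block predictors and slicing missing outcomes out of the likelihood, are simple enough to sketch in base R. The function name `impute_columnwise`, the toy data, and the constant tie probability are assumptions for illustration only; STRAND performs these steps internally and its implementation may differ.

```r
set.seed(3)

# Block predictors with missing entries (toy data)
blocks <- data.frame(
  ethnicity = factor(c("A", "B", NA, "A", "B", "B", NA, "A")),
  sex       = factor(c("F", NA, "M", "M", "F", "F", "M", NA))
)

# Columnwise resampling: fill each missing cell with a random draw
# from the observed values of the same column
impute_columnwise <- function(df) {
  for (col in names(df)) {
    miss <- is.na(df[[col]])
    if (any(miss)) {
      obs <- df[[col]][!miss]
      df[[col]][miss] <- obs[sample.int(length(obs), sum(miss), replace = TRUE)]
    }
  }
  df
}
blocks_imputed <- impute_columnwise(blocks)

# Missing outcomes: evaluate the likelihood only over observed dyads
Y <- matrix(c(NA, 1,  0,
               0, NA, NA,
               1, 0,  NA), 3, 3, byrow = TRUE)
p <- matrix(0.3, 3, 3)                        # assumed tie probabilities
observed <- !is.na(Y) & row(Y) != col(Y)
loglik <- sum(dbinom(Y[observed], 1, p[observed], log = TRUE))
print(loglik)
```

Dropping unobserved dyads from the likelihood means they contribute no information, rather than being treated as zeros.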
Often, users are not only interested in networks as outcomes, but also want to know how nodal out-flow and in-flow propensities are linked to downstream nodal outcomes. To support this, we have added a function for running downstream nodal regressions. In this workflow, the user first fits a network model. Then, they run a downstream regression using the estimated node-level random effects as predictors of the downstream outcome. This two-step model propagates uncertainty by running a measurement error model on the nodal random effects. We are currently preparing a brief methods note on this model.
- Predicting downstream outcomes from nodal random effects: Downstream nodal regressions
How do networks structure the evolution of binary traits? A classic method here is NBDA (network-based diffusion analysis). We provide a generalized version of NBDA that incorporates focal, target, and dyadic predictors of social learning rate.
- Predicting the evolution of binary traits: Network-based diffusion analysis
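For readers new to NBDA, the base R sketch below simulates the classic form of the process in discrete time: a naive individual acquires the trait at a rate that grows with the number of informed network partners. The parameter values and names are assumptions for illustration. STRAND's generalized version additionally lets focal, target, and dyadic predictors shape the social learning rate, which this sketch holds constant at `s`.

```r
set.seed(5)
N <- 40
A <- matrix(rbinom(N * N, 1, 0.1), N, N)  # A[i, j] = 1: i can learn from j
diag(A) <- 0

base_rate <- 0.01   # asocial acquisition rate per time step (assumed)
s <- 5              # social transmission strength (assumed, constant)

informed <- rep(FALSE, N)
informed[1] <- TRUE                       # one seeded demonstrator
time_acquired <- rep(NA_integer_, N)
time_acquired[1] <- 0L

for (t in 1:200) {
  # Number of informed partners each individual is connected to
  exposure <- as.vector(A %*% as.numeric(informed))
  # Classic NBDA rate: asocial baseline scaled up by social exposure
  rate <- base_rate * (1 + s * exposure)
  p_acquire <- 1 - exp(-rate)             # probability within one time step
  newly <- !informed & (runif(N) < p_acquire)
  time_acquired[newly] <- t
  informed <- informed | newly
}

print(table(informed))
print(head(sort(time_acquired), 10))      # earliest acquisition times
```

Fitting an NBDA reverses this logic: given the observed order or times of acquisition and the network, it estimates `s` relative to the asocial baseline.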
Below we provide a few more pointed tutorials showing how to deal with specific issues, like structural zeros, prior specification, data simulation, and comparison of STRAND to other tools. We also show a few other things about STRAND models, e.g., probit and logit links yield equivalent inference, binary SRM models are well-specified in a Bayesian framework, etc. We will also include some minimum working examples to address some common questions we get via email.
- An example on computing posterior distributions for network metrics like betweenness, centrality, transitivity, etc.: Posterior distributions for network metrics
- An example on how the predicted network is affected if a mask layer is applied: Predicted Network Example
- An example on both simulating and fitting networks (includes interactions): Simulating data and fitting interaction models
- An example on both simulating and fitting multiplex networks: Simulating data and fitting multiplex models
- An example on accounting for structural zeros or censored data. For instance, there may be N groups of individuals in a dataset, where each group is in a separate enclosure, and thus only within-group ties can be modeled: Structural Zeros
- An example on interactions between focal (sender), target (receiver), and dyadic effects: Between-level interactions
- An example on changing default priors: Changing priors
- Probit versus logit links for binary outcomes: Does anything depend on the choice of link function?
- Probit versus logit links for multiplex binary outcomes: Does anything depend on the choice of link function?
- STRAND's Gaussian SRM has both dyad-level random effects and dyad-level error. Can we still recover parameters? Yes.
- STRAND's probit/logit SRM has both dyad-level random effects and dyad-level error. Can we still recover parameters? Yes.
- Multiplex network modeling requires estimation of a highly structured dyadic correlation matrix with special symmetries. We have two approaches for constructing such block-structured matrices, one based on an $\ell^{2}$ norm penalty, and one based on constructing a Cholesky factor with a special set of constraints. Here we show that both methods are equally effective: Different methods for estimating dyadic reciprocity
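The probit-versus-logit point in the list above can be previewed outside of STRAND with a plain `glm` comparison. This is a frequentist, non-network sketch on simulated data, offered only to build intuition; the sample size and coefficients are assumptions. The two links put coefficients on different scales, but the fitted probabilities are nearly indistinguishable.

```r
set.seed(9)
n <- 2000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))   # data generated on the logit scale

fit_logit  <- glm(y ~ x, family = binomial(link = "logit"))
fit_probit <- glm(y ~ x, family = binomial(link = "probit"))

# Coefficients differ by a roughly constant scale factor between the links
ratio <- coef(fit_logit) / coef(fit_probit)
print(ratio)

# Fitted probabilities barely differ
max_gap <- max(abs(fitted(fit_logit) - fitted(fit_probit)))
print(max_gap)
```

Because the links differ mainly by a rescaling of the linear predictor, substantive conclusions about the direction and relative size of effects should not hinge on which one is chosen.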
If you use STRAND, please cite us using the most relevant paper:
@article{ross2024modelling,
title={Modelling animal network data in R using STRAND},
author={Ross, Cody T and McElreath, Richard and Redhead, Daniel},
journal={Journal of Animal Ecology},
volume={93},
number={3},
pages={254--266},
year={2024},
publisher={Wiley Online Library}
}
@article{ross2025bayesian,
title={Bayesian multiplex network models in R using STRAND: Methods for biologists and social scientists},
author={Ross, Cody T and Kajokaite, Kotrina and Pinkney, Sean and Sosa, Sebastian},
journal={Royal Society Open Science},
volume={12},
number={10},
pages={250555},
year={2025},
publisher={The Royal Society}
}
@article{redhead2023reliable,
title={Reliable network inference from unreliable data: A tutorial on latent network modeling using STRAND},
author={Redhead, Daniel and McElreath, Richard and Ross, Cody T},
journal={Psychological Methods},
year={2023},
volume={29},
number={6},
pages={1100--1122},
publisher={American Psychological Association}
}
@article{sosa2024robust,
author={Sosa, Sebastian and McElreath, Mary B. and Redhead, Daniel and Ross, Cody T.},
title={Robust Bayesian analysis of animal networks subject to biases in sampling intensity and censoring},
journal={Methods in Ecology and Evolution},
pages={1--22},
doi={10.1111/2041-210X.70017}
}
Each of the models included in this package has been fit to real empirical datasets and tested across a wide range of simulated data to ensure its quality. However, some models are still rather new. If you come across any weird behavior, or notice any bugs, typos, or other problems, please open an issue on GitHub, and we will work to address it!
Additionally, our settings are such that anyone can issue pull requests. If you notice any typos in the documentation, want to add functionality, or feel like you can add something else useful, please submit a pull request with your proposed changes. We will inspect the changes closely and integrate them if they are helpful. Publicly opening issues for your questions or concerns helps the whole community.
