SurveyLand is an R Shiny application developed by the Division of Research and Methodology at the National Center for Health Statistics (NCHS) to improve the efficiency, transparency, and reproducibility of survey data workflows, while making insights more accessible. It provides a user-friendly, stepwise workflow for preparing survey data, specifying complex survey design features, generating tables and plots, and exporting presentation-ready outputs aligned with NCHS data presentation standards.
Although SurveyLand can be used with a wide range of datasets, it’s designed to support the unique requirements of complex survey data analysis, including clustering, stratification, and weighting. It is intended for users who need design-based estimates and:
- outputs aligned with NCHS data presentation standards
- quick export to Word, Excel, or image formats
- a guided interface for preparing and analyzing data
This project is still in development. Features and documentation are subject to change.
Supported file formats
| Format | Extension(s) |
|---|---|
| Comma-separated values | .csv |
| Excel | .xlsx, .xls |
| SAS | .sas7bdat |
| SPSS | .sav |
| R data | .RData, .rdata, .Rda, .rda, .rds, .RDS, .Rds |
Files up to 300 MB are supported.
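The format table above can be read as a lookup from file extension to import function. The sketch below is one way to express it; `pick_reader` and `within_limit` are hypothetical helpers, and the choice of reader for each format is an assumption based on the packages listed under installation, not SurveyLand's actual import code.

```r
# Hypothetical helper: map a file extension to the name of a reader function.
# The mapping is an assumption for illustration, not SurveyLand's actual code.
pick_reader <- function(path) {
  ext <- tolower(tools::file_ext(path))  # case-insensitive: .RDS, .Rds, .rds
  switch(ext,
    csv      = "utils::read.csv",
    xlsx     = ,
    xls      = "readxl::read_excel",
    sas7bdat = "haven::read_sas",
    sav      = "haven::read_sav",
    rdata    = ,
    rda      = "base::load",
    rds      = "base::readRDS",
    stop("Unsupported file extension: ", ext)
  )
}

# Size check mirroring the 300 MB limit stated above.
max_bytes <- 300 * 1024^2
within_limit <- function(path) file.size(path) <= max_bytes

pick_reader("survey.SAV")  # "haven::read_sav"
```

In a Shiny app, an upload limit like this is commonly set with `options(shiny.maxRequestSize = 300 * 1024^2)`; whether SurveyLand does so is an assumption.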
Enter survey metadata that auto-populates table titles, figure titles, and data source captions, including: data producer, survey name, survey round or cycle, field dates, and geographic area.
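As a rough illustration of how metadata fields might feed titles and captions, the sketch below builds both from a single list. The field names, the example values, and the wording of the title and caption are all hypothetical; the layout SurveyLand actually uses may differ.

```r
# Hypothetical metadata fields and caption layout, for illustration only.
meta <- list(
  producer = "Example Agency",
  survey   = "Example Health Survey",
  round    = "Round 1",
  dates    = "January-March 2024",
  area     = "United States"
)

# Build a table or figure title from the outcome label and the metadata.
make_title <- function(outcome, meta) {
  sprintf("%s: %s, %s, %s", outcome, meta$area, meta$survey, meta$dates)
}

# Build the data source caption shown beneath tables and figures.
make_source <- function(meta) {
  sprintf("Data source: %s, %s, %s.", meta$producer, meta$survey, meta$round)
}

make_title("Current smoking status", meta)
make_source(meta)
```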
Filter data by up to two variables before analysis.
Specify survey design features including cluster/PSU, strata, and weight variables, if applicable, to account for complex survey design.
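For readers unfamiliar with how these three design features are declared in R, here is a minimal sketch using the `survey` package, which is among the packages the application installs. The toy data and its column names (`psu`, `stratum`, `wt`, `smoker`) are hypothetical, and this is not necessarily how SurveyLand builds its design object internally.

```r
library(survey)

# Toy data; all column names here are hypothetical.
toy <- data.frame(
  psu     = c(1, 1, 2, 2, 3, 3, 4, 4),
  stratum = c("A", "A", "A", "A", "B", "B", "B", "B"),
  wt      = c(10, 10, 12, 12, 8, 8, 9, 9),
  smoker  = c(1, 0, 0, 1, 1, 1, 0, 0)
)

# Clusters (PSUs) nested within strata, with sampling weights.
design <- svydesign(
  ids     = ~psu,
  strata  = ~stratum,
  weights = ~wt,
  data    = toy,
  nest    = TRUE
)

# Design-based estimate of the proportion and its standard error.
est <- svymean(~smoker, design)
est
```

Each stratum in the toy data contains two PSUs, which is the minimum needed for the default variance estimator to work without further options.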
Three analysis types are supported:
| Type | Description |
|---|---|
| One-way (single-variable) | Distribution or summary statistics for a single outcome variable |
| Two-way (bi-variable) | Cross-tabulation of an outcome variable by a covariate |
| Multivariable | Distributions across multiple variables sharing the same response options |
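The three analysis types correspond roughly to the following calls in the `survey` package. This is a sketch using the `api` example data that ships with that package, so the variable names come from that dataset rather than from SurveyLand, and the application's own estimation code may be organized differently.

```r
library(survey)
data(api)  # example data shipped with the survey package

# Stratified design with weights and a finite population correction.
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# One-way: distribution of a single categorical outcome.
one_way <- svymean(~awards, dstrat)

# Two-way: the same outcome cross-tabulated by a covariate.
two_way <- svyby(~awards, ~stype, dstrat, svymean)

# Multivariable: several variables that share the same Yes/No options.
vars <- c("sch.wide", "comp.imp", "awards")
multi <- lapply(vars, function(v) svymean(reformulate(v), dstrat))
names(multi) <- vars
```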
Generate tabulated, formatted, and rounded survey estimates for one-way,
two-way, and multivariable analyses as flextable objects.
Variable type detection
SurveyLand automatically determines whether each selected variable is continuous or categorical. A variable is treated as continuous if it is numeric, not labelled, and has more than 10 unique non-missing values. Otherwise, it is treated as categorical. This affects the type of table output generated:
- Continuous variables: tables display summary statistics, including the mean of known values (Mean), standard error of the mean (SEM), and standard deviation (SD)
- Categorical variables: tables display the estimated percentage (Percent) and standard error (SE)
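The detection rule above can be sketched in a few lines of base R. `is_continuous` is a hypothetical helper, and treating "labelled" as "has class `haven_labelled` or carries a `labels` attribute" is an assumption about how the check is done.

```r
# Hypothetical helper implementing the rule described above.
is_continuous <- function(x) {
  # Assumption: "labelled" means a haven_labelled vector or a "labels" attribute.
  is_labelled <- inherits(x, "haven_labelled") || !is.null(attr(x, "labels"))
  if (!is.numeric(x) || is_labelled) return(FALSE)
  # Continuous only with more than 10 unique non-missing values.
  length(unique(x[!is.na(x)])) > 10
}

is_continuous(c(1:20, NA))       # TRUE: numeric with 20 unique values
is_continuous(c(1, 2, 3, 2, 1))  # FALSE: only 3 unique values
is_continuous(letters)           # FALSE: not numeric
```

Note that a numeric variable with exactly 10 unique values is treated as categorical under this rule.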
Table formatting conventions:
Optionally, suppress low-precision estimates according to the NCHS data presentation standards using the surveytable package, which implements the methods described in Parker et al. (2017) and Parker et al. (2023).
| Symbol | Meaning |
|---|---|
| * | Estimate does not meet NCHS standards of reliability and has been suppressed |
| 0.0 | Quantity is greater than zero but less than 0.05 |
| --- | Data not available (missing) |
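The symbol conventions can be illustrated with a small formatter. `format_estimate` is a hypothetical function, and rounding to one decimal place is an assumption inferred from the 0.05 threshold; the reliability check itself is performed by surveytable and is represented here only by a `suppressed` flag.

```r
# Hypothetical formatter for an estimate shown to one decimal place.
format_estimate <- function(value, suppressed = FALSE) {
  if (is.na(value)) return("---")           # data not available
  if (suppressed) return("*")               # fails reliability standards
  formatC(value, format = "f", digits = 1)  # values below 0.05 print as "0.0"
}

format_estimate(12.345)                  # "12.3"
format_estimate(0.04)                    # "0.0"
format_estimate(30, suppressed = TRUE)   # "*"
format_estimate(NA)                      # "---"
```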
All tables include a footnote with: the SE definition, applicable symbol definitions, total number of complete cases (n), a rounding note for categorical tables, and the data source caption from Step 2.
Export options: PNG image, Word document (.docx), Excel workbook (.xlsx)
Generate bar charts for one-way, two-way, and multivariable analyses with options for customization, including:
- Custom text labels
- Value labels
- Multiple ggplot2 themes, and a custom NCHS-style theme
- Axis orientation (horizontal/vertical) and bar style (stacked or side-by-side)
Export option: PNG image
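To show how the customization options map onto ggplot2, here is a minimal sketch of a two-way bar chart. The estimates are made-up values for illustration, and `theme_minimal()` stands in for the custom NCHS-style theme, which is not reproduced here.

```r
library(ggplot2)

# Hypothetical estimates; in practice these would come from the survey tables.
est <- data.frame(
  response = rep(c("Yes", "No"), times = 2),
  group    = rep(c("Group A", "Group B"), each = 2),
  percent  = c(62.5, 37.5, 48.0, 52.0)
)

p <- ggplot(est, aes(x = response, y = percent, fill = group)) +
  geom_col(position = "dodge") +                      # side-by-side bars
  geom_text(aes(label = sprintf("%.1f", percent)),    # value labels
            position = position_dodge(width = 0.9), hjust = -0.2) +
  coord_flip() +                                      # horizontal orientation
  ylim(0, 100) +
  labs(title = "Example title", x = NULL, y = "Percent", fill = NULL) +
  theme_minimal()
```

Switching `position = "dodge"` to `position = "stack"` gives stacked bars, and dropping `coord_flip()` gives vertical bars. A PNG can then be written with, for example, `ggsave("bar_chart.png", p, width = 6, height = 4)`.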
Generate and download a formatted Word document report that contains selected plots and survey metadata using Quarto.
- Upload a supported data file
- Enter survey metadata for titles and data source captions
- Optionally filter the dataset
- Specify survey design and weighting variables
- Choose an analysis type
- Generate tables and/or plots
- Export outputs or add plots to a report
- Render and download the report
1. Install Quarto CLI
Report generation requires the Quarto command-line tools. Download and install them before running the application.
2. Install R packages
```r
install.packages(c(
  "shiny", "shinyFeedback", "shinyjs",            # Shiny and UI utilities
  "haven", "readxl",                              # Data import
  "tidyverse", "glue", "sjlabelled", "labelled",  # Data wrangling
  "survey", "srvyr", "surveytable",               # Survey analysis
  "DT", "flextable",                              # Tables and visualization
  "quarto", "officer", "openxlsx"                 # Export and reporting
))
```

If app.R is in your working directory, launch the application with:

```r
shiny::runApp("app.R")
```

Alternatively, open app.R in RStudio and click Run App.
Once launched, click the User Guide button for step-by-step
instructions. This requires that docs/user-guide.html exists in the
expected location.
The files below are required for full functionality:
├── app.R # Main Shiny application
├── report.qmd # Quarto template for Word report generation
├── docs/
│ └── user-guide.html # HTML user guide for the in-app modal
└── README.md
Depending on your repository layout, app.R may be in a subfolder such as shiny/.
SurveyLand was piloted using data from the NCHS Research and Development Survey (RANDS), a survey that employs a complex design incorporating both probability and non-probability samples.
Authors
- Kristen Cibelli Hibben, CDC/NCHS/DRM/CCQDER (kcibelli@cdc.gov)
- Sarah Forrest, CDC/NCHS/DRM/OD (sforrest@cdc.gov)
- Paul Scanlon (former CDC/NCHS/DRM/CCQDER)
- Zachary Smith (former CDC/NCHS/DRM/CCQDER)
Acknowledgements
- CDC Data Science Upskilling (DSU) Program
- Alex Strashny, CDC/NCHS/DHCS/OD (author of surveytable)
If you use SurveyLand in your research, please include the following statement in the Methods section of your article or report:
Data analyses were performed using the R Shiny application “SurveyLand.”
Please cite this software as follows:
Cibelli Hibben K, Forrest S, Scanlon P, Smith Z (2026). SurveyLand: An R Shiny application to streamline the analysis and reporting of complex survey data. National Center for Health Statistics. https://github.com/CDCgov/SurveyLand.