| title | output |
|---|---|
| README | github_document |
ComBatSuite is built upon the original sva package found on Bioconductor. It extends the available ComBat tools to data types beyond sequencing data, including methylation data, single-cell data, and microbiome data.
The ComBatSuite package contains functions for removing batch effects and
other unwanted variation in high-throughput experiments. Specifically, the
ComBatSuite package contains functions for identifying and building surrogate
variables for high-dimensional data sets. Surrogate variables are covariates
constructed directly from high-dimensional data (like gene expression/RNA
sequencing/methylation/brain imaging data) that can be used in subsequent
analyses to adjust for unknown, unmodeled, or latent sources of noise.
The ComBatSuite package can be used to remove artifacts in two ways:
- Identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments
- Directly removing known batch effects using ComBat [@johnson:2007aa] and its various versions
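The two approaches above can be sketched on simulated data. This is a minimal illustration, not a recommended workflow: it loads the Bioconductor sva package, whose `sva()` and `ComBat()` functions ComBatSuite is assumed to keep unchanged, and the data, batch labels, and group labels are all made up for the example.

```r
# Assumption: ComBatSuite keeps the sva() and ComBat() interfaces of the
# original sva package; if so, library(ComBatSuite) can be swapped in here.
library(sva)

set.seed(1)

# Hypothetical data: 1000 features x 12 samples, two batches of six samples,
# with two biological groups balanced across the batches.
n_features <- 1000
batch <- rep(c(1, 2), each = 6)
group <- factor(rep(c("control", "treated"), times = 6))
dat <- matrix(rnorm(n_features * 12), nrow = n_features)
dat[, batch == 2] <- dat[, batch == 2] + 1.5  # additive shift in batch 2

# Full model (variable of interest) and null model (intercept only).
mod  <- model.matrix(~ group)
mod0 <- model.matrix(~ 1, data = data.frame(group))

# Way 1: estimate surrogate variables for unknown sources of variation.
# svobj$sv holds the estimated surrogate variables, if any were found;
# they can be added as covariates in downstream models.
svobj <- sva(dat, mod, mod0)

# Way 2: remove the known batch effect directly with ComBat, passing the
# model matrix so the group effect is preserved.
adjusted <- ComBat(dat = dat, batch = batch, mod = mod)
```

In the first approach the batch labels are never passed to `sva()`, while in the second they are supplied explicitly, which is the practical difference between handling unknown and known sources of variation.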
To install the most up-to-date version of ComBatSuite, please install directly from GitHub. You will need the devtools package. You can install both with the following commands:
```r
if (!require("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
library(devtools)
install_github("wejlab/ComBatSuite")
```
The ComBatSuite package includes multiple different methods created by different faculty and students, including the original sva package. It would really help them out if you would cite their work when you use this software.
To cite the overall ComBatSuite package, please cite the original sva package and the updated ComBatSuite package:
- Leek JT, Johnson WE, Parker HS, Jaffe AE, and Storey JD. (2012) The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics DOI:10.1093/bioinformatics/bts034
When using specific methods, please cite those methods as well:
For sva please cite:
- Leek JT and Storey JD. (2008) A general framework for multiple testing dependence. Proceedings of the National Academy of Sciences, 105: 18718-18723.
- Leek JT and Storey JD. (2007) Capturing heterogeneity in gene expression studies by 'Surrogate Variable Analysis'. PLoS Genetics, 3: e161.
For ComBat please cite:
- Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics, 8 (1), 118-127
For mean-only or reference-batch ComBat please cite:
- Zhang Y, Jenkins DF, Manimaran S, Johnson WE (2018) Alternative empirical Bayes models for adjusting for batch effects in genomic studies. BMC Bioinformatics, 19 (1), 262.
For ComBat-seq please cite:
- Zhang Y, Parmigiani G, Johnson WE (2020) ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genomics and Bioinformatics, 2 (3), lqaa078. https://doi.org/10.1093/nargab/lqaa078
For svaseq please cite:
- Leek JT (2014) svaseq: removing batch and other artifacts from count-based sequencing data. Nucleic Acids Res. doi: 10.1093/nar/gku864.
For supervised sva please cite:
- Leek JT (2014) svaseq: removing batch and other artifacts from count-based sequencing data. Nucleic Acids Res. doi: 10.1093/nar/gku864.
- Gagnon-Bartsch JA, Speed TP (2012) Using control genes to correct for unwanted variation in microarray data. Biostatistics 13:539-52.
For fsva please cite:
- Parker HS, Bravo HC, Leek JT (2013) Removing batch effects for prediction problems with frozen surrogate variable analysis. arXiv:1301.3947
For psva please cite:
- Parker HS, Leek JT, Favorov AV, Considine M, Xia X, Chavan S, Chung CH, Fertig EJ (2014) Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction. Bioinformatics. doi: 10.1093/bioinformatics/btu375
For ComBat-met please cite:
- Wang J (2025) ComBat-met: adjusting batch effects in DNA methylation data. NAR Genomics and Bioinformatics, 7 (2), lqaf062. doi: 10.1093/nargab/lqaf062