cmuts is a suite of programs for counting mutations and computing reactivity profiles from chemical probing data. It features:
- Fast, compiled C++ code with native multithreading support
- Streamed IO and direct output to compressed HDF5 files
- Mutation-informed handling of ambiguous deletions
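
To make the last point concrete: a deletion inside a repeat can be placed at several positions that all yield the same read, so the position reported by the aligner is arbitrary. The sketch below is not cmuts's algorithm, which this README does not describe. It uses a hypothetical helper, `ambiguous_deletion_window`, to show only how far a deletion can slide along the reference without changing the resulting sequence.

```python
def ambiguous_deletion_window(reference: str, start: int, length: int) -> tuple[int, int]:
    """Return the (leftmost, rightmost) 0-based start positions at which
    deleting `length` bases gives the same sequence as deleting
    reference[start:start + length].

    Hypothetical helper for illustration; not part of the cmuts API.
    """
    if length <= 0 or start < 0 or start + length > len(reference):
        raise ValueError("deletion must lie within the reference")

    # Sliding left by one base is equivalent when the base that enters the
    # deleted span on the left matches the base that leaves it on the right.
    left = start
    while left > 0 and reference[left - 1] == reference[left + length - 1]:
        left -= 1

    # Sliding right by one base is equivalent under the mirrored condition.
    right = start
    while right + length < len(reference) and reference[right] == reference[right + length]:
        right += 1

    return left, right


def delete(reference: str, start: int, length: int) -> str:
    """Apply a deletion to the reference (used to check equivalence)."""
    return reference[:start] + reference[start + length:]


if __name__ == "__main__":
    ref = "GCAAAAUG"
    lo, hi = ambiguous_deletion_window(ref, start=3, length=1)
    print(f"Equivalent deletion starts: {lo}..{hi}")
    # Every placement in the window produces the same read.
    print({delete(ref, s, 1) for s in range(lo, hi + 1)})
```

Any rule for assigning the deletion to one position in this window (for example, using nearby mutations as evidence) is a separate design choice that this sketch does not cover.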
Detailed documentation can be found at hmblair.github.io/cmuts. You can also try cmuts directly in the browser via the Hugging Face Space.
# Clone the repository
git clone https://github.com/hmblair/cmuts.git
cd cmuts
# Install system dependencies (Ubuntu/Debian)
sudo apt-get install libhdf5-dev libhts-dev zlib1g-dev cmake autoconf samtools
# Optional: OpenMP for multithreaded pairwise counting
sudo apt-get install libomp-dev
# Build and install
./configure && pip install -e ".[dev]"

For macOS with Homebrew:
# Required
brew install hdf5 htslib zlib cmake autoconf automake libtool samtools
# Optional: OpenMP for multithreaded pairwise counting
brew install libomp
# Build and install
./configure && pip install -e ".[dev]"

# Count mutations from aligned BAM files
cmuts-core -b aligned.bam -f reference.fasta -o counts.h5
# Normalize counts to reactivity profiles
cmuts-normalize counts.h5 reference.fasta -o reactivity.h5
# Visualize results
cmuts-plot reactivity.h5 --output figure.png

import cmuts
import h5py
# Load reactivity data
with h5py.File("reactivity.h5", "r") as f:
    data = cmuts.ProbingData.load("combined", f)
# Access reactivity values
print(f"Mean reactivity: {data.reactivity.mean():.3f}")
print(f"Signal-to-noise: {data.snr.mean():.2f}")

- High Performance: C++ core with OpenMP parallelization processes millions of reads efficiently
- MPI Support: Distribute workloads across compute nodes for large-scale analyses
- Streaming I/O: Process BAM/CRAM files without loading entire datasets into memory
- Flexible Normalization: Multiple normalization schemes (raw, percentile, outlier-based)
- Pairwise Analysis: Compute mutation correlations and mutual information matrices
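
As an illustration of the percentile scheme named in the list above, here is a minimal sketch that assumes normalization means dividing raw mutation rates by their q-th percentile. The function names, the default of 90, and the handling of NaN positions are assumptions made for this example; they are not taken from cmuts-normalize and may differ from what it does.

```python
import math


def percentile(values: list[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    if not values:
        raise ValueError("no values to compute a percentile from")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must be in [0, 100]")
    data = sorted(values)
    rank = (len(data) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return data[lo] + (data[hi] - data[lo]) * (rank - lo)


def percentile_normalize(raw: list[float], q: float = 90.0) -> list[float]:
    """Scale raw rates so that the q-th percentile maps to 1.0.

    Assumption: NaN marks positions without data; they are ignored when
    computing the scale and stay NaN in the output.
    """
    finite = [x for x in raw if not math.isnan(x)]
    scale = percentile(finite, q)
    if scale <= 0.0:
        raise ValueError("percentile is not positive; cannot normalize")
    return [x / scale for x in raw]


if __name__ == "__main__":
    raw_rates = [0.0, 0.01, 0.02, 0.03, 0.04, float("nan")]
    print(percentile_normalize(raw_rates, q=50.0))
```

Scaling by a high percentile rather than the maximum keeps a single extreme position from compressing the rest of the profile, which is the usual motivation for percentile and outlier-based schemes.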
cmuts/
├── src/cpp/ # C++ core (mutation counting, BAM/CRAM parsing)
├── src/python/cmuts/ # Python package (normalization, visualization)
├── tests/ # Test suites
└── docs/ # Documentation
See docs/architecture.md for detailed architecture documentation.
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/python -v
# Run linting
ruff check src/python/cmuts
mypy src/python/cmuts
# Install pre-commit hooks
pre-commit install

MIT License - see LICENSE for details.
