sRNAtlas

Comprehensive Small RNA-seq Analysis Platform

Features • Installation • Quick Start • Documentation • Citation

About

sRNAtlas is an open-source bioinformatics platform designed to democratize small RNA sequencing analysis. It provides researchers—regardless of computational expertise—with a complete, reproducible pipeline for analyzing miRNAs, siRNAs, piRNAs, and other small non-coding RNAs.

The platform integrates established bioinformatics tools (Bowtie, Samtools, Cutadapt, pyDESeq2) into a cohesive workflow, accessible through an intuitive web interface built with Streamlit. From raw FASTQ files to differential expression results and pathway enrichment, sRNAtlas handles the entire analysis pipeline while maintaining full transparency and reproducibility.

Key principles:

Accessibility: No command-line expertise required
Transparency: All parameters visible and adjustable
Reproducibility: Consistent, version-controlled analysis
Open source: Free to use, modify, and distribute under AGPL-3.0-or-later

Overview

sRNAtlas is a Streamlit application for small RNA sequencing (sRNA-seq) data analysis. It takes raw sequencing data through quality control, adapter trimming, alignment, quantification, differential expression analysis, and functional enrichment, all from a web interface that requires no command-line expertise.

Why sRNAtlas?

🎯 Purpose-built for small RNA: Optimized parameters for miRNA, siRNA, piRNA analysis
🖥️ No coding required: Intuitive web interface for all analysis steps
🔬 Complete pipeline: From raw FASTQ to publication-ready figures
📊 Interactive visualizations: Explore your data with dynamic plots
💾 Project management: Save, load, and share analysis sessions
🧪 Reproducible: Consistent results with version-controlled parameters

Features

📋 Analysis Pipeline

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Raw       │───▶│   Quality   │───▶│   Adapter   │───▶│  Reference  │
│   FASTQ     │    │   Control   │    │   Trimming  │    │  Database   │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                                                                │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐           │
│  Functional │◀───│     DE      │◀───│    Read     │◀──────────┘
│  Enrichment │    │  Analysis   │    │  Counting   │
└─────────────┘    └─────────────┘    └─────────────┘

Core Modules

Module	Description	Key Features
📁 Project	Organize your analysis	Sample management, metadata, save/load projects
📊 Quality Control	Assess read quality	Size distribution, quality scores, contamination check
✂️ Trimming	Remove adapters	Cutadapt integration, preset adapters, length filtering
🗄️ Databases	Reference management	miRBase, RNAcentral, custom FASTA, index building
🔗 Alignment	Map reads	Bowtie optimization for small RNA, multi-mapper handling
🔬 Post-Align QC	Alignment quality	Mapping stats, strand bias, 5' nucleotide analysis
📈 Counting	Quantification	Feature counting, count matrix generation
🧬 DE Analysis	Differential expression	pyDESeq2, volcano plots, heatmaps, PCA
🔍 Novel miRNA	Discovery	Identify unannotated small RNAs
🧫 isomiR	Variant analysis	Detect isoforms, differential usage, arm switching
🎯 Targets	Target prediction	psRNATarget (plants), miRanda (animals)
🧬 GO/Pathway	Enrichment	Gene Ontology, KEGG pathway analysis
⚡ Batch	Automation	Full pipeline batch processing
📋 Reports	Export	HTML reports, figure export

Supported RNA Types

RNA Type	Size Range	Characteristics
miRNA	18-25 nt	Gene expression regulators, 5' U bias
siRNA	20-24 nt	RNAi pathway, perfect complementarity
piRNA	24-32 nt	Transposon silencing, germline
tRF/tsRNA	14-40 nt	tRNA-derived fragments, stress response
rsRF	15-40 nt	rRNA-derived fragments
snoRNA	60-300 nt	Small nucleolar RNA, rRNA modification
snRNA	100-300 nt	Small nuclear RNA, splicing
Y RNA	80-120 nt	DNA replication, RNA quality control

Plant-specific: tasiRNA, phasiRNA, natsiRNA, hc-siRNA
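
The size ranges in the table overlap, so read length alone narrows down the candidate classes rather than assigning one. A minimal Python sketch of that lookup, using the ranges listed above; `SIZE_RANGES` and `candidate_classes` are illustrative names and not part of sRNAtlas:

```python
# Size ranges (nt) copied from the table above.
# SIZE_RANGES and candidate_classes are illustrative names, not sRNAtlas APIs.
SIZE_RANGES = {
    "miRNA": (18, 25),
    "siRNA": (20, 24),
    "piRNA": (24, 32),
    "tRF/tsRNA": (14, 40),
    "rsRF": (15, 40),
    "snoRNA": (60, 300),
    "snRNA": (100, 300),
    "Y RNA": (80, 120),
}


def candidate_classes(read_length: int) -> list[str]:
    """Return every RNA class whose size range contains read_length."""
    return [
        name
        for name, (low, high) in SIZE_RANGES.items()
        if low <= read_length <= high
    ]


print(candidate_classes(22))  # ['miRNA', 'siRNA', 'tRF/tsRNA', 'rsRF']
print(candidate_classes(28))  # ['piRNA', 'tRF/tsRNA', 'rsRF']
```

Because a 22 nt read matches four classes, annotation against a reference database is still needed to tell them apart.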

Advanced Features

🔬 Novel miRNA Discovery: Identify unannotated small RNAs from unaligned reads
🧫 isomiR Analysis: Detect 5'/3' variants, SNPs, and non-templated additions
↔️ Arm Switching Detection: Identify 5p/3p dominance changes between conditions
🔄 Differential isomiR Usage: Compare isomiR ratios across experimental groups
📊 Multi-group Comparison: ANOVA for >2 conditions with pairwise comparisons
🔥 Cluster Analysis: Hierarchical clustering with interactive heatmaps
📈 Interactive Plots: Zoom, pan, and export publication-ready figures
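
Arm switching means that the dominant arm of a miRNA (5p or 3p) differs between conditions. The idea can be sketched in Python as a comparison of 5p fractions. This is only an illustration under simple assumptions (a 0.5 dominance cutoff, no statistical test, hypothetical function names) and may not match how sRNAtlas implements it:

```python
def five_prime_fraction(count_5p: float, count_3p: float) -> float:
    """Fraction of a miRNA's reads that come from the 5p arm."""
    total = count_5p + count_3p
    if total == 0:
        raise ValueError("no reads on either arm")
    return count_5p / total


def arm_switched(control: tuple[float, float], treatment: tuple[float, float]) -> bool:
    """True when the dominant arm differs between the two conditions.

    Each argument is a (5p count, 3p count) pair, for example the mean
    normalized counts of both arms within one condition.
    """
    control_5p_dominant = five_prime_fraction(*control) > 0.5
    treatment_5p_dominant = five_prime_fraction(*treatment) > 0.5
    return control_5p_dominant != treatment_5p_dominant


print(arm_switched((900, 100), (200, 800)))  # True
print(arm_switched((900, 100), (700, 300)))  # False
```

A real analysis would also require a minimum read count and a significance test before reporting a switch.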

New in v1.4.0

⚡ Performance Caching: Streamlit caching for faster repeat analyses
🎯 QC Scorecard: Traffic-light quality assessment with outlier detection
🔢 Multi-mapper Modes: Unique, fractional, and primary alignment counting
🐍 miRanda Integration: Animal miRNA target prediction
📋 Provenance Tracking: YAML/JSON export for full reproducibility
🔍 Multi-sample QC Overlays: PCA-based sample clustering and outlier detection
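
The three multi-mapper modes differ in how a read that aligns to several features contributes to the counts. The Python sketch below shows one plausible reading of them: unique drops multi-mapped reads, fractional splits each read evenly, and primary credits only the first alignment. The function name, the input layout, and these semantics are assumptions for illustration, not the sRNAtlas implementation:

```python
from collections import defaultdict


def count_reads(alignments: dict[str, list[str]], mode: str = "fractional") -> dict[str, float]:
    """Count reads per feature under one multi-mapper mode.

    alignments maps a read ID to the features it aligned to, with the
    primary alignment listed first (an assumed layout).
    """
    counts: dict[str, float] = defaultdict(float)
    for features in alignments.values():
        if not features:
            continue
        if mode == "unique":
            # Keep only reads with exactly one alignment.
            if len(features) == 1:
                counts[features[0]] += 1.0
        elif mode == "fractional":
            # Split one read evenly across all of its alignments.
            for feature in features:
                counts[feature] += 1.0 / len(features)
        elif mode == "primary":
            # Credit the whole read to its primary alignment.
            counts[features[0]] += 1.0
        else:
            raise ValueError(f"unknown mode: {mode}")
    return dict(counts)


reads = {
    "read1": ["hsa-miR-21-5p"],
    "read2": ["hsa-let-7a-5p", "hsa-let-7b-5p"],
    "read3": ["hsa-let-7a-5p"],
}
print(count_reads(reads, "unique"))
print(count_reads(reads, "fractional"))
print(count_reads(reads, "primary"))
```

Fractional counting preserves the total number of reads, which unique counting does not.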

Installation

Prerequisites

Python 3.9 or higher
8+ GB RAM recommended
External tools: Bowtie, Samtools, Cutadapt

Option 1: Conda (Recommended)

# Clone the repository
git clone https://github.com/osman12345/sRNAtlas.git
cd sRNAtlas

# Create conda environment
conda create -n srnatlas python=3.10 -y
conda activate srnatlas

# Install bioinformatics tools
conda install -c conda-forge -c bioconda bowtie samtools cutadapt -y

# Install Python dependencies
pip install -r requirements.txt

Option 2: pip + Manual Tools

# Clone and setup
git clone https://github.com/osman12345/sRNAtlas.git
cd sRNAtlas

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# or: venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Install external tools separately
# See docs/INSTALLATION.md for platform-specific instructions

Option 3: Docker

# Build the image
docker build -t srnatlas .

# Run the container
docker run -p 8501:8501 -v "$(pwd)/data:/app/data" srnatlas

# Or use docker-compose
docker-compose up -d

Verify Installation

# Check Python packages
python -c "import streamlit; import pandas; import pysam; print('OK')"

# Check external tools
bowtie --version
samtools --version
cutadapt --version

# Run tests
python -m pytest
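
The external-tool check can also be scripted. A small Python sketch that reports which of the required tools are not on PATH; `missing_tools` is a hypothetical helper, not something shipped with sRNAtlas:

```python
import shutil

# External tools named in the prerequisites.
REQUIRED_TOOLS = ("bowtie", "samtools", "cutadapt")


def missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All external tools found")
```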

Quick Start

1. Launch the Application

cd sRNAtlas
streamlit run app/main.py

Open your browser to http://localhost:8501

2. Create a Project

Click Project in the sidebar
Enter project name and select organism
Upload your FASTQ files

3. Run the Pipeline

Step	Module	Action
1	Quality Control	Assess raw read quality
2	Trimming	Remove adapters (select preset)
3	Databases	Download miRBase + build index
4	Alignment	Map reads to reference
5	Post-Align QC	Verify alignment quality
6	Counting	Generate count matrix
7	DE Analysis	Compare conditions
8	GO/Pathway	Functional enrichment

4. Export Results

Download count matrices, DE results, and figures
Generate HTML reports
Save project for future analysis

📖 See the Quick Start Guide (docs/QUICK_START.md) for a detailed walkthrough.

Documentation

Document	Location	Description
Quick Start Guide	docs/QUICK_START.md	Get running in 10 minutes
User Guide	docs/USER_GUIDE.md	Complete documentation
Installation Guide	docs/INSTALLATION.md	Platform-specific setup

In-App Help

The application includes comprehensive documentation accessible via the Help module:

Quick start tutorials
Module reference
File format specifications
FAQ and troubleshooting

System Requirements

Hardware

Component	Minimum	Recommended
RAM	8 GB	16+ GB
Storage	20 GB	100+ GB
CPU	4 cores	8+ cores

Software

Dependency	Version	Purpose
Python	3.9+	Runtime
Bowtie	1.3+	Alignment (not Bowtie2)
Samtools	1.17+	BAM processing
Cutadapt	4.0+	Adapter trimming

Python Packages

streamlit>=1.28.0
streamlit-option-menu>=0.3.6
pandas>=2.0.0
numpy>=1.24.0
plotly>=5.18.0
pysam>=0.22.0
biopython>=1.81
scipy>=1.11.0
scikit-learn>=1.3.0
pydeseq2>=0.4.0

Input/Output Formats

Input Files

FASTQ (raw reads):

@SEQ_ID
TAGCTTATCAGACTGATGTTGA
+
IIIIIIIIIIIIIIIIIIIIII
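
A FASTQ record has four lines: a header starting with `@`, the sequence, a `+` separator, and a quality string with exactly one character per base. The Python sketch below parses uncompressed records, checks that rule, and tallies read lengths. The function names are illustrative, and real data is usually gzip-compressed and better handled by an established parser:

```python
from collections import Counter
from io import StringIO
from typing import Iterator, TextIO


def read_fastq(handle: TextIO) -> Iterator[tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from an uncompressed 4-line FASTQ."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return  # end of file (or trailing blank line)
        sequence = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip()
        if not header.startswith("@"):
            raise ValueError(f"bad header line: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"{header}: sequence and quality lengths differ")
        yield header[1:], sequence, quality


def length_distribution(handle: TextIO) -> Counter:
    """Count how many reads have each length."""
    return Counter(len(seq) for _, seq, _ in read_fastq(handle))


example = StringIO("@SEQ_ID\nTAGCTTATCAGACTGATGTTGA\n+\n" + "I" * 22 + "\n")
print(length_distribution(example))  # Counter({22: 1})
```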

Count Matrix (CSV):

,Sample1,Sample2,Sample3,Sample4
hsa-miR-21-5p,1500,1450,2800,2750
hsa-miR-155-5p,200,180,650,620

Sample Metadata (CSV):

sample,condition,batch
Sample1,control,1
Sample2,control,1
Sample3,treatment,2
Sample4,treatment,2
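
Sample names in the count matrix header are expected to match the `sample` column of the metadata file. A short Python sketch, using only the standard library, that checks this before an analysis is started; the helper names are hypothetical:

```python
import csv
from io import StringIO

COUNTS_CSV = """,Sample1,Sample2,Sample3,Sample4
hsa-miR-21-5p,1500,1450,2800,2750
hsa-miR-155-5p,200,180,650,620
"""

METADATA_CSV = """sample,condition,batch
Sample1,control,1
Sample2,control,1
Sample3,treatment,2
Sample4,treatment,2
"""


def count_matrix_samples(text: str) -> list[str]:
    """Sample names from the count matrix header (first column holds feature IDs)."""
    header = next(csv.reader(StringIO(text)))
    return header[1:]


def metadata_samples(text: str) -> list[str]:
    """Sample names from the 'sample' column of the metadata file."""
    return [row["sample"] for row in csv.DictReader(StringIO(text))]


def unmatched_samples(counts_text: str, metadata_text: str) -> set[str]:
    """Samples that appear in only one of the two files."""
    return set(count_matrix_samples(counts_text)) ^ set(metadata_samples(metadata_text))


print(unmatched_samples(COUNTS_CSV, METADATA_CSV))  # set()
```

An empty set means both files describe the same samples.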

Output Files

Output	Format	Description
Trimmed reads	FASTQ.gz	Adapter-free, filtered reads
Alignments	BAM/BAI	Sorted, indexed alignments
Count matrix	CSV	Read counts per feature
DE results	CSV	log2FC, p-value, FDR
Figures	PNG/HTML	Interactive visualizations
Reports	HTML	Comprehensive analysis summary

Configuration

Alignment Parameters (Bowtie)

Parameter	Default	Description
`-v`	1	Mismatches allowed (0-3)
`-k`	10	Report up to k alignments
`--best`	On	Report best alignments first
`--strata`	On	Only report best stratum
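
For reference, these defaults correspond to a Bowtie command line like the one assembled below. This Python sketch only builds the argument list; the file names, the thread count, and the `bowtie_command` helper are assumptions for illustration, and the command sRNAtlas actually runs may include other options:

```python
import shlex


def bowtie_command(
    index_prefix: str,
    reads: str,
    output_sam: str,
    mismatches: int = 1,
    max_alignments: int = 10,
    threads: int = 4,  # not a documented default; chosen for the example
) -> list[str]:
    """Build a Bowtie 1 argument list from the defaults in the table above."""
    if not 0 <= mismatches <= 3:
        raise ValueError("-v accepts 0 to 3 mismatches")
    return [
        "bowtie",
        "-v", str(mismatches),      # mismatches allowed
        "-k", str(max_alignments),  # report up to k alignments
        "--best", "--strata",       # best alignments first, best stratum only
        "-p", str(threads),
        "-S",                       # write SAM output
        "-x", index_prefix,         # index basename (Bowtie 1.3-style option)
        reads,
        output_sam,
    ]


cmd = bowtie_command("db/mirbase_index", "sample1.trimmed.fastq", "sample1.sam")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once Bowtie and an index are available.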

Trimming Parameters (Cutadapt)

Parameter	Default	Description
Min length	18 nt	Discard shorter reads
Max length	35 nt	Discard longer reads
Quality cutoff	20	Trim low-quality bases
Error rate	0.1	Adapter matching tolerance
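
The trimming defaults map onto standard Cutadapt options. The Python sketch below builds such a command; the adapter shown is the Illumina TruSeq small RNA 3' adapter, used here only as an example value, and the file names and `cutadapt_command` helper are hypothetical:

```python
import shlex


def cutadapt_command(
    adapter: str,
    reads: str,
    output: str,
    min_length: int = 18,
    max_length: int = 35,
    quality_cutoff: int = 20,
    error_rate: float = 0.1,
) -> list[str]:
    """Build a Cutadapt argument list from the defaults in the table above."""
    return [
        "cutadapt",
        "-a", adapter,              # 3' adapter sequence
        "-m", str(min_length),      # discard shorter reads
        "-M", str(max_length),      # discard longer reads
        "-q", str(quality_cutoff),  # trim low-quality 3' ends
        "-e", str(error_rate),      # adapter matching tolerance
        "-o", output,
        reads,
    ]


# Example adapter value; check the library kit for the correct sequence.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
trim_cmd = cutadapt_command(ADAPTER, "sample1.fastq.gz", "sample1.trimmed.fastq.gz")
print(shlex.join(trim_cmd))
```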

DE Analysis Parameters

Parameter	Default	Description
FDR threshold	0.05	Significance cutoff
log2FC threshold	0.585	~1.5-fold change
Min count	10	Filter low-count features
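
The log2FC default follows from log2(1.5) ≈ 0.585, so it applies to both 1.5-fold increases and 1.5-fold decreases. A minimal Python sketch of the significance call with these defaults; the function name, the strict/non-strict comparisons, and the example values are assumptions made for illustration:

```python
import math

FDR_THRESHOLD = 0.05
LOG2FC_THRESHOLD = 0.585

print(round(math.log2(1.5), 3))  # 0.585


def is_significant(
    log2fc: float,
    fdr: float,
    fdr_threshold: float = FDR_THRESHOLD,
    log2fc_threshold: float = LOG2FC_THRESHOLD,
) -> bool:
    """True when a feature passes both the FDR and the fold-change cutoff."""
    return fdr < fdr_threshold and abs(log2fc) >= log2fc_threshold


# Made-up values, for illustration only: (log2FC, FDR).
examples = {
    "feature_A": (1.20, 0.001),   # up, significant
    "feature_B": (-0.90, 0.030),  # down, significant
    "feature_C": (0.30, 0.001),   # change too small
    "feature_D": (2.00, 0.200),   # FDR too high
}
for name, (log2fc, fdr) in examples.items():
    print(name, is_significant(log2fc, fdr))
```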

Testing

sRNAtlas includes a comprehensive test suite:

# Run all tests
python -m pytest

# Run with verbose output
python -m pytest -v

# Run specific test file
python -m pytest tests/test_file_handlers.py

# Run with coverage
python -m pytest --cov=utils --cov=modules

Test Coverage

Module	Tests	Status
File handlers	12	✅ Passing
Progress tracker	5	✅ Passing
Cluster analysis	6	✅ Passing
Novel miRNA	8	✅ Passing
isomiR	19	✅ Passing
Total	50	✅ All Passing

Project Structure

sRNAtlas/
├── app/
│   └── main.py              # Application entry point
├── modules/
│   ├── qc_module.py         # Quality control
│   ├── trimming_module.py   # Adapter trimming
│   ├── alignment_module.py  # Read alignment
│   ├── counting_module.py   # Read counting
│   ├── de_module.py         # Differential expression
│   ├── novel_mirna_module.py # Novel miRNA discovery
│   ├── isomir_module.py     # isomiR analysis
│   └── ...
├── utils/
│   ├── file_handlers.py     # File I/O utilities
│   ├── plotting.py          # Visualization functions
│   ├── cluster_analysis.py  # Clustering utilities
│   ├── caching.py           # Streamlit caching helpers
│   ├── error_handling.py    # Error classification & diagnostics
│   ├── qc_scorecard.py      # QC scoring with thresholds
│   ├── provenance.py        # Reproducibility tracking
│   ├── miranda.py           # miRanda target prediction
│   └── ...
├── config/
│   └── settings.py          # Configuration
├── tests/
│   ├── conftest.py          # Test fixtures
│   └── test_*.py            # Test files
├── docs/
│   ├── QUICK_START.md
│   ├── USER_GUIDE.md
│   └── INSTALLATION.md
├── assets/
│   └── logo.svg             # Application logo
├── requirements.txt
└── README.md

Troubleshooting

Common Issues

Problem	Cause	Solution
0 reads after trimming	Wrong adapter	Try different preset or check library kit
0% alignment rate	Wrong reference	Verify organism, rebuild index
Memory error	Too many samples	Process in smaller batches
`bowtie: not found`	Not in PATH	Install via conda or add to PATH
`pysam` import error	Missing htslib	`conda install -c bioconda pysam`

Getting Help

Check the in-app Help module
Review the User Guide (docs/USER_GUIDE.md)
Search existing GitHub Issues
Open a new issue with:
- Error message
- Steps to reproduce
- System information

Contributing

Contributions are welcome! Please read the contributing guidelines (CONTRIBUTING.md) before submitting pull requests.

Development Setup

# Clone and setup
git clone https://github.com/osman12345/sRNAtlas.git
cd sRNAtlas

# Create dev environment
conda create -n srnatlas-dev python=3.10 -y
conda activate srnatlas-dev

# Install dev dependencies
pip install -r requirements.txt
pip install pytest pytest-cov black flake8

# Run tests
python -m pytest -v

Code Style

Follow PEP 8 guidelines
Use type hints where appropriate
Write docstrings for functions and classes
Add tests for new features

Roadmap

Citation

If you use sRNAtlas in your research, please cite:

@software{sRNAtlas,
  author = {Ayman Osman},
  title = {sRNAtlas: A Comprehensive Platform for Small RNA-seq Analysis},
  year = {2026},
  url = {https://github.com/osman12345/sRNAtlas},
  version = {v0.1.0}
}

License

This project is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).

See the LICENSE file for details, or read the full license at: https://www.gnu.org/licenses/agpl-3.0.html

Acknowledgments

Streamlit - Web application framework
Bowtie - Short read aligner
miRBase - miRNA database
RNAcentral - ncRNA database
pyDESeq2 - Differential expression

Built with ❤️ for the small RNA research community
