This repository provides a very small example of running a Snakemake workflow locally that calls Python and R code to generate a simple visualization.
The workflow:
- Python: generate a mock gene expression count table (TSV).
- R + ggpubr: visualize the mock counts as a boxplot with a p-value.
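For orientation, the Python step could look roughly like the sketch below. It is not the repository's actual scripts/generate_mock_counts.py: the column names (gene, sample, group, count), the two group labels, and the value ranges are assumptions chosen for illustration. Only the standard library is used.

```python
import csv
import os
import random


def make_mock_counts(n_genes=5, n_samples_per_group=3, seed=0):
    """Return long-format rows, one dict per (gene, sample) pair.

    All names and value ranges are hypothetical, not taken from the repo.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same table
    rows = []
    for g in range(1, n_genes + 1):
        for group, mean in (("control", 100), ("treated", 150)):
            for s in range(1, n_samples_per_group + 1):
                # Draw a noisy count and clamp it at zero (counts cannot be negative).
                count = max(0, round(rng.gauss(mean, 20)))
                rows.append(
                    {
                        "gene": f"gene{g}",
                        "sample": f"{group}_{s}",
                        "group": group,
                        "count": count,
                    }
                )
    return rows


def write_tsv(rows, path):
    """Write the rows to a tab-separated file with a header line."""
    # Create the parent directory (for example results/) if it does not exist yet.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=["gene", "sample", "group", "count"],
            delimiter="\t",
        )
        writer.writeheader()
        writer.writerows(rows)
```

Under these assumptions, calling write_tsv(make_mock_counts(), "results/mock_counts.tsv") would produce a table in the location the workflow expects. A long format with a group column is one convenient shape for a two-group boxplot, but the real script may lay out its table differently.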
This example is intended for teaching and local experimentation. It is kept intentionally simple so that you can focus on the Snakemake concepts:
- how rules declare input and output files,
- how Snakemake tracks dependencies, and
- how to integrate small Python and R scripts.
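The first two points can be illustrated with a toy model in plain Python. This is not Snakemake's implementation or API: the rule names (generate_counts, plot_counts) and the RULES table are hypothetical, and only the file paths come from this example. The idea it sketches is that a rule listing a file as input depends on whichever rule lists that file as output, so walking backwards from the requested target yields an execution order. The sketch does no cycle detection.

```python
# Hypothetical rule table: rule name -> declared input and output files.
RULES = {
    "generate_counts": {
        "input": [],
        "output": ["results/mock_counts.tsv"],
    },
    "plot_counts": {
        "input": ["results/mock_counts.tsv"],
        "output": ["results/mock_counts_boxplot.png"],
    },
}


def producer_of(path, rules):
    """Return the name of the rule that declares `path` as an output, or None."""
    for name, rule in rules.items():
        if path in rule["output"]:
            return name
    return None


def execution_order(target, rules):
    """Walk backwards from `target`; return rule names in the order they must run."""
    order = []

    def visit(path):
        name = producer_of(path, rules)
        if name is None or name in order:
            return  # a source file with no producing rule, or a rule already scheduled
        for dep in rules[name]["input"]:
            visit(dep)  # schedule upstream rules first
        order.append(name)

    visit(target)
    return order


print(execution_order("results/mock_counts_boxplot.png", RULES))
# prints ['generate_counts', 'plot_counts']
```

Requesting only results/mock_counts.tsv in this model schedules generate_counts alone, which mirrors how asking for an intermediate file runs just the steps needed to produce it.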
You need:
- Python (3.9+ recommended),
- R (any modern version),
- Snakemake installed in some environment, and
- the R package ggpubr (plus its dependencies such as ggplot2) for the visualization rule.
This repo includes an environment.yml that installs Snakemake, base R,
and r-ggpubr for visualization:

    mamba env create -f environment.yml
    mamba activate snakemake-example

If you use conda instead of mamba, replace mamba with conda.
If you do not already have mamba installed, see the
Mamba installation guide,
then return to the Install & Setup section here.
From the top level of this repository:

    snakemake -j 1

Snakemake will:
- Use Python to generate a mock counts table in results/mock_counts.tsv.
- Use R and ggpubr to generate a mock counts boxplot in results/mock_counts_boxplot.png.
You can re-run the command; Snakemake only recomputes steps whose outputs are missing or whose inputs have changed.
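One way to picture this behaviour is a timestamp comparison: a step is stale when an output is missing or older than one of its inputs. The sketch below models only that check. needs_rerun is a hypothetical helper, not a Snakemake function, and Snakemake's actual rerun criteria may cover more than modification times depending on version and settings.

```python
import os


def needs_rerun(inputs, outputs):
    """Return True if any output is missing or older than the newest input.

    Assumes every path in `inputs` exists; a missing input raises an error.
    """
    if not all(os.path.exists(p) for p in outputs):
        return True  # at least one output has never been produced
    if not inputs or not outputs:
        return False  # nothing to compare, so nothing can be out of date
    newest_input = max(os.path.getmtime(p) for p in inputs)
    oldest_output = min(os.path.getmtime(p) for p in outputs)
    return newest_input > oldest_output
```

In this simplified model, modifying results/mock_counts.tsv after the boxplot was drawn would mark the plotting step as stale, while an untouched results/ directory would leave nothing to do.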

    $ tree -a snakemake-example/
    snakemake-example/
    ├── README.md
    ├── Snakefile
    ├── environment.yml
    ├── requirements.txt
    ├── results
    │   ├── mock_counts.tsv           # created by Snakemake (Python step)
    │   └── mock_counts_boxplot.png   # created by Snakemake (R + ggpubr step)
    └── scripts
        ├── generate_mock_counts.py
        └── visualize_counts.R

The files under results/ are created by Snakemake and can be deleted and regenerated at any time.