Analysis Pipeline for "Optimizing Single-Cell Long-Read Sequencing in Pancreatic Islets"

This repository contains the full analysis pipeline and associated scripts for the paper:

Optimizing Single-Cell Long-Read Sequencing for Enhanced Isoform Detection in Pancreatic Islets
Maria S. Hansen, Christopher J. Hill, Lori Sussel, Kristen L. Wells
bioRxiv, 2025.04.30.651101
https://doi.org/10.1101/2025.04.30.651101

Overview

This study uses Nanopore single-cell long-read RNA sequencing of mouse pancreatic islets, with cells captured using 10x Genomics technology. The goal was to evaluate whether 5′ capture, targeted insulin depletion, and extended reverse transcription improve read length and isoform detection in these libraries. Differential transcript usage in the resulting data is assessed with the DTUrtle R package.

Please contact kristen.wells-wrasman@cuanschutz.edu with any questions.

What This Repository Does

This repository provides a complete pipeline for analyzing Nanopore long-read single-cell RNA-seq data. Specifically, it:

Processes Nanopore single-cell long-read sequencing data.
Performs quality control, alignment, and cell barcode quantification.
Conducts dimensionality reduction and read length analysis.
Analyzes differential transcript usage (DTU) to assess isoform expression.
Processes sequencing data from 10x Genomics 3′ and 5′ single-cell platforms to analyze read start site bias.
Processes bulk RNA-seq data for the insulin depletion analysis.
Generates all plots featured in the manuscript.
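
To make the idea of differential transcript usage concrete, the following is a minimal Python sketch that computes how much each transcript contributes to its gene's total counts in two cell groups. It is not the DTUrtle implementation (DTUrtle is an R package and additionally performs statistical testing); the gene and transcript names, the counts, and the function name `transcript_proportions` are made up for illustration.

```python
from collections import defaultdict

def transcript_proportions(counts, tx2gene):
    """Return each transcript's share of its gene's total counts.

    counts:  dict mapping transcript ID -> summed read count in one cell group
    tx2gene: dict mapping transcript ID -> gene ID
    """
    gene_totals = defaultdict(float)
    for tx, n in counts.items():
        gene_totals[tx2gene[tx]] += n
    return {
        tx: (n / gene_totals[tx2gene[tx]] if gene_totals[tx2gene[tx]] > 0 else 0.0)
        for tx, n in counts.items()
    }

# Hypothetical example: one gene with two isoforms, counts pooled per cell type.
tx2gene = {"GeneA-201": "GeneA", "GeneA-202": "GeneA"}
alpha_counts = {"GeneA-201": 80, "GeneA-202": 20}
beta_counts = {"GeneA-201": 30, "GeneA-202": 70}

alpha_props = transcript_proportions(alpha_counts, tx2gene)
beta_props = transcript_proportions(beta_counts, tx2gene)

# Difference in usage per transcript (beta minus alpha).
usage_shift = {tx: beta_props[tx] - alpha_props[tx] for tx in tx2gene}
print(usage_shift)
```

A shift in these proportions between groups is what a DTU analysis looks for; whether a given shift is significant depends on a statistical model, which this sketch leaves out.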

Repository Structure

`single_cell_workflow/`

A three-stage pipeline for processing single-cell long-read data:

01_wf_single_cell/
Contains configuration files and scripts for running EPI2ME's wf-single-cell pipeline using Nextflow. This workflow includes steps for quality control, alignment, cell quantification, and UMAP plotting.
02_postprocessing/
Contains a Snakemake pipeline for post-processing the single-cell RNA-seq data. It takes the output of the wf-single-cell run from Step 1 and generates barcode and read length analysis data.
03_plotting/
Contains scripts used to generate all figures in the manuscript, including UMAPs, gene and transcript identification plots, read length distributions, isoform usage plots, and insulin-depletion plots for single-cell long-read libraries.
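
As an illustration of the kind of read length analysis mentioned above, here is a small Python sketch that collects read lengths from a FASTQ stream and summarizes them with the median and N50. It assumes unwrapped four-line FASTQ records and is not taken from the repository's Snakemake pipeline, which may derive read lengths differently (for example from aligned BAM files). The function names and the example reads are hypothetical.

```python
import io
import statistics

def read_lengths(fastq_handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of every record holds the sequence
            yield len(line.strip())

def n50(lengths):
    """Largest length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # no reads supplied

# Tiny in-memory FASTQ standing in for a real file (sequences are made up).
example = io.StringIO(
    "@read1\nACGTACGTAC\n+\nIIIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nACGTAC\n+\nIIIIII\n"
)
lengths = list(read_lengths(example))
print(lengths, statistics.median(lengths), n50(lengths))
```

For a real library, the same functions could be applied to a file opened with `open(...)` or `gzip.open(..., "rt")`.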

`insulin_depletion/`

This directory contains a Snakemake workflow and R scripts for the bulk RNA-seq analysis of the insulin depletion experiment.

`ngs_plots/`

Scripts for generating plots that compare read start site distributions between 3′ and 5′ single-cell RNA-seq datasets, using short-read next-generation sequencing (NGS) data, to investigate differences in internal priming.
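
One way to think about read start site bias is to express each read's 5′ end as a fraction of the way along its transcript and then bin those fractions. The Python sketch below illustrates that idea under simplifying assumptions: coordinates are 0-based half-open genomic intervals, introns are ignored, and reads are assumed to be sense-stranded with respect to the transcript. It is not the method used by the scripts in this directory, and the function names and coordinates are hypothetical.

```python
def relative_start(read_start, read_end, strand, tx_start, tx_end):
    """Fractional position of a read's 5' end along a transcript (0 = 5' end, 1 = 3' end).

    strand is the transcript strand; introns are ignored for simplicity.
    """
    tx_len = tx_end - tx_start
    if tx_len <= 0:
        raise ValueError("transcript must have positive length")
    if strand == "+":
        offset = read_start - tx_start
    elif strand == "-":
        # On the minus strand the read's 5' end is its highest coordinate.
        offset = tx_end - read_end
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp reads that overhang the annotated transcript.
    return min(max(offset / tx_len, 0.0), 1.0)

def bin_fractions(fractions, n_bins=10):
    """Count fractional positions in equal-width bins across the transcript."""
    counts = [0] * n_bins
    for f in fractions:
        counts[min(int(f * n_bins), n_bins - 1)] += 1
    return counts

# Hypothetical reads on a 1,000 bp plus-strand transcript spanning [5000, 6000).
reads = [(5000, 5090), (5050, 5140), (5900, 5990)]
fracs = [relative_start(s, e, "+", 5000, 6000) for s, e in reads]
print(bin_fractions(fracs, n_bins=4))
```

Comparing such binned profiles between 3′ and 5′ libraries would show where along transcripts reads tend to begin in each protocol.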

Outputs

`single_cell_workflow/`

01_wf_single_cell/
Output: A matrix that can be loaded into Seurat.
02_postprocessing/
Output: Seurat objects, doublet identification plots, PCA plots, UMAPs, a clustering tree, and histograms and percent-tagged plots based on read length data.
03_plotting/
Output: The majority of figures in the manuscript, including UMAPs, read length distributions, isoform usage plots, and insulin-depletion plots for single-cell long-read libraries.

`insulin_depletion/`

Output: Volcano plot displaying differentially expressed genes between insulin-depleted and non-depleted pancreatic islet samples from bulk RNA-seq data.
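
For readers unfamiliar with volcano plots, the Python sketch below shows how each gene is typically placed on one (log2 fold change on the x-axis, negative log10 adjusted p-value on the y-axis) and labeled as up, down, or not significant. The cutoffs (`fc_cutoff=1.0`, `padj_cutoff=0.05`), the sign convention (positive meaning higher in insulin-depleted samples), the gene names, and the values are all assumptions for illustration and are not taken from the repository's R scripts or the manuscript.

```python
import math

def classify_gene(log2_fc, padj, fc_cutoff=1.0, padj_cutoff=0.05):
    """Label a gene by direction of change, using assumed cutoffs."""
    if padj is None or padj >= padj_cutoff or abs(log2_fc) < fc_cutoff:
        return "not significant"
    return "up" if log2_fc > 0 else "down"

def volcano_point(log2_fc, padj):
    """Return the (x, y) coordinates of a gene on a volcano plot."""
    return log2_fc, -math.log10(padj)

# Made-up differential expression results: (gene, log2 fold change, adjusted p-value).
results = [
    ("GeneA", -4.0, 1e-30),
    ("GeneB", 0.3, 0.6),
    ("GeneC", 2.5, 0.001),
]

labels = {gene: classify_gene(fc, p) for gene, fc, p in results}
points = {gene: volcano_point(fc, p) for gene, fc, p in results}
print(labels)
```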

`ngs_plots/`

Output: NGS plot showing read start site bias between 3′ and 5′ libraries.

How to Run

Each subdirectory includes its own README with setup instructions, environment configuration, and execution steps.

Example Outputs

Differential Transcript Usage (DTU) between alpha and beta cells

Read length distributions between published datasets


Data Availability

Sequencing data available at: [link]

Prerequisites

Dockerfiles for building Singularity images are located within each subdirectory.
