CUAnschutzBDC/sc-islet-longread-analysis

Analysis Pipeline for "Optimizing Single-Cell Long-Read Sequencing in Pancreatic Islets"

This repository contains the full analysis pipeline and associated scripts for the paper:

Optimizing Single-Cell Long-Read Sequencing for Enhanced Isoform Detection in Pancreatic Islets
Maria S. Hansen, Christopher J. Hill, Lori Sussel, Kristen L. Wells
bioRxiv, 2025.04.30.651101
https://doi.org/10.1101/2025.04.30.651101

Overview

This study uses Nanopore single-cell long-read RNA sequencing of mouse pancreatic islets, with cells captured using 10x Genomics technology. It evaluates whether 5′ capture, targeted insulin depletion, and extended reverse transcription improve read length and isoform detection in these libraries. Differential transcript usage in the resulting data is assessed with the DTUrtle R package.
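To illustrate what differential transcript usage means, the sketch below computes each transcript's share of its gene's reads in two cell groups and the shift between them. It is not DTUrtle's statistical method, only the underlying proportion idea; the function name `transcript_usage` and the counts are hypothetical and invented for illustration.

```python
from collections import defaultdict


def transcript_usage(counts):
    """Return each transcript's share of its gene's total counts.

    counts maps (gene, transcript) -> read count for one cell group.
    """
    gene_totals = defaultdict(float)
    for (gene, _transcript), n in counts.items():
        gene_totals[gene] += n
    return {
        (gene, transcript): (n / gene_totals[gene] if gene_totals[gene] > 0 else 0.0)
        for (gene, transcript), n in counts.items()
    }


# Hypothetical counts, not taken from the study.
alpha = {("GeneA", "tx1"): 80, ("GeneA", "tx2"): 20}
beta = {("GeneA", "tx1"): 30, ("GeneA", "tx2"): 70}

usage_alpha = transcript_usage(alpha)
usage_beta = transcript_usage(beta)

# Change in usage proportion from the alpha group to the beta group.
usage_shift = {key: usage_beta[key] - usage_alpha[key] for key in usage_alpha}
```

A gene can show a large usage shift like this even when its total expression is identical in both groups, which is why transcript usage is analyzed separately from gene-level expression.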

Please contact kristen.wells-wrasman@cuanschutz.edu with any questions.

What This Repository Does

This repository provides a complete pipeline for analyzing Nanopore long-read single-cell RNA-seq data. Specifically, it:

  • Processes Nanopore single-cell long-read sequencing data.
  • Performs quality control, alignment, and cell barcode quantification.
  • Conducts dimensionality reduction and read length analysis.
  • Analyzes differential transcript usage (DTU) to assess isoform expression.
  • Processes sequencing data from 10x Genomics 3′ and 5′ single-cell platforms to analyze read start site bias.
  • Processes sequencing data from bulk RNA-seq experiments for insulin depletion analysis.
  • Generates all plots featured in the manuscript.
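As a rough illustration of the read length analysis step listed above, the sketch below summarizes a set of read lengths by count, median, and N50. The choice of N50 is an assumption (it is a common long-read summary statistic); the repository's own scripts may report different summaries. The function names and the example lengths are hypothetical.

```python
from statistics import median


def read_length_n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def summarize_read_lengths(lengths):
    """Collect basic read length statistics for one library."""
    return {
        "n_reads": len(lengths),
        "median": median(lengths),
        "n50": read_length_n50(lengths),
    }


# Hypothetical read lengths in bases, not taken from the study.
lengths = [500, 800, 1200, 1500, 3000]
summary = summarize_read_lengths(lengths)
```

Computing the same summary per library makes it straightforward to compare conditions such as 3′ versus 5′ capture.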

Repository Structure

single_cell_workflow/

A three-stage pipeline for processing single-cell long-read data:

  1. 01_wf_single_cell/
    Contains configuration files and scripts for running EPI2ME's wf-single-cell pipeline using Nextflow. This workflow includes steps for quality control, alignment, cell quantification, and UMAP plotting.

  2. 02_postprocessing/
    Contains a Snakemake pipeline for post-processing the single-cell RNA-seq data. It takes the output of the wf-single-cell run from Step 1 and generates barcodes and read length analysis data.

  3. 03_plotting/
    Contains scripts used to generate all figures in the manuscript, including UMAPs, gene and transcript identification plots, read length distributions, isoform usage plots, and insulin-depletion plots for single-cell long-read libraries.

insulin_depletion/

This directory contains a Snakemake workflow and R scripts for the bulk RNA-seq analysis of the insulin depletion experiment.

ngs_plots/

Scripts for generating plots that compare read start site distributions between 3′ and 5′ single-cell RNA-seq datasets using short-read next-generation sequencing (NGS) data, aimed at investigating differences in internal priming.
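One way to compare read start site distributions is to express each start as a relative position along its transcript and bin the results. The sketch below shows that idea only; it is not the repository's plotting code, and the function name, input layout, and example values are assumptions made for illustration.

```python
def start_site_histogram(starts, n_bins=10):
    """Bin read start sites by relative position along their transcript.

    starts is an iterable of (start_position, transcript_length) pairs,
    with start_position 0-based and measured from the 5' end.
    Returns the fraction of reads in each bin (bin 0 is the 5' end).
    """
    counts = [0] * n_bins
    total = 0
    for start, length in starts:
        if length <= 0 or not 0 <= start < length:
            continue  # skip malformed records
        # Integer arithmetic keeps the index in range 0..n_bins-1.
        counts[start * n_bins // length] += 1
        total += 1
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


# Hypothetical (start, transcript length) pairs, not taken from the study.
example_starts = [(0, 1000), (50, 1000), (950, 1000), (500, 2000)]
fractions = start_site_histogram(example_starts, n_bins=4)
```

Plotting these fractions for a 3′ library next to a 5′ library shows where along the transcript each chemistry tends to start, and an excess of starts in the middle bins would be consistent with internal priming.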

Outputs

single_cell_workflow/

  1. 01_wf_single_cell/
    Output: A count matrix that can be loaded into Seurat.

  2. 02_postprocessing/
    Output: Seurat objects, doublet identification plots, PCA plots, UMAPs, a clustering tree, and histograms and percent tagged plots based on read length data.

  3. 03_plotting/
    Output: The majority of figures in the manuscript, including UMAPs, read length distributions, isoform usage plots, and insulin-depletion plots for single-cell long-read libraries.

insulin_depletion/

Output: Volcano plot displaying differentially expressed genes between insulin-depleted and non-depleted pancreatic islet samples from bulk RNA-seq data.
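For readers unfamiliar with volcano plots, the sketch below shows how each gene is typically placed on one: the x-axis is the log2 fold change, the y-axis is the negative log10 of the adjusted p-value, and genes are labeled by whether they pass both cutoffs. The cutoff values, the function name, and the example inputs are assumptions for illustration; the thresholds used in the manuscript may differ.

```python
import math


def volcano_point(log2_fc, padj, fc_cutoff=1.0, padj_cutoff=0.05):
    """Return (x, y, label) for one gene on a volcano plot.

    fc_cutoff and padj_cutoff are illustrative defaults, not the study's values.
    """
    # Clamp to a tiny positive floor so padj == 0 does not break log10.
    y = -math.log10(max(padj, 1e-300))
    if padj < padj_cutoff and log2_fc >= fc_cutoff:
        label = "up"
    elif padj < padj_cutoff and log2_fc <= -fc_cutoff:
        label = "down"
    else:
        label = "not significant"
    return log2_fc, y, label


# Hypothetical results for three genes, not taken from the study.
points = [
    volcano_point(2.0, 0.001),
    volcano_point(-3.0, 0.01),
    volcano_point(0.2, 0.5),
]
```

In a depleted versus non-depleted comparison, a gene targeted by the depletion would be expected to fall in the "down" group if depleted samples are the numerator of the fold change.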

ngs_plots/

Output: NGS plot showing read start site bias between 3′ and 5′ libraries.

How to Run

Each subdirectory includes its own README with setup instructions, environment configuration, and execution steps.

Example Outputs

  • Differential Transcript Usage (DTU) between alpha and beta cells

    [Figure: DTU_analysis]

  • Read length distributions between published datasets

    [Figure: read_length_distribution]

Data Availability

Sequencing data available at: [link]

Prerequisites

Dockerfiles for building Singularity images are located within each subdirectory.
