Quickstart

Overview

The nimble workflow consists of three main steps:

1. Generate a reference library from .fasta or .csv reference sequence data.
2. Align reads to the reference library using .fastq.gz or .bam input files.
3. Generate a summarized output suitable for downstream analysis, such as integration with Seurat.

This guide walks through a basic example using default settings. The commands assume that you installed nimble with Python and pip. If you are using Docker, replace "python -m nimble" at the beginning of each command with "docker run ghcr.io/bimberlab/nimble:latest".
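The Docker substitution described above is mechanical, so it can be scripted. The following is a minimal sketch, not part of nimble: the helper name to_docker is hypothetical, and it only rewrites the command prefix exactly as the text describes.

```python
import shlex

PIP_PREFIX = ["python", "-m", "nimble"]
DOCKER_PREFIX = ["docker", "run", "ghcr.io/bimberlab/nimble:latest"]


def to_docker(command: str) -> str:
    """Swap a leading 'python -m nimble' for the Docker invocation."""
    args = shlex.split(command)
    if args[:len(PIP_PREFIX)] != PIP_PREFIX:
        raise ValueError(f"not a 'python -m nimble' command: {command!r}")
    # Keep the subcommand and its flags unchanged; only the prefix differs.
    return shlex.join(DOCKER_PREFIX + args[len(PIP_PREFIX):])


print(to_docker(
    "python -m nimble generate --file reference.fasta --output_path library.json"
))
```

Note that a container only sees host files that are made available to it (for example through a bind mount), which this sketch does not attempt to configure.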

Usage

1. Generate a Reference Library

Before aligning reads, you must first generate a nimble library. This library serves as the reference against which sequencing reads will be aligned.

To create a library, use the following command:

python -m nimble generate --file reference.fasta --output_path library.json

--file: Input reference sequences in FASTA format.
--output_path: Output file storing the reference library in nimble's format.

Alternatively, if you have a CSV file with "name" and "sequence" columns, use:

python -m nimble generate --file reference.csv --output_path library.json

Or, if your CSV file lacks the "sequence" column but contains metadata whose rows correspond one-to-one, in order, with the records in the FASTA file, you can provide both files:

python -m nimble generate --file reference.fasta --opt-file reference.csv --output_path library.json
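If your sequences start out in FASTA form and you want the CSV-only layout with "name" and "sequence" columns, a table like that could be prepared as sketched below. This is an illustration only: the function names are hypothetical, and taking the first whitespace-delimited token of each header as the name is an assumption, not something nimble requires.

```python
import csv
import io


def fasta_to_rows(fasta_text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the record name is the first token of the header.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            # Sequence lines of one record are concatenated.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def rows_to_csv(rows):
    """Render rows as CSV text with 'name' and 'sequence' columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "sequence"])
    writer.writerows(rows)
    return buffer.getvalue()


example = ">geneA example\nACGT\nTTGA\n>geneB\nGGCC\n"
print(rows_to_csv(fasta_to_rows(example)), end="")
```

Writing the returned text to reference.csv would give a file in the shape the CSV-only generate command expects.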

The output is a .json file containing your sequence data, any additional metadata provided via the optional .csv file, and a set of configuration parameters that control how nimble performs alignment. The default parameters produce a fairly lenient alignment. For more information about the configuration options, see the library file format documentation.
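Because the library is plain JSON, it can be inspected with standard tooling before alignment. The sketch below makes no assumption about nimble's schema (see the library file format documentation for that); it only reports the type of each top-level entry. The function name describe_top_level is hypothetical.

```python
import json


def describe_top_level(path):
    """Summarize the top-level JSON value as {key or index: type name}."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    if isinstance(data, list):
        return {index: type(value).__name__ for index, value in enumerate(data)}
    # A bare scalar at the top level has no keys or indices.
    return {"value": type(data).__name__}
```

Calling describe_top_level("library.json") after the generate step would show how the file is organized without printing the full sequence data.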

2. Align Reads to the Reference Library

Once you have a nimble library, you can align your sequencing data to it. For single-end data:

python -m nimble align -r library.json -i input.fastq.gz -o results.tsv

For paired-end data:

python -m nimble align -r library.json -i input_R1.fastq.gz input_R2.fastq.gz -o results.tsv

For 10x BAM files (either R1-only or R1 and R2):

python -m nimble align -r library.json -i input.bam -o results.tsv

-r: The reference library file generated in the previous step.
-i: The input sequencing reads (can be single or paired FASTQ, or a BAM file).
-o: The output alignment results.
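When aligning many samples, the single-end and paired-end forms above can be generated per sample. The following sketch assumes mates are distinguished by an "_R1" / "_R2" tag in the file name, as in the paired-end example. The helper names and the per-sample output naming are hypothetical, and the commands are only printed, not executed.

```python
import re
from collections import defaultdict

# Matches _R1 or _R2 when followed by "." or "_" (e.g. input_R1.fastq.gz).
READ_TAG = re.compile(r"_R([12])(?=[._])")


def group_fastqs(filenames):
    """Group FASTQ names by sample, ordering mates as R1 then R2."""
    samples = defaultdict(dict)
    for name in sorted(filenames):
        match = READ_TAG.search(name)
        if match:
            samples[name[:match.start()]][match.group(1)] = name
        else:
            # No _R1/_R2 tag: treat the file as single-end.
            samples[name.split(".")[0]]["1"] = name
    return {s: [mates[k] for k in sorted(mates)] for s, mates in samples.items()}


def align_command(library, inputs, output):
    """Build the align command using the -r, -i, and -o flags shown above."""
    return ["python", "-m", "nimble", "align",
            "-r", library, "-i", *inputs, "-o", output]


files = ["s1_R2.fastq.gz", "s1_R1.fastq.gz", "s2.fastq.gz"]
for sample, inputs in group_fastqs(files).items():
    print(" ".join(align_command("library.json", inputs, f"{sample}.tsv")))
```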

Visit the output file format documentation for more information about the files produced by nimble align.

3. Generate a Feature-by-Cell Matrix

Many scRNA-seq packages, such as Seurat, expect counts in a feature-by-cell matrix, which can be generated with the report command:

python -m nimble report -i results.tsv -o summary.tsv

-i: The output from the alignment step.
-o: The processed output, formatted as a feature-by-cell matrix.
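The three steps can also be chained from a script. This is a minimal sketch using only the commands and flags shown in this guide. The function names are hypothetical, and dry_run defaults to True so that the commands are printed rather than executed.

```python
import subprocess

NIMBLE = ["python", "-m", "nimble"]


def pipeline_commands(reference, reads, library="library.json",
                      results="results.tsv", summary="summary.tsv"):
    """Return the generate, align, and report commands in order."""
    return [
        NIMBLE + ["generate", "--file", reference, "--output_path", library],
        NIMBLE + ["align", "-r", library, "-i", *reads, "-o", results],
        NIMBLE + ["report", "-i", results, "-o", summary],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(command, check=True)


run_pipeline(pipeline_commands("reference.fasta", ["input.fastq.gz"]))
```

Passing dry_run=False would execute each step in turn, which requires nimble to be installed and the input files to exist.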

Visit the CLI parameters documentation for more detail about each command, and the library file format documentation for information about how to configure alignment behavior. See the separate R package nimbler for integrating nimble data into Seurat objects.
