Demultiplexing from only read2 barcodes pushes all outputs to untrimmed #849

@elliot-imler

  • Cutadapt v5.1, installed via pip, with Python 3.11.12
    I was trying to do some secondary demultiplexing and came across what I believe is an issue with how paired-end demultiplexing is handled:
    cutadapt \
        -U 28 \
        -G ^file:"$BARCODE_FILE" \
        -o $OUTPUT_DIR/${base_name}-{name}_R1.fastq.gz \
        -p $OUTPUT_DIR/${base_name}-{name}_R2.fastq.gz \
        --untrimmed-output $OUTPUT_DIR/${base_name}_unknown_R1.fastq.gz \
        --untrimmed-paired-output $OUTPUT_DIR/${base_name}_unknown_R2.fastq.gz \
        --cores $cores_per_file \
        "$r1_file" "$r2_file" > "$log_file" 2>&1

According to the log file (not attached because of proprietary sample naming), this finds the barcodes and trims them correctly, but all reads are written to the "unknown" files instead of the individual per-barcode output files.
I thought it was an issue with parsing the {name} placeholder, but simply swapping the read inputs so that the barcode is matched on read 1 gives the expected behavior:

    cutadapt \
        -u 28 \
        -g ^file:"$BARCODE_FILE" \
        -o $OUTPUT_DIR/${base_name}-{name}_R2.fastq.gz \
        -p $OUTPUT_DIR/${base_name}-{name}_R1.fastq.gz \
        --untrimmed-output $OUTPUT_DIR/${base_name}_unknown_R2.fastq.gz \
        --untrimmed-paired-output $OUTPUT_DIR/${base_name}_unknown_R1.fastq.gz \
        --cores $cores_per_file \
        "$r2_file" "$r1_file" > "$log_file" 2>&1

I'm not sure whether this is intended (for example, if paired reads require {name1} and {name2}), but as I understand it, the documentation does not make this clear.
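
Since the log cannot be shared, the symptom (every read landing in the "unknown" files) can also be checked by counting records per output file. This is a minimal sketch, assuming gzip-compressed FASTQ with exactly four lines per record; `count_reads` is a hypothetical helper name, not part of Cutadapt:

```shell
# Hypothetical helper (not a Cutadapt feature): print "<file><TAB><records>"
# for each gzipped FASTQ file given, assuming four lines per record.
count_reads() {
    for f in "$@"; do
        # Decompress to stdout and count lines; a missing file counts as 0.
        n=$(gzip -dc "$f" 2>/dev/null | wc -l)
        printf '%s\t%d\n' "$f" "$((n / 4))"
    done
}

# Example call, using the variables from the commands above:
#   count_reads "$OUTPUT_DIR"/"${base_name}"*_R1.fastq.gz
```

With the first command, only the `_unknown_` files would be expected to show non-zero counts; with the swapped-input command, the per-barcode files should as well.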
