Most files are empty after split #28

@JihedC

Hi! Thank you for developing this tool.
I would like to use the split function to generate one BAM file per single cell.

The structure of my BAM file (obtained from Cell Ranger) is:

samtools view $BAM/possorted_genome_bam.bam | head
A00379:517:HWLKKDSX2:1:1542:2483:6668   16      chr1    3018437 0       150M1S  *       0       0       TCTTTATTCCTTCCTTGACCAAGGTATCATTGAACAGAGTGTTGTTCAGTCTCCACGTAAATGTTGGCTTTCTATTATTTATGTTGTTATTGAAGATCAGCCTTAGTCCATGGTGATCTGATAGGATGCATGGGACAATTTCGAAATTTTC       FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF       NH:i:7  HI:i:1  AS:i:136        nM:i:6  RG:Z:WT1_GEX_PC_mm10_introns:0:1:HWLKKDSX2:1  RE:A:I  xf:i:0  CR:Z:CTAGCCTAGGAATTAC   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:CTAGCCTAGGAATTAC-1 UR:Z:ACCCAACACG UY:Z:FFFFFFFFFF UB:Z:ACCCAACACG

So I don't have a BX tag. Instead, I would like to use the corrected barcode tag "CB".
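To see how many distinct CB values the input actually holds (and how many reads carry no CB tag at all), the SAM text printed by `samtools view` can be tallied directly. This is only a rough sketch for checking the input; the function names `get_tag` and `count_by_tag` are made up for illustration and have nothing to do with how bxtools works internally.

```python
from collections import Counter

def get_tag(sam_line, tag):
    """Return the value of an optional SAM tag (e.g. 'CB'), or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional fields follow as TAG:TYPE:VALUE.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] == tag:
            return parts[2]
    return None

def count_by_tag(lines, tag="CB"):
    """Count alignment records per tag value; records without the tag go under None."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        counts[get_tag(line, tag)] += 1
    return counts
```

In a small script, `count_by_tag(sys.stdin)` could be fed from `samtools view $BAM/possorted_genome_bam.bam`, and the resulting counts compared with the number of files that split produced.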

So I used the command:

bxtools split $BAM/possorted_genome_bam.bam -a test --tag CB > $OUTPUT/count.txt

This is where it didn't work properly: the command generated many BAM files, of which 30 contained reads and more than 7000 were empty.

The files that contain reads show this warning:

samtools view test.GAATAAGTCTGAGGGA-1.bam | head -n 5
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated
A00379:517:HWLKKDSX2:2:1224:19768:20102 1024    chr1    6214342 255     151M    *       0       0       ATTTCGGGGCAGCAGATGAGGGCCCCAGATCTGTGCTGGTGCTCACTCGTCAGCCTCCGGTTCCCCTGTTGGGGCTGCCCCAGGTTTGGCGAGGTCGGTCTGCCGCGGCCAGAAGGTCACGCTCACCTTGGGGCCGTCCAAGGCAAGCACC       FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFF       NH:i:1  HI:i:1  AS:i:149        nM:i:0  RG:Z:WT1_GEX_PC_mm10_introns:0:1:HWLKKDSX2:2  TX:Z:ENSMUST00000159618,+98,151M;ENSMUST00000191825,+801,151M   GX:Z:ENSMUSG00000090031 GN:Z:4732440D04Rik      fx:Z:ENSMUSG00000090031       RE:A:E  xf:i:17 CR:Z:GAATAAGTCTGAGGGA   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:GAATAAGTCTGAGGGA-1 UR:Z:GCTCATCGCT UY:Z:FFFFFFFFFF UB:Z:GCTCATCGCT
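The samtools warning refers to the 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. As a way to triage the output files without opening each one by hand, the sketch below sorts a file into one of three groups: zero bytes, missing the marker, or ending with the marker. The function name `classify_bam` and its labels are my own, and the check says nothing about how many reads a file contains or why the marker is missing.

```python
import os

# Empty BGZF block used as the BAM end-of-file marker (28 bytes, per the SAM/BAM spec).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)

def classify_bam(path):
    """Return 'empty', 'no_eof_marker', or 'ok' based on file size and trailing bytes."""
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    if size < len(BGZF_EOF):
        # Too short to hold the marker at all.
        return "no_eof_marker"
    with open(path, "rb") as handle:
        # Read only the last 28 bytes of the file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        tail = handle.read()
    return "ok" if tail == BGZF_EOF else "no_eof_marker"
```

Running this over every `test.*.bam` file and counting the labels would show whether the roughly 7000 empty files are truly zero-byte files and whether all 30 files with reads lack the marker.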

Do you have any idea what the problem could be?

Thanks in advance for your help!
