Dear all,
I am having an issue with the TADbit tools: all mapped read pairs are flagged as both "too close from RES" and "too short". I have two Hi-C libraries generated with Arima-HiC (150 bp PE reads), which I mapped separately and iteratively, without specifying a restriction enzyme (--renz NONE), using the following commands:
$tadbit map --fastq "$FASTQ.1_R1" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 1 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative
$tadbit map --fastq "$FASTQ.1_R2" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 2 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative
$tadbit map --fastq "$FASTQ.2_R1" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 1 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative
$tadbit map --fastq "$FASTQ.2_R2" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 2 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative
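The four calls above differ only in the library number (1 or 2) and the read number (1 or 2). A compact sketch of the same calls as a loop is given here for reference. It only prints the command lines (dry run) and assumes that `tadbit` is on the PATH; the function name `build_map_commands` and the default value of `FASTQ` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the four `tadbit map` command lines as a loop (nothing is executed).
FASTQ="${FASTQ:-sample}"   # hypothetical default; set to the real FASTQ prefix
INDEX="../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem"

build_map_commands() {
    local lib read end windows=""
    # Rebuild the window list 1:15 1:20 ... 1:75 used in the original commands.
    for end in $(seq 15 5 75); do
        windows="${windows} 1:${end}"
    done
    for lib in 1 2; do
        for read in 1 2; do
            echo "tadbit map --fastq ${FASTQ}.${lib}_R${read} --index ${INDEX}" \
                 "--read ${read} --renz NONE -C 40 --windows${windows}" \
                 "-w Large --iterative"
        done
    done
}

build_map_commands
```

Piping the printed lines to `bash` would run them one after another; that step is left out here on purpose.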
After this, I merged all files with:
$tadbit parse -w Large/ --genome ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.fa
Finally, to filter the reads, I ran the following command, which printed the log below:
$tadbit filter -w Large/ -C 10 --apply 1 2 3 4 6 7 9 10
Getting intersection between read 1 and read 2
Get insert size...
- median insert size = 356.0
- double median absolution of insert size = 87.0
- max insert size (when a gap in continuity of > 10 bp is found in fragment lengths) = 1356
Using the maximum continuous fragment size(1356 bp) to check for pseudo-dangling ends
Using maximum continuous fragment size plus the MAD (1443 bp) to check for random breaks
identify pairs to filter...
Filtered reads (and percentage of total):
Mapped both : 103,322,115 (100.00%)
-----------------------------------------------------
1- self-circle : 5,101,974 ( 4.94%)
2- dangling-end : 27,421,527 ( 26.54%)
3- error : 10,421,255 ( 10.09%)
4- extra dangling-end : 0 ( 0.00%)
5- too close from RES : 103,322,116 (100.00%)
6- too short : 103,322,116 (100.00%)
7- too large : 0 ( 0.00%)
8- over-represented : 74,076,173 ( 71.69%)
9- duplicated : 45,211,936 ( 43.76%)
10- random breaks : 0 ( 0.00%)
saving to file 0 reads without.
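To make the pattern in this summary explicit, here is a small sketch that scans a filter summary in the format shown above and lists the filters that flag essentially every pair. It is not part of TADbit; the function name `flag_saturated_filters` and the 99.9% threshold are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: read a TADbit filter summary on stdin and print the filters
# whose percentage is at least 99.9 (threshold is an arbitrary assumption).
flag_saturated_filters() {
    # Summary lines look like: " 5- too close from RES : 103,322,116 (100.00%)"
    # Splitting on ':', '(' and '%' puts the name in $1 and the percentage in $3.
    awk -F'[:(%]' '/^ *[0-9]+- / {
        pct = $3 + 0
        if (pct >= 99.9) {
            name = $1
            gsub(/^ +| +$/, "", name)   # trim surrounding spaces
            print name
        }
    }'
}

# Example with three lines copied from the log above.
flag_saturated_filters <<'EOF'
 1- self-circle : 5,101,974 ( 4.94%)
 5- too close from RES : 103,322,116 (100.00%)
 6- too short : 103,322,116 (100.00%)
EOF
# prints:
# 5- too close from RES
# 6- too short
```

Any filter listed by this check that also appears in `--apply` will remove all pairs, which matches the "0 reads" saved above.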
As the TADbit log shows, filters 5 ("too close from RES") and 6 ("too short") each flag 100% of the read pairs, and since filter 6 is among those I applied, no reads are saved. Is there something wrong with my commands, or am I missing something?
Thanks in advance for your support!
All the best,
Jacopo