Minor bias against low coverage cells

ASHLEYS QC tends to give scores <0.3 to libraries with <200k aligned reads, even though my manual QC indicates they contain good-quality Strand-seq data. To be clear, this is a relatively minor issue and I will happily continue to use ASHLEYS QC as is, but I will probably do manual QC for low-coverage libraries from now on. Overall, thanks for making such a great QC tool!

To confirm that this issue wasn't a quirk of the particular library pool, I also looked at the 28 libraries with between 5k and 200k aligned reads from a completely separate library prep experiment. Half of them (14) scored <0.5 with ASHLEYS QC but looked perfectly fine to me.  

As an aside, it could also be argued that we don't want libraries in the 100-200k range for aligned reads anyway (most of the errors below are for libraries in that range). I personally think they are valuable, because they still show SCEs and contribute reads towards inversion calls and phasing.

The command:
```
ashleys.py -j $1 features -f ./output/bam -w 5000000 2000000 1000000 800000 600000 400000 200000 -o ./output/bam/features.tsv
ashleys.py predict -p ./output/bam/features.tsv -o ./output/bam/quality.txt -m scripts/tools/svc_default.pkl
```

Manual vs automated QC for 79 libraries with 75 bp reads. I did the manual QC blind to the ASHLEYS QC scores:
![automated_vs_manual_qc](https://github.com/friendsofstrandseq/ashleys-qc/assets/50639126/c3c25c25-e1d0-4647-87cf-2a258ca82458)


Some examples of disagreements:
| ASHLEYS QC score  | Manual QC | Aligned reads |
|-------------------|-----------|---------------|
| 27.72%            | good      | 140946        |
| 2.53%             | good      | 123526        |
| 2.63%             | good      | 136970        |
| 2.07%             | good      | 153016        |
| 9.91%             | good      | 127418        |
| 2.37%             | good      | 124374        |
| 14.97%            | good      | 140564        |
| 4.82%             | good      | 121376        |
| 11.46%            | good      | 427612        |
| 20.49%            | good      | 153288        |

BreakpointR plots for two example libraries that I thought were good but ASHLEYS QC did not:
![lib_exs](https://github.com/friendsofstrandseq/ashleys-qc/assets/50639126/8f81a11a-1819-4d07-9d42-4ea8b3ee13db)
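
In case it is useful, here is a rough sketch of the triage I have in mind: send low-scoring, low-coverage libraries to manual QC instead of discarding them. The rows are copied from the table of disagreements above. The function name `needs_manual_review` and both cutoffs are my own illustrative choices (the <0.5 score and <200k read figures mentioned in this issue), not ASHLEYS QC settings.

```python
# Rows copied from the disagreement table: (ASHLEYS QC score in %, aligned reads).
# All ten libraries were rated "good" in manual QC.
disagreements = [
    (27.72, 140946),
    (2.53, 123526),
    (2.63, 136970),
    (2.07, 153016),
    (9.91, 127418),
    (2.37, 124374),
    (14.97, 140564),
    (4.82, 121376),
    (11.46, 427612),
    (20.49, 153288),
]

# Hypothetical cutoffs for illustration only; they are not ASHLEYS QC options.
READ_CUTOFF = 200_000  # "low coverage" as used in this issue
SCORE_CUTOFF = 50.0    # score in %, i.e. 0.5 on the 0-1 scale


def needs_manual_review(score_pct, aligned_reads,
                        score_cutoff=SCORE_CUTOFF, read_cutoff=READ_CUTOFF):
    """Return True for a library that scores low AND has low coverage."""
    return score_pct < score_cutoff and aligned_reads < read_cutoff


# Keep only the rows that the rule would send to manual QC.
flagged = [row for row in disagreements if needs_manual_review(*row)]
print(f"{len(flagged)} of {len(disagreements)} libraries flagged for manual QC")
```

With these cutoffs, every listed library except the one with 427612 aligned reads would be sent to manual QC rather than rejected outright.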
