This write-up is a bit retroactive at this point.
It was just a small side project, but it seems to have turned out quite nicely.
Essentially, we need an easy way to visualize an entire MiSeq run.
That is, we want to easily point out samples that have issues such as:
- Too many reads assigned
- Too few reads assigned
- Low quality
- Forward/Reverse read counts that don't match well
- FastQC run on all samples for other info
So my initial idea was to grab the following data for each sample:
- Samplename
- Total Reads (F+R)
- F Reads
- Avg F Qual
- Avg F Length
- F Bases
- R Reads
- Avg R Qual
- Avg R Length
- R Bases
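The per-sample numbers listed above could be gathered with something like the following. This is only a rough sketch, not the actual script: it assumes plain 4-line FASTQ records, Phred+33 quality encoding, and that "Avg Qual" means the mean quality over all bases. The names `read_stats` and `sample_stats` are made up for illustration.

```python
import io

PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def read_stats(handle):
    """Summarize one FASTQ stream (assumes exactly 4 lines per record)."""
    reads = bases = qual_sum = 0
    for i, line in enumerate(handle):
        line = line.rstrip("\r\n")
        if i % 4 == 1:      # sequence line
            reads += 1
            bases += len(line)
        elif i % 4 == 3:    # quality line
            qual_sum += sum(ord(c) - PHRED_OFFSET for c in line)
    return {
        "reads": reads,
        "bases": bases,
        "avg_length": bases / reads if reads else 0.0,
        # assumption: average quality is taken over every base in the file
        "avg_qual": qual_sum / bases if bases else 0.0,
    }


def sample_stats(name, fwd_handle, rev_handle):
    """Build one row of the stats matrix from a forward/reverse pair."""
    f = read_stats(fwd_handle)
    r = read_stats(rev_handle)
    return {
        "Samplename": name,
        "Total Reads": f["reads"] + r["reads"],
        "F Reads": f["reads"],
        "Avg F Qual": f["avg_qual"],
        "Avg F Length": f["avg_length"],
        "F Bases": f["bases"],
        "R Reads": r["reads"],
        "Avg R Qual": r["avg_qual"],
        "Avg R Length": r["avg_length"],
        "R Bases": r["bases"],
    }


# Tiny in-memory demo with two made-up reads
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nAC\n+\n!!\n")
print(read_stats(demo))
```

For real gzipped files, the handles could come from `gzip.open(path, "rt")` in the standard library.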
Once those stats were generated, I looked at them in Excel and noticed it would be really nice to color the cells in the matrix that fell more than one standard deviation from their column mean.
So I colored them based on 6 criteria:
- +1, +2, +3 and -1, -2, -3 standard deviations from the mean in each column
- Each standard deviation step gets a slightly bolder color (green for above, red for below)
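The six criteria can be sketched as a "band" from -3 to +3 per cell. This is my guess at a reasonable implementation, with a few assumptions: a cell gets band k when it is at least k standard deviations from the column mean (capped at 3), cells within one standard deviation get band 0 and stay uncolored, and the sample standard deviation is used. The function names are made up for illustration.

```python
import statistics


def stdev_band(value, mean, stdev):
    """Return -3..3: how many whole standard deviations value is from mean."""
    if stdev == 0:
        return 0  # constant column: nothing stands out
    z = (value - mean) / stdev
    band = min(3, int(abs(z)))  # truncate, then cap at 3
    return band if z > 0 else -band


def column_bands(values):
    """Band every value in one column of the stats matrix."""
    mean = statistics.mean(values)
    # assumption: sample standard deviation (n - 1 denominator)
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    return [stdev_band(v, mean, stdev) for v in values]


# Made-up read counts for five samples; the last one is an outlier
print(column_bands([10, 10, 10, 10, 100]))
```

Positive bands would map to the green gradient and negative bands to the red one.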
The end result will be:
- A single CSV file with the base stats listed above
- A single HTML file containing the colored matrix, as in the prototype Excel file
- The HTML file will also contain links to the FastQC reports for the R1 and R2 reads
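Producing the two output files could look roughly like this. Again, this is a sketch under assumptions rather than the real code: the hex colors are placeholders, `bands` is assumed to hold a value from -3 to 3 per cell (0 meaning uncolored), and `links` is a hypothetical mapping from sample name to the R1 and R2 FastQC report paths.

```python
import csv
import html
import io

# Placeholder palette: index 1..3 gets progressively bolder
GREEN = ["", "#d9f2d9", "#8cd98c", "#40bf40"]
RED = ["", "#f9d9d9", "#ec8c8c", "#df4040"]


def band_color(band):
    """Green shades for positive bands, red for negative, none for 0."""
    if band > 0:
        return GREEN[band]
    if band < 0:
        return RED[-band]
    return ""


def write_csv(rows, columns, handle):
    """Write the plain stats matrix as CSV."""
    writer = csv.DictWriter(handle, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


def render_html(rows, columns, bands, links):
    """bands[i] maps column -> band for rows[i]; links maps sample -> (r1, r2)."""
    header = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    out = ["<table>", f"<tr>{header}<th>FastQC</th></tr>"]
    for row, row_bands in zip(rows, bands):
        cells = []
        for col in columns:
            color = band_color(row_bands.get(col, 0))
            style = f' style="background-color:{color}"' if color else ""
            cells.append(f"<td{style}>{html.escape(str(row[col]))}</td>")
        r1, r2 = links[row["Samplename"]]
        cells.append(
            f'<td><a href="{html.escape(r1)}">R1</a> '
            f'<a href="{html.escape(r2)}">R2</a></td>'
        )
        out.append("<tr>" + "".join(cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


# Made-up single-sample example
rows = [{"Samplename": "s1", "F Reads": 100}]
columns = ["Samplename", "F Reads"]
buf = io.StringIO()
write_csv(rows, columns, buf)
page = render_html(rows, columns, [{"F Reads": 3}],
                   {"s1": ("s1_R1_fastqc.html", "s1_R2_fastqc.html")})
print(page)
```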