QC pipeline #19

We need a pipeline for the preprocessing steps of assessing data quality and cleaning the data before running the predictor. Currently no such mechanism is in place. The pipeline would:
* Identify structure in the missingness of the data.
* Identify and flag outlier samples.
* Run unsupervised analyses on the samples, e.g. PCA and hierarchical clustering.
* For continuous-valued data, compare several similarity metrics to find the one that best separates the classes, e.g. RNAcorr.R, written by SP for PanCancer.
* Run hierarchical clustering and PCA on the classes, following the same idea.
* Run univariate tests to prune the matrix of variables that goes into netDx.
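
To make three of the steps above concrete (missingness, outlier flagging, univariate pruning), here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the function names are hypothetical, missing values are represented as `None`, outliers are flagged with a MAD-based modified z-score, and variables are pruned with a Welch t statistic between two classes. The issue does not prescribe any of these choices, and a real implementation would more likely live in R alongside netDx.

```python
import math
import statistics


def missing_fraction(rows):
    """Return the fraction of missing (None) values per sample and per variable."""
    n_rows, n_cols = len(rows), len(rows[0])
    per_sample = [sum(v is None for v in row) / n_cols for row in rows]
    per_variable = [sum(row[j] is None for row in rows) / n_rows
                    for j in range(n_cols)]
    return per_sample, per_variable


def flag_outlier_samples(rows, threshold=3.5):
    """Flag samples whose mean is far from the cohort median.

    Uses the modified z-score 0.6745 * |x - median| / MAD. Assumes every
    sample has at least one observed value. The threshold is illustrative.
    """
    means = [statistics.fmean(v for v in row if v is not None) for row in rows]
    med = statistics.median(means)
    mad = statistics.median(abs(m - med) for m in means)
    if mad == 0:
        return [False] * len(rows)
    return [0.6745 * abs(m - med) / mad > threshold for m in means]


def welch_t(a, b):
    """Welch t statistic for two groups with at least two values each."""
    diff = statistics.fmean(a) - statistics.fmean(b)
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def prune_variables(rows, labels, min_abs_t=2.0):
    """Return indices of variables whose |Welch t| between two classes is large."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    keep = []
    for j in range(len(rows[0])):
        groups = {c: [] for c in classes}
        for row, label in zip(rows, labels):
            if row[j] is not None:
                groups[label].append(row[j])
        a, b = groups[classes[0]], groups[classes[1]]
        if len(a) >= 2 and len(b) >= 2 and abs(welch_t(a, b)) >= min_abs_t:
            keep.append(j)
    return keep


# Toy data: 6 samples x 3 variables, two classes, two missing values.
rows = [
    [1.0, 5.0, None],
    [1.2, 5.1, 0.3],
    [0.9, 4.9, 0.2],
    [3.0, 5.0, None],
    [3.1, 5.2, 0.1],
    [2.9, 4.8, 0.4],
]
labels = ["A", "A", "A", "B", "B", "B"]

per_sample, per_variable = missing_fraction(rows)
outliers = flag_outlier_samples(rows)
kept = prune_variables(rows, labels)
```

In the toy data only the first variable differs between classes, so it is the only one that survives pruning. In practice the cutoff would be chosen with multiple-testing correction in mind, and the pruning would need to be done inside any cross-validation loop to avoid leaking label information into the predictor.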
