The project ID used was TCGA-OV, which specifically studies Ovarian Serous Cystadenocarcinoma.
Project Navigation
csv
data.csv | raw compilation of every patient and every data field
SurvivalNormal.csv | normalized ovarian cancer patients with just gene fields and survival outcome
SurvivalNormalBoth.csv | normalized ovarian and breast cancer patients with gene fields, age, cancer information, and prognosis.
ProgressionNormal.csv | normalized ovarian cancer patients with gene fields and progression outcome
RecurrenceNormal.csv | normalized ovarian cancer patients with gene fields and recurrence outcome
R_Analysis
analysis.Rmd | R markdown script that analyzes the dataset, creating different types of models
gen_csv.Rmd | R markdown script that generates SurvivalNormalBoth.csv from data.csv
control_analysis.Rmd | R markdown script that conducts the same analysis as analysis.Rmd, but substitutes randomly generated data for the actual data to serve as a control
control_trials.py | Python script that runs the R markdown script multiple times and averages the results. Usage: python control_trials.py 100, where the argument is the number of trials to run
Hiearchical_Clustering
hierarchical_clustering.py | python script that generates the figures for hierarchical clustering
NeuralNetwork
choosing_genes.py | the python script for selecting the optimal features
graphing_feature_selection.py | python script that graphs the results of choosing_genes.py
graphing_neural_performance.py | python script that graphs the performance of the neural networks
neural_model.py | python script that includes functions for building a model on the dataset. Also includes a general evaluation of a model
SurvivalExperiment.txt | the results of choosing_genes.py optimizing survival predictions
ProgressionExperiment.txt | the results of choosing_genes.py optimizing progression predictions
RecurrenceExperiment.txt | the results of choosing_genes.py optimizing recurrence predictions
Learning_Curve
graphing_learning_curve.py | graphs the learning curves from output files (hard-coded names)
learning_curve.py | generates a learning curve datafile on neural network
learning_curve.Rmd | generates a learning curve datafile on logistic regression model
README.md | a project description
LICENSE | the license for this project
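The averaging step that control_trials.py is described as performing could look roughly like the sketch below. This is not the repository's script: `average_trials`, `fake_trial`, `main`, and the `accuracy` metric name are hypothetical, and the stand-in trial returns random numbers instead of rendering control_analysis.Rmd.

```python
import random
import statistics
from typing import Callable, Dict, List


def average_trials(run_trial: Callable[[], Dict[str, float]], n_trials: int) -> Dict[str, float]:
    """Run `run_trial` n_trials times and return the mean of each metric."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    results = [run_trial() for _ in range(n_trials)]
    # Average every metric reported by the first trial across all trials.
    return {key: statistics.mean(r[key] for r in results) for key in results[0]}


def fake_trial() -> Dict[str, float]:
    # Hypothetical stand-in for one control run; a real trial would execute
    # control_analysis.Rmd and parse whatever metrics it reports.
    return {"accuracy": random.uniform(0.4, 0.6)}


def main(argv: List[str]) -> Dict[str, float]:
    # Mirrors the documented usage: the single argument is the trial count.
    n_trials = int(argv[0])
    return average_trials(fake_trial, n_trials)
```

Calling `main(["100"])` corresponds to the documented `python control_trials.py 100` invocation, assuming the script reads the trial count from its first argument.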
Dataset Fields
General Info
Patient ID, Cancer Type, Survival Status, Survival in Days (7300 represents continued survival), Progression Status, Recurrence Status, Age, Race
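Because Survival in Days uses 7300 as a marker for continued survival, code that reads this field may want to separate the marker from real durations. A minimal sketch, assuming the value arrives as a number or numeric string (the function name and return shape are hypothetical, not taken from the project's scripts):

```python
from typing import Optional, Tuple

CONTINUED_SURVIVAL_DAYS = 7300  # sentinel value described in the field list


def split_survival(days) -> Tuple[Optional[int], bool]:
    """Return (observed_days, still_alive) for one Survival in Days value.

    observed_days is None when the value is the continued-survival sentinel.
    """
    value = int(float(days))  # tolerate "412", "412.0", or 412
    if value == CONTINUED_SURVIVAL_DAYS:
        return None, True
    return value, False
```

A recorded survival time of exactly 7300 days would be indistinguishable from the sentinel under this scheme, so cross-checking against the Survival Status field is advisable.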