Lipid classification tool
#Usage#
##Installation
bundle install should do most of the work for you, however, it currently references unreleased code in Rubabel. I use that as a local git repository. Due to the large file size of the DB, it will take some additional preparatory work to prepare this. I'll have instructions posted on Rubabel after I figure that step out. I expect to make that file available via Dropbox for local caching.
from your root directory run ./bin/write_arffs to see the options
Typical usage might include these options:
-m Nwhere N is the number of threads-d FOLDERwhere folder is the directory (it will make it for you) where you want to generate the files-rif you want to clean out that folder, ie you are iterating in place.-f input.ymlwhere you provide the input file in YAML format containing the classifications
So, if I have all_lmids.yml' in the root, and have 6 cores, I might run: ./bin/write_arffs -f all_lmids.yml -d all -m 6`
Or, I might want to rerun that analysis on one core
./bin/write_arffs -f all_lmids.yml -d all2
Corrected LMIDS won't change the LMID in the output, but does change the classification used by WEKA. These are loaded from the hash contained in corrections.yml.
Again, from the root directory ./bin/classify_lipids will show you the options.
Typical usage examples:
-d FOLDERwhich is the FOLDER or directory wherewrite_arffsplaced its output files-l list.txtwhere list.txt is a file containing one LMID per line, for batch analysis of LMIDS in the classifier--lmids LMFA01010001,LMST02040023,LMSL05010014where you can provide a list of LMIDS for classification-twill add timing outputs to the analysis.--run_wekais required to generate a new WEKA analysis. Otherwise, the analysis will just pull existing classifcation information from the directory. You must do this the first time you runclassify_lipidson a new directory.
So, to analyze a list of lmids in lmids.txt, working off the generated analyses (all, all2) from the previous section run:
./bin/classify_lipids -d all -l lmids.txt -t --run_weka
Repeat that for the other analysis with:
./bin/classify_lipids -d all2 -l lmids.txt -t --run_weka
Now, if you want to run a new list of LMIDS instead of the other one, you can skip the run_weka option:
./bin/classify_lipids -d all -l lmids.txt -t