FeS_RedPred/help.txt at main · CompBtBs/FeS_RedPred · GitHub

This repository contains:
     -	2 input files:
	      -	Database Redox Pot Fe2S2 proteins.xlsx
	      -	tableAmm.txt, a utility file with the amino acid parametrization and the list of other cofactors to be
		counted by features_calculator.py
     -	2 utility modules:
	      -	utils.py, with functions used by both features_calculator.py and em_predict2.py
	      -	ML_models.py, a dictionary of models with the hyperparameter grids to be tuned by GridSearch optimization
     - 	pdb-files folder, which contains the PDB files of all proteins in the dataset, including in silico generated mutants
     - 	Folder A contains the scripts for training separate models for each specific combination of radius values r1 and r2:
	      -	features_calculator.py, the script used to compute the molecular descriptor values. These descriptors
		are saved in a dataset_features_r1_r2.xlsx file, which serves as input for model training.
	      -	em_predict.py, the main script used to launch model training and to test model performance
     -	Folder B includes the code for training a single model that simultaneously considers all features calculated
	for every r1 and r2:
	      -	features_calculator.py
	      -	total.py, which merges all features into a single file, total.xlsx, avoiding repetitions
	      -	em_predict.py
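
The merge step in folder B can be pictured with the sketch below. It is not the code of total.py: it assumes pandas, a hypothetical identifier column named "pdb_id", and hypothetical feature names, and it treats a column that already appears in an earlier table as a repetition to be skipped.

```python
import pandas as pd

def merge_feature_tables(tables, key="pdb_id"):
    """Join per-(r1, r2) feature tables column-wise, keeping each column name once.

    `key` is a hypothetical identifier column shared by all tables.
    """
    merged = tables[0].set_index(key)
    for df in tables[1:]:
        df = df.set_index(key)
        # Skip columns already present, so shared descriptors are not repeated.
        new_cols = [c for c in df.columns if c not in merged.columns]
        merged = merged.join(df[new_cols], how="outer")
    return merged.reset_index()

# Toy tables standing in for two dataset_features_r1_r2.xlsx files (made-up values).
table_a = pd.DataFrame({"pdb_id": ["p1", "p2"], "charge": [1, -1], "n_polar_r5": [3, 4]})
table_b = pd.DataFrame({"pdb_id": ["p1", "p2"], "charge": [1, -1], "n_polar_r7": [6, 8]})

total = merge_feature_tables([table_a, table_b])
print(list(total.columns))
```

With real files, the tables would be loaded with pandas.read_excel and the result written with DataFrame.to_excel.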

The remaining models were constructed using the scripts in folder A, after editing the features_calculator.py
output files to remove the selected features.
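
Removing features from a feature table before training could look like the sketch below. It assumes pandas and uses made-up column names. The helper drop_features is hypothetical and not part of the repository.

```python
import pandas as pd

def drop_features(features, to_remove):
    """Return a copy of the feature table without the listed columns."""
    missing = [c for c in to_remove if c not in features.columns]
    if missing:
        # Fail loudly instead of silently training on an unchanged table.
        raise KeyError(f"columns not found: {missing}")
    return features.drop(columns=to_remove)

# Toy table with made-up descriptor names and values.
features = pd.DataFrame({"pdb_id": ["p1", "p2"], "charge": [1, -1], "n_polar": [3, 4]})
reduced = drop_features(features, ["n_polar"])
print(list(reduced.columns))
```

The reduced table would then be saved under the file name that the training script expects as input.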

Warning: we ran all code on Linux. To run features_calculator.py on Windows, the PDBParser library must be
modified (l.192: resname = line[17:20].replace(' ',''))
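
The expression in the warning reads the residue name field of a fixed-width PDB record (columns 18-20, i.e. line[17:20] in Python) and strips the padding spaces. The sketch below only illustrates what that expression returns. The two records are toy examples built field by field, and the helper name residue_name is hypothetical.

```python
def residue_name(line):
    """Extract the residue name from a PDB ATOM/HETATM record, without padding."""
    return line[17:20].replace(' ', '')

# Records assembled field by field so that the column positions are explicit:
# record type (6) + serial (5) + blank + atom name (4) + altLoc (1) + resName (3) + blank + chain
atom_record = "ATOM  " + "    1" + " " + " N  " + " " + "CYS" + " " + "A" + "  41"
hetatm_record = "HETATM" + " 1234" + " " + "ZN  " + " " + " ZN" + " " + "A" + " 301"

print(residue_name(atom_record))    # three-letter residue name, unchanged
print(residue_name(hetatm_record))  # two-letter name with its padding space removed
```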