Interaction-Based Inductive Bias in Graph Neural Networks: Enhancing Protein-Ligand Binding Affinity Predictions from 3D Structures
- Implementation of other baselines can be found on GIGN.
- This repository contains the source code for PLA prediction. For structure-based virtual screening (SBVS), please refer to our dedicated repository at EHIGN_SBVS on GitHub.
All data used in this paper are publicly available at the following locations:
The preprocessed data can be downloaded from Graphs.
- dgl==0.9.0
- networkx==2.5
- numpy==1.19.2
- pandas==1.1.5
- pymol==0.1.0
- rdkit==2022.3.5
- scikit_learn==1.1.2
- scipy==1.5.2
- torch==1.10.2
- tqdm==4.63.0
- openbabel==3.3.1 (install with `conda install -c conda-forge openbabel`)
Alternatively, install the environment using the provided YAML file at ./environment.yaml.
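After installing, the pinned versions can be compared against what is actually present in the environment. The snippet below is a minimal sketch and not part of this repository: `check_pins` is a hypothetical helper, and it assumes that each package registers distribution metadata under the name used in the list above. That assumption may not hold for conda-installed packages or for CUDA-specific builds, in which case the package is reported as missing even though it imports fine. It also needs Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata

# Pins copied from the requirements list in this README.
# Assumption: the distribution names match these names exactly.
PINNED = {
    "dgl": "0.9.0",
    "networkx": "2.5",
    "numpy": "1.19.2",
    "pandas": "1.1.5",
    "pymol": "0.1.0",
    "rdkit": "2022.3.5",
    "scikit_learn": "1.1.2",
    "scipy": "1.5.2",
    "torch": "1.10.2",
    "tqdm": "4.63.0",
    "openbabel": "3.3.1",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {name: (expected, found)} for each missing or mismatched package.

    `found` is None when no distribution metadata exists for `name`.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


# Example use:
#   for name, (expected, found) in check_pins(PINNED).items():
#       print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package was found at the expected version.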
- `./data`: Contains information about various datasets. Download and organize the preprocessed datasets as described.
- `./config`: Parameters used in EHIGN.
- `./log`: Logger.
- `./model`: Contains model checkpoints and training records.
- Scripts and Implementations: Various Python files implementing models, preprocessing, training, and testing.
- Download the preprocessed datasets and organize them in the `./data` folder.
- Run `python train.py`.
- Run `python test.py` (modify file paths in the source code if necessary).
- Run a demo using provided examples:
  - `python preprocess_complex.py`
  - `python graph_constructor.py`
  - `python train_example.py`
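The three demo scripts are meant to be run in the order listed, and a later step is only useful if the earlier one succeeded. The sketch below shows one way to chain them from Python. The `run_steps` helper is hypothetical and not shipped with this repository. It assumes the scripts take no command-line arguments and are launched from the repository root.

```python
import subprocess
import sys

# Script names taken from the demo steps in this README.
DEMO_STEPS = [
    "preprocess_complex.py",
    "graph_constructor.py",
    "train_example.py",
]


def run_steps(scripts, runner=subprocess.run):
    """Run each script with the current interpreter, stopping at the first failure.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code, so later steps are skipped.
    """
    for script in scripts:
        runner([sys.executable, script], check=True)


# Example use, from the repository root:
#   run_steps(DEMO_STEPS)
```

Using `sys.executable` keeps every step inside the same environment that launched the driver, which matters when several conda environments are installed.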
- Organize the data like:
  - data
    - external_test
      - pdb_id
        - pdb_id_ligand.mol2
        - pdb_id_protein.pdb
- Execute the following commands:
  - `python preprocess_complex.py`
  - `python graph_constructor.py`
  - `python test.py` (modify file paths in the source code if necessary)
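Before preprocessing an external test set, it can help to confirm that every complex folder follows the layout above. The following is a minimal sketch, not part of this repository: `find_incomplete_complexes` is a hypothetical helper that treats each subfolder name as the `pdb_id` and looks for the two expected files inside it.

```python
from pathlib import Path


def find_incomplete_complexes(root):
    """Return {pdb_id: [missing file names]} for the complex folders under root.

    Each subfolder <pdb_id> is expected to contain <pdb_id>_ligand.mol2 and
    <pdb_id>_protein.pdb, following the layout described in this README.
    """
    missing = {}
    for complex_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        pdb_id = complex_dir.name
        expected = [f"{pdb_id}_ligand.mol2", f"{pdb_id}_protein.pdb"]
        absent = [name for name in expected if not (complex_dir / name).is_file()]
        if absent:
            missing[pdb_id] = absent
    return missing


# Example use:
#   print(find_incomplete_complexes("./data/external_test"))
```

An empty dictionary means every folder contains both files. The check looks only at file names and does not validate the contents of the structure files.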
- Use datasets found in the `./cold_start_data` folder.
- Execute scripts `train_random.py`, `train_scaffold.py`, and `train_sequence.py` if the original training set has been processed.