1. Develop a validation pipeline (including data validation, model validation, fit of input to model, fit of data not used for modeling, and uncertainty of the model) for assessing IHM structures deposited to PDB-IHM
- docs: documentation for all classes and functions (sphinx)
- src: relevant source code for execution
- static: relevant css and js files for static and dynamic HTML reports, along with output files from validation [htmls, images, pdfs, json, results, supplementary]
- templates: all HTML templates for static HTML reports, PDF files and supplementary table
- tests: tests for classes
- example_sas: example script to create validation reports for SAS data
- example_imp_models: example script to create validation reports for IMP models
- example_summary_table: example script to create validation reports for summary table

- WriteReport: class to write dictionary for jinja2, HTML and PDF outputs
- plots: quality-at-a-glance plots from all the different analyses
- get_excluded_volume: class to calculate excluded volume and other relevant statistics
- get_molprobity_information: class to get data from molprobity, used only for atomistic models
- sas_validation: class to perform validation of models built using SAS datasets
- sas_validation_plots: plots based on SAS validation analysis
- cx_validation: class to perform validation of models built using CX-MS datasets
- cx_validation_plots: plots based on CX-MS validation analysis
- em_validation: class to perform validation of models built using EM datasets
Rather than installing and configuring all dependencies (below), you can build a Docker (or podman) container by:
- Downloading the ATSAS CentOS 7 RPM and placing it in the docker subdirectory (this cannot be redistributed by us; you must sign up at their web site for an academic license).
- Building the image with
docker build -t ihm-validation docker
The resulting image has all dependencies in the default PATH and this
repository available in the /IHMValidation directory; no further configuration
should be necessary.
This initial setup is performed once.
Create and activate a Python 3.8 virtual environment, then install the Python dependencies.
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip3 install -r dependencies.txt
Install the following packages based on your OS.
Create a local environment file and add the relevant variables.
$ touch .env
$ nano .env    # or: vi .env
The variables to add to the .env file are listed below (fill in the quotation marks with the paths to the relevant executables).
ATSAS=""
Molprobity_ramalyze=""
Molprobity_molprobity=""
Molprobity_clashscore=""
Molprobity_rotalyze=""
wkhtmltopdf=""
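Before running the pipeline, it can help to confirm that none of the six variables was left empty. The following is a minimal sketch, not part of this repository: the function names `read_env_file` and `unset_keys` are hypothetical, and the file is parsed by hand with the standard library rather than through the decouple library, so it only understands plain KEY="VALUE" lines.

```python
from pathlib import Path

# The six keys listed above; everything else in this sketch is illustrative.
REQUIRED_KEYS = [
    "ATSAS",
    "Molprobity_ramalyze",
    "Molprobity_molprobity",
    "Molprobity_clashscore",
    "Molprobity_rotalyze",
    "wkhtmltopdf",
]


def read_env_file(path):
    """Parse simple KEY="VALUE" lines; blank lines and # comments are skipped."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotation marks.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or left as empty strings."""
    return [key for key in required if not values.get(key)]
```

Calling `unset_keys(read_env_file(".env"))` from the directory that holds the .env file returns an empty list once every variable has a value.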
A few pointers:
- The ATSAS variable should contain the path to the datcmp executable, for example: ATSAS-3.0.3-1/bin/datcmp
- The Molprobity variables should point to the respective executables, for example: build/bin/molprobity.ramalyze
- The wkhtmltopdf variable should point to the wkhtmltopdf binary.
- One common error, depending on your OS and webdriver, comes from bokeh/selenium. It is usually displayed as:
RuntimeError: Neither firefox and geckodriver nor a variant of chromium browser and chromedriver are available on system PATH. You can install the former with 'conda install -c conda-forge firefox geckodriver'.
This error originates from converting HTML files to SVG images. Please install or update your webdriver, either by installing the packages with the suggested conda command or by adding pre-installed binaries to your PATH variable.
Try conda first; add paths to pre-installed binaries (firefox and geckodriver) only if you are unable to use conda. Adding the paths takes two steps:
  - Use the command which firefox or which geckodriver to get the path to the respective binary. If you don't have the binaries installed, install firefox/geckodriver using brew and then locate them. You can also download geckodriver from the mozilla github page.
  - Open the webdriver.py file in bokeh, located at .venv/lib/python3.8/site-packages/bokeh/io/webdriver.py, and edit the create_firefox_webdriver function: set the firefox variable to the firefox binary path (deleting which firefox) and set the geckodriver variable to its pre-installed binary path (deleting which geckodriver).
- Another potential error could arise from having another env variable file or not having the file in the same directory. This error is displayed as:
ValueError: UndefinedValueError('{} not found. Declare it as envvar or define a default value.'.format(option)).
A solution is to use AutoConfig from the decouple library to point to the location of your .env file. See this stackoverflow post for specific details.
- An error could arise from not being able to access the executable, even though the path is found. This error can occur with the ATSAS package and is displayed as:
PermissionError: [Errno 13] Permission denied: '/ATSAS-3.0.3-1'
A solution is to open your .venv/bin/activate file and add the six variables above at the top, one per line, using the format export KEY=VALUE. See this stackoverflow post for specific details.
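Several of the errors above come down to a configured path or a browser binary not being usable. The following is a rough diagnostic sketch using only the standard library; the function names `diagnose_executable` and `find_webdriver_binaries` are hypothetical and not part of this repository. It assumes that a "Permission denied" error can mean the variable points at a directory (such as /ATSAS-3.0.3-1) rather than at the binary inside it, or at a file without execute permission; other causes are possible.

```python
import os
import shutil


def diagnose_executable(path):
    """Classify a configured path as 'ok', 'missing', 'directory' or 'not executable'."""
    if not path or not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        # A directory cannot be executed; the variable should name the binary
        # itself, for example ATSAS-3.0.3-1/bin/datcmp.
        return "directory"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


def find_webdriver_binaries():
    """Look up firefox and geckodriver on PATH, as the which command does."""
    return {name: shutil.which(name) for name in ("firefox", "geckodriver")}
```

A value of None for either key in the dictionary returned by `find_webdriver_binaries()` means that binary is not on PATH, which matches the situation described in the bokeh/selenium RuntimeError.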
After the initial setup, you can start executing the scripts to generate validation reports. Here are the steps:
- Go to the example directory.
- Command to execute: python Execute.py -f PDBDEV_00000009.cif
- If new software was used to build the structure, update reference.csv, located in the templates directory, with the appropriate software name, PubMed link, and citation.
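When several entries need reports, the command above can be repeated from a small driver script. This is only a sketch under stated assumptions: `build_command` and `run_all` are hypothetical helpers, the only Execute.py option used is the `-f` flag shown above, and each file name is passed to `-f` exactly as you would type it on the command line.

```python
import subprocess
import sys


def build_command(cif_file, script="Execute.py"):
    """Build the argument list for one run, using the current Python interpreter."""
    return [sys.executable, script, "-f", str(cif_file)]


def run_all(cif_files, example_dir="."):
    """Run the script once per file from example_dir and collect the exit codes."""
    exit_codes = {}
    for cif_file in cif_files:
        result = subprocess.run(build_command(cif_file), cwd=example_dir)
        exit_codes[str(cif_file)] = result.returncode
    return exit_codes
```

For example, `run_all(["PDBDEV_00000009.cif"], example_dir="example")` would run the single command shown in the steps; a non-zero exit code in the returned dictionary marks an entry worth inspecting by hand.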
The input to the Execute.py script is a PDBDEV file in cif format. The output includes directories and files that are listed below:
Directory ../Validation/PDBDEV_00000009
Directory ../Validation/PDBDEV_00000009/images
Directory ../Validation/PDBDEV_00000009/pdf
Directory ../Validation/PDBDEV_00000009/json
Directory ../Validation/PDBDEV_00000009/supplementary
Directory ../Validation/PDBDEV_00000009/htmls
Directory ../Validation/PDBDEV_00000009/csv
Here's the description of all the directories:
- Validation is the main head directory and is located one level above the example directory [you can see that in the way this repo is structured]
- PDBDEV_00000009 is the entry directory with relevant files
- PDBDEV_00000009/images contains all the images generated for this entry
- PDBDEV_00000009/pdf contains all the pdf files generated for this entry, including the pdf version of the validation report
- PDBDEV_00000009/json contains the validation report in json format as key-value pairs
- PDBDEV_00000009/supplementary contains the summary table in PDF format
- PDBDEV_00000009/htmls contains the corresponding html pages
- PDBDEV_00000009/csv contains detailed molprobity tables for download
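The layout described above can be checked after a run with a few lines of Python. This is an illustrative sketch only: `missing_output_dirs` is a hypothetical helper, and it checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Subdirectories listed in the output description above.
SUBDIRECTORIES = ("images", "pdf", "json", "supplementary", "htmls", "csv")


def missing_output_dirs(validation_root, entry_id):
    """Return the expected output directories that do not exist for an entry."""
    entry_dir = Path(validation_root) / entry_id
    expected = [entry_dir] + [entry_dir / name for name in SUBDIRECTORIES]
    return [str(path) for path in expected if not path.is_dir()]
```

Run from the example directory, `missing_output_dirs("../Validation", "PDBDEV_00000009")` returns an empty list when every directory is present.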
The Validation folder that is generated needs to be transferred to the server.
If individual entries are being evaluated, move/copy the entry directory from the local validation directory into the server's validation directory.
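Copying a single entry can be scripted as follows. This sketch assumes the server's validation directory is reachable as an ordinary filesystem path (for example a mounted share); if the server is only reachable over the network, a transfer tool would be needed instead. `copy_entry` is a hypothetical helper, not part of this repository.

```python
import shutil
from pathlib import Path


def copy_entry(local_validation, server_validation, entry_id):
    """Copy one entry directory into the server's validation directory."""
    source = Path(local_validation) / entry_id
    if not source.is_dir():
        raise FileNotFoundError(f"no entry directory at {source}")
    target = Path(server_validation) / entry_id
    # dirs_exist_ok (Python 3.8+) lets a re-run refresh an existing entry;
    # files already on the target that are absent locally are left in place.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Because existing files on the target are overwritten but never deleted, remove a stale entry directory on the server first if an exact mirror is required.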
Author(s): Sai J. Ganesan