A simple and reliable Python wrapper for calculating Mold² molecular descriptors. This library automates the process of downloading, installing, and running the Mold² executables for seamless integration into your cheminformatics workflows.
- Automated Installation: Automatically downloads and sets up the correct Mold² executables for your operating system (Windows or Linux).
- Simple API: A clean and easy-to-use interface for calculating descriptors from RDKit molecule objects.
- Parallel Processing: Calculate descriptors for large datasets quickly using built-in support for multiprocessing.
- Data Handling: Returns descriptors in a convenient Pandas DataFrame, ready for analysis.
- Descriptor Information: Easily access details and descriptions for all 777 Mold² descriptors.
Mold² is a product designed and produced by the National Center for Toxicological Research (NCTR). FDA and NCTR retain ownership of this product. This Python wrapper is the work of Olivier J. M. Béquignon and is not affiliated with the original authors.
If you use this wrapper in your research, please cite the original Mold² publication in addition to this software package:
-
Original Mold² Paper:
Hong, H., Xie, Q., Ge, W., Qian, F., Fang, H., Shi, L., Su, Z., Perkins, R., & Tong, W. (2008). Mold², Molecular Descriptors from 2D Structures for Chemoinformatics and Toxicoinformatics. Journal of Chemical Information and Modeling, 48(7), 1337–1344. DOI: 10.1021/ci800038f
-
This Wrapper:
Please refer to the
CITATION.cfffile for details on how to cite this library.
- Python 3.9+
- RDKit
You can install the wrapper from PyPI:
pip install mold2-pywrapperHere is a basic example of how to calculate Mold² descriptors for a list of molecules.
from rdkit import Chem
from Mold2_pywrapper import Mold2
# A list of SMILES strings
smiles_list = [
# erlotinib
"n1cnc(c2cc(c(cc12)OCCOC)OCCOC)Nc1cc(ccc1)C#C",
# midecamycin
"CCC(=O)O[C@@H]1CC(=O)O[C@@H](C/C=C/C=C/[C@@H]([C@@H](C[C@@H]([C@@H]([C@H]1OC)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)C)O[C@H]3C[C@@]([C@H]([C@@H](O3)C)OC(=O)CC)(C)O)N(C)C)O)CC=O)C)O)C",
# selenofolate
"C1=CC(=CC=C1C(=O)NC(CCC(=O)OCC[Se]C#N)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N",
# cisplatin
"N.N.Cl[Pt]Cl"
]
# Convert SMILES to RDKit molecule objects
mols = [Chem.MolFromSmiles(smiles) for smiles in smiles_list]
# Instantiate the Mold2 calculator
# This will automatically download the executables on first run.
mold2 = Mold2()
# Calculate descriptors
descriptors_df = mold2.calculate(mols)
print(descriptors_df.head())If you have already downloaded the Mold² ZIP file from the FDA website, you can install the executables directly from the file:
path_to_zipfile = 'path/to/your/Mold2-Executable-File.zip'
mold2 = Mold2.from_executable(path_to_zipfile)
# The executables are now installed for future use.
# You can now instantiate Mold2() normally.You can retrieve the description for any of the 777 descriptors by its index or get a dictionary of all of them.
# Get details for a single descriptor
print(mold2.descriptor_detail(15))
# Output: rotatable bond fraction
# Get details for all descriptors
all_descriptors = mold2.descriptor_details()
print(all_descriptors['D001'])
# Output: number of 6-membered aromatic rings (only carbon atoms)This project is licensed under the MIT License - see the LICENSE file for details.
def calculate(mols, show_banner=True, njobs=1, chunksize=100):Default method to calculate #### Parameters
- mols : Iterable[Chem.Mol]
RDKit molecule objects for which to obtain Mold2 descriptors. - show_banner : bool
Displays default notice about Mold2 descriptors. - njobs : int
Maximum number of simultaneous processes. - chunksize : int
Maximum number of molecules each process is charged of. - return_type : pd.DataFramce
Pandas DataFrame containing Mold2 molecular descriptors.Mold2 descriptors. If executables have not previously been downloaded, attempts to download and install them.
def descriptor_detail(index):Obtain detils about one descriptor.
- index : int
Index of the descriptor. - return_type : str
Description of the descriptor.
def descriptor_details():Obtain details about all descriptors.
- return_type : dict
Mapping of molecular descriptors with their details.
def from_executable(zipfile_path):Install executables and instantiate a Mold2 calculator.
- zipfile_path : str
Path to the user-downloaded ZIP file containing Mold2 executables.