SynCraft: Guiding Large Language Models to Predict Edit Sequences for Molecular Synthesizability Optimization
SynCraft is a reasoning-based framework that reframes synthesizability optimization not as a sequence translation task, but as a precise structural editing problem. Leveraging the emergent reasoning capabilities of Large Language Models (LLMs), SynCraft navigates the "synthesis cliff" where minimal structural modifications yield significant gains in synthetic feasibility.
By predicting executable sequences of atom-level edits rather than generating SMILES strings directly, SynCraft circumvents the syntactic fragility of LLMs while harnessing their chemical intuition.
- Generative Editing via In-Context Reasoning: Decouples strategic planning from chemical execution. The LLM acts as a chemical strategist, reasoning about synthetic liabilities before prescribing edits.
- Discrete Editing Action Space: Uses a precise JSON-based command set (DEL_ATOM, ADD_BOND, MUTATE_ATOM, etc.) to modify molecular graphs deterministically, ensuring validity.
- Interaction-Aware Optimization: Incorporates 3D protein-ligand interaction data (via AutoDock Vina and PLIP) into the prompting strategy to preserve critical pharmacophores during optimization.
- Synthesis Cliff Navigation: Focuses on minimal, high-impact edits to transform "unsynthesizable" molecules into accessible analogs without destroying the original scaffold.
SynCraft-Core/
├── assets/ # Data assets (input fixtures, not run outputs)
│ ├── reasoning.json # Golden examples with reasoning traces
│ ├── RIPK1.txt # Example molecule lists
│ └── unsolved.json # Input datasets
├── docs/ # Rebuttal response map + skill docs
│ └── REPRODUCING_EXPERIMENTS.md
├── notebooks/ # Jupyter notebooks for analysis
├── scripts/ # Runnable wrappers (shell + Python)
│ ├── dock_batch.py # Batch Vina docking with --save-pdb-dir
│ ├── inference_enhanced.sh
│ ├── inference_bioactivity_constrain.sh
│ ├── reproduce_table1.sh # Paper Table 1 (main result)
│ ├── reproduce_rag_ablation.sh # Random-RAG + LLM-as-judge
│ ├── reproduce_edit_path_audit.sh # Graph-matching audit (n=100 cliff pairs)
│ ├── reproduce_baseline_coverage.sh # LIGAN / AR / DiffSBDD coverage
│ └── reproduce_binding_mode_retention.sh # Per-source PLIP retention (FAST + FULL)
├── skill/ # Anthropic Agent Skill (PyPI-publishable)
│ ├── SKILL.md / README.md / pyproject.toml / setup.sh
│ └── syncraft/ examples/ resources/ tests/
├── src/ # Source code
│ ├── inference_enhanced.py # Gemini standard inference
│ ├── inference_bioactivity_constrain.py # Gemini interaction-aware
│ ├── inference_dsv4.py # DeepSeek-V4 standard (OPEN-WEIGHT)
│ ├── inference_dsv4_bioactivity.py # DeepSeek-V4 interaction-aware
│ ├── inference_edit_path.py # LLM-direct edit audit vs graph-matching ground truth
│ ├── inference_judge_rag_alignment.py # LLM-as-judge alignment scoring for RAG ablation
│ ├── plip_retention.py # Systematic binding-mode retention via PLIP
│ ├── utils.py # Core editing & reconstruction logic
│ ├── extract_interaction.py # PLIP interaction analysis
│ ├── docking_utils.py # Single-molecule Vina helpers
│ └── calc_metric_utils.py # Tanimoto / similarity utilities
└── vina/ # Pre-prepared receptor files (no Vina binary; pip install vina)
    ├── 7L12_* (SARS-CoV-2 Mpro)
    ├── 7YDX_* (RIPK1)
    ├── 2YAC_* (PLK1)
    └── 5mo4_* + 8F2B_*
For experiment-by-experiment reproduction instructions, see docs/REPRODUCING_EXPERIMENTS.md.
- Python 3.10+
- AutoDock Vina (for interaction-aware mode)
- PLIP (for interaction-aware mode)
- OpenBabel
We strongly recommend creating an isolated conda environment, because
openbabel and rdkit ship as compiled extensions that pip cannot build
cleanly on most Linux distros:
conda create -n syncraft python=3.12 -y
conda activate syncraft
conda install -c conda-forge -y rdkit openbabel
pip install litellm loguru meeko gemmi tqdm numpy openai httpx pandas scipy
Why these are mandatory even though meeko/gemmi/scipy aren't declared
dependencies of each other:
- gemmi is needed at runtime by recent meeko releases (PDB/CIF parsing), but meeko does not list it.
- scipy is needed at runtime by meeko.MoleculePreparation.prepare(), the ligand-prep step that every interaction-aware inference path goes through.
Standard-mode-only (no docking / no interaction-awareness): you can skip
openbabel, meeko, gemmi, and scipy.
Analysis notebooks (e.g. notebooks/calc_metric.ipynb)
additionally need syntheseus for parsing retrosynthesis route pickles:
pip install syntheseus
To run the Anthropic Agent Skill (skill/), additionally run
pip install biopython pdb2pqr plip vina
and follow skill/README.md.
SynCraft uses litellm to interface with LLMs (e.g., Gemini, DeepSeek). Set
only the API key for the backend you actually plan to use:
export GEMINI_API_KEY='your-gemini-api-key'
# or
export DEEPSEEK_API_KEY='your-deepseek-api-key'
To run the standard optimization pipeline, which focuses on restoring synthesizability using chemical reasoning:
cd scripts
bash inference_enhanced.sh
Under the hood (src/inference_enhanced.py):
- Loads unsynthesizable molecules.
- Retrieves similar "golden examples" (pairs of unsynthesizable $\to$ synthesizable molecules) for few-shot prompting.
- Prompts the LLM to reason about synthetic liabilities and generate a JSON edit sequence.
- Applies the edits deterministically to produce the result.
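The retrieval step in the list above can be pictured as a nearest-neighbor search over molecular fingerprints. The following is a minimal sketch under stated assumptions, not the code in src/inference_enhanced.py: character trigrams of the SMILES string stand in for real fingerprints (for example Morgan fingerprints), and the function names and the "unsynthesizable" / "synthesizable" record keys are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def toy_fingerprint(smiles: str, n: int = 3) -> Set[str]:
    """Stand-in fingerprint: the set of character n-grams of a SMILES string."""
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto similarity of two feature sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_few_shot(
    query_smiles: str,
    golden_examples: List[Dict[str, str]],
    k: int = 5,
) -> List[Tuple[float, Dict[str, str]]]:
    """Return the k golden examples whose unsynthesizable side is most similar to the query."""
    query_fp = toy_fingerprint(query_smiles)
    scored = [
        (tanimoto(query_fp, toy_fingerprint(ex["unsynthesizable"])), ex)
        for ex in golden_examples
    ]
    # Sort on the score only, so the example dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical golden examples, for illustration only.
examples = [
    {"unsynthesizable": "CCO", "synthesizable": "CC"},
    {"unsynthesizable": "c1ccccc1O", "synthesizable": "c1ccccc1"},
]
top = retrieve_few_shot("c1ccccc1N", examples, k=1)
```

Swapping toy_fingerprint for a chemistry-aware fingerprint leaves the ranking logic unchanged, which is why the similarity function is kept separate.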
Key Arguments:
- --dataset: The dataset key in the input JSON.
- --model: The LLM model to use (e.g., gemini/gemini-2.5-pro).
- --few-shot-k: Number of few-shot examples to use (default: 5).
- --pass-k: Number of parallel inference passes per molecule.
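As an illustration of how flags like these map onto a Python entry point, here is a hypothetical argparse sketch. Only the flag names and the few-shot default of 5 come from the text above; the defaults for --model and --pass-k, the "demo" dataset key, and the function name build_parser are assumptions, and the real script may declare its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(description="SynCraft-style inference (sketch)")
    parser.add_argument("--dataset", required=True,
                        help="Dataset key in the input JSON")
    parser.add_argument("--model", default="gemini/gemini-2.5-pro",
                        help="litellm model string (default is an assumption)")
    parser.add_argument("--few-shot-k", type=int, default=5,
                        help="Number of few-shot examples")
    parser.add_argument("--pass-k", type=int, default=1,
                        help="Parallel inference passes per molecule (default is an assumption)")
    return parser


# argparse turns dashes into underscores: --few-shot-k becomes args.few_shot_k.
args = build_parser().parse_args(["--dataset", "demo", "--pass-k", "4"])
```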
To optimize molecules while preserving binding interactions (requires Vina and receptor files):
cd scripts
bash inference_bioactivity_constrain.sh
Under the hood (src/inference_bioactivity_constrain.py):
- Docks the input molecule into the target receptor.
- Analyzes interactions (H-bonds, $\pi$-stacking, etc.) using PLIP.
- Injects these constraints into the LLM prompt (e.g., "Atom 5 forms a critical Hydrogen Bond...").
- The LLM generates edits that respect these biological constraints.
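The constraint-injection step above amounts to turning per-atom interaction records into plain sentences for the prompt. A minimal sketch, assuming each record carries an atom index, an interaction type, and a residue label; the record keys, the exact wording, the function name, and the sample records are all hypothetical and do not reflect the actual PLIP output format or the prompt used in src/inference_bioactivity_constrain.py.

```python
from typing import Any, Dict, List


def format_constraints(interactions: List[Dict[str, Any]]) -> str:
    """Render interaction records as one prompt line per ligand atom, ordered by atom index."""
    lines = []
    for item in sorted(interactions, key=lambda rec: rec["atom_idx"]):
        lines.append(
            f"Atom {item['atom_idx']} forms a critical {item['type']} "
            f"with {item['residue']}; do not delete or mutate it."
        )
    return "\n".join(lines)


# Made-up records, for illustration only.
records = [
    {"atom_idx": 12, "type": "pi-stacking", "residue": "HIS41"},
    {"atom_idx": 5, "type": "Hydrogen Bond", "residue": "GLU166"},
]
constraint_text = format_constraints(records)
```

Sorting by atom index keeps the prompt stable across runs, which makes repeated passes over the same molecule easier to compare.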
Configuration:
Ensure your receptor files (.pdbqt, .pdb, config.txt) are correctly placed in the vina/ directory and referenced in the script.
SynCraft defines a compact action space:
- DEL_ATOM: Removes a specific atom.
- MUTATE_ATOM: Changes the atomic element.
- ADD_ATOM: Introduces a new atom.
- ADD_BOND / DEL_BOND: Creates or removes bonds.
- CHANGE_BOND: Modifies bond order/aromaticity.
- SET_CHIRAL / SET_BOND_STEREO: Defines stereochemistry.
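To make the idea of deterministic edit execution concrete, here is a toy sketch that applies a JSON edit sequence to a small molecular graph. It is not the logic in src/utils.py: the ToyMol class, the apply_edits function, and the JSON field names ("op", "atom_idx", "element", "begin", "end", "order") are assumptions, the graph is a plain dictionary rather than a chemistry toolkit object, and the stereochemistry commands are left out.

```python
import json
from typing import Dict, FrozenSet, List


class ToyMol:
    """Toy molecular graph: atom index -> element, atom pair -> bond order."""

    def __init__(self, atoms: Dict[int, str], bonds: Dict[FrozenSet[int], float]):
        self.atoms = dict(atoms)
        self.bonds = dict(bonds)


def apply_edits(mol: ToyMol, edits: List[dict]) -> ToyMol:
    """Apply edit commands in order to a copy of `mol`; raise on any invalid edit."""
    out = ToyMol(mol.atoms, mol.bonds)
    for edit in edits:
        op = edit["op"]
        if op == "DEL_ATOM":
            idx = edit["atom_idx"]
            del out.atoms[idx]  # KeyError if the atom does not exist
            # Drop every bond that touched the deleted atom.
            out.bonds = {p: o for p, o in out.bonds.items() if idx not in p}
        elif op == "MUTATE_ATOM":
            if edit["atom_idx"] not in out.atoms:
                raise KeyError(edit["atom_idx"])
            out.atoms[edit["atom_idx"]] = edit["element"]
        elif op == "ADD_ATOM":
            # The new atom gets the next index after the current maximum.
            out.atoms[max(out.atoms, default=-1) + 1] = edit["element"]
        elif op in ("ADD_BOND", "CHANGE_BOND", "DEL_BOND"):
            pair = frozenset((edit["begin"], edit["end"]))
            if len(pair) != 2 or not all(i in out.atoms for i in pair):
                raise ValueError(f"invalid atom pair in {edit!r}")
            exists = pair in out.bonds
            if (op == "ADD_BOND") == exists:
                # ADD_BOND needs a missing bond; the other two need an existing one.
                raise ValueError(f"bond state does not allow {edit!r}")
            if op == "DEL_BOND":
                del out.bonds[pair]
            else:
                out.bonds[pair] = edit["order"]
        else:
            raise ValueError(f"unsupported op: {op}")
    return out


# C-C-O chain edited into C-C-N-C; the edit plan arrives as JSON text.
mol = ToyMol({0: "C", 1: "C", 2: "O"},
             {frozenset((0, 1)): 1.0, frozenset((1, 2)): 1.0})
plan = json.loads("""[
  {"op": "MUTATE_ATOM", "atom_idx": 2, "element": "N"},
  {"op": "ADD_ATOM", "element": "C"},
  {"op": "ADD_BOND", "begin": 2, "end": 3, "order": 1.0}
]""")
edited = apply_edits(mol, plan)
```

Because every command either succeeds or raises, a malformed plan fails loudly instead of producing an invalid structure, which is the property the JSON action space is meant to provide.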
- Retrieval: Finds similar "Synthesis Cliff" examples.
- Reasoning: The LLM analyzes the molecule and articulates a plan.
- Execution: The plan (JSON) is executed by the deterministic toolkit (src/utils.py).
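Between the reasoning and execution steps, the JSON plan has to be pulled out of the model's free-text reply and checked against the action space. A minimal sketch of one way to do that with the standard library, assuming the plan is a JSON array of objects with an "op" field; that schema and the function name extract_edit_plan are assumptions, not the format used by the actual prompts.

```python
import json
from typing import List

ALLOWED_OPS = {
    "DEL_ATOM", "MUTATE_ATOM", "ADD_ATOM", "ADD_BOND",
    "DEL_BOND", "CHANGE_BOND", "SET_CHIRAL", "SET_BOND_STEREO",
}


def extract_edit_plan(reply: str) -> List[dict]:
    """Return the first JSON array of objects found in `reply`, validated against ALLOWED_OPS."""
    decoder = json.JSONDecoder()
    plan = None
    for start, char in enumerate(reply):
        if char != "[":
            continue
        try:
            candidate, _ = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            continue  # e.g. a SMILES bracket atom such as [C@@H] in the reasoning text
        if isinstance(candidate, list) and candidate and all(
            isinstance(item, dict) for item in candidate
        ):
            plan = candidate
            break
    if plan is None:
        raise ValueError("no JSON edit plan found in reply")
    for edit in plan:
        if edit.get("op") not in ALLOWED_OPS:
            raise ValueError(f"unsupported edit: {edit!r}")
    return plan


# Hypothetical model reply: reasoning text followed by the edit plan.
reply = (
    "The hemiaminal ester at [C@@H] is a synthetic liability.\n"
    '[{"op": "DEL_ATOM", "atom_idx": 7}]'
)
plan = extract_edit_plan(reply)
```

Scanning from each opening bracket with raw_decode tolerates bracketed SMILES fragments in the reasoning text, which a single greedy regular expression would not.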
For end users who want to invoke SynCraft as an Anthropic Agent Skill from
Claude / Gemini / generic OpenAI-compatible agents, see skill/:
from syncraft import edit_for_synthesizability
results = edit_for_synthesizability(
    molecules=[  # SMILES from your gen model
        "NC(=O)[C@@H]1CCC(C(=O)N[C@H]2OC(=O)[C@@H](O)[C@@H](O)[C@H]2NNC2C=CC=CC=C2Cl)C1",
        # ...
    ],
    protein_pdb="7L12",  # SARS-CoV-2 Mpro
    rescue_llm="deepseek-v4-pro",  # MIT open-weight
    pass_k=5,
)
for r in results:
    print(r.original_smiles, "→", r.edited_smiles,
          f"Vina {r.original_dock_score:.1f}→{r.edited_dock_score:.1f}")
See skill/README.md and skill/SKILL.md.
See docs/REPRODUCING_EXPERIMENTS.md for
full instructions. Each experiment in the paper is covered by a
scripts/reproduce_*.sh wrapper:
| Experiment | Script |
|---|---|
| Main result (Table 1) | scripts/reproduce_table1.sh |
| RAG ablation | scripts/reproduce_rag_ablation.sh |
| Edit-path audit | scripts/reproduce_edit_path_audit.sh |
| Baseline coverage (LIGAN / AR / DiffSBDD) | scripts/reproduce_baseline_coverage.sh |
| Binding-mode retention | scripts/reproduce_binding_mode_retention.sh |