Boltzina

Boltzina is a pipeline that combines AutoDock Vina (or Uni-Dock2) docking with Boltz-2 structure prediction and affinity scoring for virtual screening.

Quick Start

Installation

# From PyPI
pip install boltzina

# From source (development)
uv sync

Tool setup (Vina, MAXIT, Boltz-2 model weights)

boltzina setup --all

For Uni-Dock2 (GPU-accelerated docking, requires pixi and CUDA 12):

# Clone Uni-Dock2 and build using the provided pixi.toml
git clone https://github.com/dptech-corp/Uni-Dock2 /path/to/Uni-Dock2
cp pixi.toml /path/to/Uni-Dock2/
cd /path/to/Uni-Dock2 && pixi install && pixi run build
boltzina setup --register-unidock2 /path/to/Uni-Dock2

Usage

With Boltz-2 structure prediction (sequence → dock → score)

Provide a protein sequence and a SMILES/SDF file. Boltzina will:

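1. Run Boltz-2 structure and affinity prediction on the complex with the reference ligand (by default, the first ligand in the input)
2. Determine the docking grid automatically from the predicted binding pose
3. Run docking with AutoDock Vina (or Uni-Dock2 when selected with --docking-engine unidock2)
4. Score all poses with Boltz-2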

# From a FASTA file (CDK2 example)
boltzina run sample/CDK2/ligands.smi \
  --sequence-file sample/CDK2/cdk2.fasta \
  --output-dir ./results

# From a sequence string directly
boltzina run sample/CDK2/ligands.smi \
  --sequence "MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL" \
  --output-dir ./results

# Multi-chain protein: colon-separated sequences
boltzina run sample/CDK2/ligands.smi \
  --sequence "MENFQKVEKIGEGTYGVVYK...:AKLSILPWGHC..." \
  --output-dir ./results

# Multi-chain protein: multi-entry FASTA (>chain1 / seq / >chain2 / seq)
boltzina run sample/CDK2/ligands.smi \
  --sequence-file complex.fasta \
  --output-dir ./results

# Use a specific reference ligand for prediction and grid center
boltzina run sample/CDK2/ligands.smi \
  --sequence-file sample/CDK2/cdk2.fasta \
  --reference-ligand "CC(C)[C@H](CO)Nc1nc(Nc2ccc(C(=O)O)c(Cl)c2)c2ncn(C(C)C)c2n1" \
  --output-dir ./results

# With the online MSA server and more diffusion samples (slower, may improve accuracy)
boltzina run sample/CDK2/ligands.smi \
  --sequence-file sample/CDK2/cdk2.fasta \
  --use-msa-server \
  --diffusion-samples 5 \
  --output-dir ./results
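
The colon-separated string and the multi-entry FASTA file shown above describe the same set of chains. The following is a minimal sketch of converting one into the other, for example to keep long sequences out of the shell history. The helper name `chains_to_fasta` and the header labels `chain1`, `chain2` are illustrative assumptions and not part of Boltzina.

```python
def chains_to_fasta(sequence_arg: str) -> str:
    """Turn 'SEQ1:SEQ2' into multi-entry FASTA text, one entry per chain."""
    # Split on ':' and drop empty pieces such as a trailing colon.
    chains = [s.strip().upper() for s in sequence_arg.split(":") if s.strip()]
    if not chains:
        raise ValueError("no sequence given")
    entries = []
    for index, seq in enumerate(chains, start=1):
        # Header labels are arbitrary here (assumption).
        entries.append(f">chain{index}\n{seq}")
    return "\n".join(entries) + "\n"


if __name__ == "__main__":
    # Short placeholder fragments, not real full-length chains.
    print(chains_to_fasta("MENFQKVEK:AKLSILPWGHC"), end="")
```

Writing the returned text to a file such as complex.fasta gives an input suitable for --sequence-file.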

With a Boltz-2 YAML input

For full control over multi-chain proteins, ligand definitions, and Boltz-2 settings, use a boltz-compatible YAML file (see sample/CDK2/1ckp_cdk2.yaml for an example):

boltzina run sample/CDK2/ligands.smi \
  --yaml sample/CDK2/1ckp_cdk2.yaml \
  --output-dir ./results

The YAML format:

version: 1
sequences:
  - protein:
      id: A
      sequence: MENFQKVEKIGEGTYGVVYK...  # CDK2 sequence
  - ligand:
      id: B
      smiles: 'CC(C)[C@H](CO)Nc1nc(Nc2ccc(C(=O)O)c(Cl)c2)c2ncn(C(C)C)c2n1'
properties:
  - affinity:
      binder: B

Multiple protein chains are supported: add one - protein: entry per chain. The properties.affinity.binder field names the reference ligand, whose predicted position determines the grid center.
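
The YAML layout above can also be generated from a script. This is a minimal sketch using only the standard library; the function name `build_boltz_yaml` and the chain-ID scheme (proteins get A, B, ... and the ligand takes the next letter) are assumptions made for illustration, not Boltzina behaviour.

```python
def build_boltz_yaml(protein_seqs, ligand_smiles):
    """Return Boltz-style YAML text for N protein chains plus one reference ligand."""
    if not protein_seqs or len(protein_seqs) > 25:
        raise ValueError("expected between 1 and 25 protein chains")
    # Assumed ID scheme: proteins A, B, ...; the ligand gets the next letter.
    chain_ids = [chr(ord("A") + i) for i in range(len(protein_seqs) + 1)]
    ligand_id = chain_ids[-1]

    lines = ["version: 1", "sequences:"]
    for chain_id, seq in zip(chain_ids, protein_seqs):
        lines += ["  - protein:", f"      id: {chain_id}", f"      sequence: {seq}"]
    # Single-quoted YAML scalar: a literal quote is escaped by doubling it.
    quoted = ligand_smiles.replace("'", "''")
    lines += ["  - ligand:", f"      id: {ligand_id}", f"      smiles: '{quoted}'"]
    # The affinity binder points at the ligand chain.
    lines += ["properties:", "  - affinity:", f"      binder: {ligand_id}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequence fragment and a simple SMILES (ethanol).
    print(build_boltz_yaml(["MENFQKVEKIGEGTYGVVYK"], "CCO"), end="")
```

The resulting text can be saved to a file and passed with --yaml.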

From precomputed Boltz-2 results

If you have already run boltz predict, pass the output directory directly:

boltzina run sample/CDK2/ligands.smi \
  --work-dir sample/CDK2/boltz_results_base \
  --output-dir ./results

The grid center is determined automatically from the Boltz-2 predicted ligand position. You can override it explicitly:

boltzina run sample/CDK2/ligands.smi \
  --work-dir sample/CDK2/boltz_results_base \
  --grid-center "7.0,-4.9,7.5" \
  --output-dir ./results
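
When supplying --grid-center by hand, one common choice is the centroid of a known ligand's atoms. The sketch below assumes that choice (this README does not state the exact rule Boltzina applies) and reads coordinates from standard fixed-column PDB records. The function names are hypothetical.

```python
def ligand_centroid(pdb_text, chain_id="B"):
    """Average the x, y, z coordinates of all atoms on one chain of a PDB file."""
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 54:
            continue
        if line[21] != chain_id:  # chain ID sits in column 22
            continue
        # Columns 31-38, 39-46 and 47-54 hold x, y and z.
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError(f"no atoms found for chain {chain_id!r}")
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def format_grid_center(center):
    """Format a center as the 'x,y,z' string used by --grid-center."""
    return ",".join(f"{v:.1f}" for v in center)


def hetatm(serial, name, chain, x, y, z):
    """Build one fixed-column HETATM record (demo data only)."""
    return f"HETATM{serial:5d} {name:<4s} LIG {chain}{1:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"


if __name__ == "__main__":
    demo = "\n".join([
        hetatm(1, "C1", "B", 6.0, -5.0, 7.0),
        hetatm(2, "C2", "B", 8.0, -4.8, 8.0),
    ])
    print(format_grid_center(ligand_centroid(demo)))
```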

CLI Reference

`boltzina run <INPUT> [OPTIONS]`

INPUT can be a .smi/.txt file (SMILES list), .sdf file, or a directory.
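
A quick pre-flight check of a SMILES list can catch empty or malformed lines before a long run. The sketch below assumes the common .smi convention of one molecule per line, with the SMILES first and an optional name after whitespace. This README does not spell out the exact layout Boltzina accepts, so treat the parsing rules and the fallback names as assumptions.

```python
def read_smiles_lines(text):
    """Parse 'SMILES [name]' lines into a list of (name, smiles) tuples."""
    ligands = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comment lines (assumed convention).
        if not line or line.startswith("#"):
            continue
        parts = line.split(maxsplit=1)
        smiles = parts[0]
        # Fall back to a line-number-based name when none is given.
        name = parts[1].strip() if len(parts) > 1 else f"ligand_{number}"
        ligands.append((name, smiles))
    return ligands


if __name__ == "__main__":
    demo = "CCO ethanol\n\nc1ccccc1\n"
    for name, smiles in read_smiles_lines(demo):
        print(name, smiles)
```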

Protein input (choose one; required):

Option	Description
`--sequence` / `-s`	Protein sequence (single chain, or `SEQ1:SEQ2` for multi-chain)
`--sequence-file`	FASTA file (one `>entry` per chain for multi-chain)
`--yaml`	Boltz-2 compatible YAML (protein + ligand + affinity)
`--work-dir`	Existing Boltz-2 output directory (docking + scoring only)

Structure prediction options (with --sequence / --sequence-file):

Option	Default	Description
`--reference-ligand`	first in INPUT	SMILES string or SDF file for Boltz-2 complex prediction and grid center

Docking:

Option	Default	Description
`--grid-center`	auto	Docking box center `x,y,z`
`--grid-size`	`20.0`	Docking box size (Å)
`--ligand-chain-id`	`B`	Ligand chain in Boltz-2 prediction (rescore mode)
`--docking-engine`	`vina`	`vina` or `unidock2`
`--num-workers`	`1`	Parallel Vina workers
`--skip-docking`	off	Score existing poses only
`--regenerate-conformer`	off	Force 3D conformer regeneration for SDF

Boltz-2 prediction:

Option	Default	Description
`--use-msa-server`	off	Use online MMseqs2 MSA server
`--recycling-steps`	`3`	Boltz-2 recycling steps
`--sampling-steps`	`200`	Boltz-2 sampling steps
`--diffusion-samples`	`1`	Boltz-2 diffusion samples
`--use-potentials`	off	Boltz-2 inference-time potentials
`--subsample-msa`	off	Subsample MSA sequences
`--no-kernels`	off	Disable trifast kernels (older GPUs)
`--affinity-mw-correction`	off	MW correction to affinity

Output:

Option	Default	Description
`--output-dir` / `-o`	`./boltzina_results`	Output directory
`--batch-size`	`1`	Boltz-2 scoring batch size
`--seed`	—	Random seed
`--vina-override`	off	Rerun Vina even if results exist
`--boltz-override`	off	Rerun Boltz-2 scoring even if results exist
`--keep-intermediate-files`	off	Keep intermediate docking files
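
For scripted screening campaigns, the options above can be assembled into a command line programmatically. This is a minimal sketch that uses only flags listed in this reference. The helper `build_run_command` is hypothetical and covers just two of the four protein input modes (--sequence-file and --work-dir).

```python
import shlex


def build_run_command(ligands, output_dir, sequence_file=None, work_dir=None,
                      grid_center=None, docking_engine="vina", num_workers=1):
    """Assemble a `boltzina run` argument list from keyword options."""
    # Exactly one protein input mode must be chosen.
    if (sequence_file is None) == (work_dir is None):
        raise ValueError("give exactly one of sequence_file or work_dir")
    cmd = ["boltzina", "run", ligands]
    if sequence_file is not None:
        cmd += ["--sequence-file", sequence_file]
    else:
        cmd += ["--work-dir", work_dir]
    if grid_center is not None:
        # --grid-center takes a single 'x,y,z' string.
        cmd += ["--grid-center", ",".join(str(v) for v in grid_center)]
    cmd += ["--docking-engine", docking_engine,
            "--num-workers", str(num_workers),
            "--output-dir", output_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_run_command("sample/CDK2/ligands.smi", "./results",
                            sequence_file="sample/CDK2/cdk2.fasta", num_workers=4)
    # Print a shell-safe form; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps arguments with spaces or special characters intact when the command is later handed to subprocess.run.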

`boltzina prepare <INPUT> [OPTIONS]`

Convert SMILES/SDF input to PDB format plus a prepared_mols.pkl file for use with the legacy run.py interface.

boltzina prepare ligands.smi --output-dir ./prepared
boltzina prepare ligands.sdf --output-dir ./prepared --regenerate-conformer

`boltzina grid <STRUCTURE_FILE> [OPTIONS]`

Compute the docking grid center from a ligand or complex file.

boltzina grid ligand.pdb --output vina_config.txt
boltzina grid complex.cif --chain B --output vina_config.txt
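
For orientation, an AutoDock Vina configuration file describes the search box with the keys center_x, center_y, center_z and size_x, size_y, size_z. The sketch below renders such a file from a center and a cubic box size. Whether `boltzina grid` writes exactly this layout is an assumption, and `vina_box_config` is a hypothetical helper.

```python
def vina_box_config(center, size=20.0):
    """Render the search-box part of a Vina config file as text."""
    cx, cy, cz = center
    lines = [
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        # Cubic box: the same edge length (in Å) on every axis.
        f"size_x = {size:.1f}",
        f"size_y = {size:.1f}",
        f"size_z = {size:.1f}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(vina_box_config((7.0, -4.9, 7.5)), end="")
```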

`boltzina setup [OPTIONS]`

Install and register external tools.

boltzina setup --all                          # Vina + MAXIT + Boltz-2 weights
boltzina setup --install-vina                 # Vina only
boltzina setup --install-maxit                # MAXIT only
boltzina setup --register-unidock2 /path/to/Uni-Dock2
boltzina setup --show                         # Show current config

Legacy usage (run.py)

The original run.py interface is fully supported:

python run.py sample/CDK2/config.json
python run.py sample/CDK2/config.json --use_kernels --num_workers 4

See sample/CDK2/config.json for the configuration file format.

Benchmark Dataset

The MF-PCBA benchmark dataset used in the paper is included in mf-pcba_test.zip. See the paper for details on the evaluation protocol.

Running Tests

# Unit tests (no GPU required)
uv run pytest tests/ --ignore=tests/test_integration.py -v

# Integration tests (requires GPU + Boltz-2 weights)
uv run pytest tests/test_integration.py -m gpu -v

Reference

Furui, K., & Ohue, M. Boltzina: Efficient and Accurate Virtual Screening via Docking-Guided Binding Prediction with Boltz-2. AI for Accelerated Materials Design - NeurIPS 2025. https://openreview.net/forum?id=OwtEQsd2hN
