
Part J: Add Protenix as universal 2nd refolding engine, remove AF2 #5

Merged
damborik22 merged 6 commits into master from refactor/af3-rfd3-ph on Apr 24, 2026

@damborik22
Owner

Summary

This PR implements Part J of the AF3 + RFD3 refactor: replaces AlphaFold2 with Protenix v0.5.0 (ByteDance's open-source AF3 reimplementation) as the universal 2nd refolding engine across all platforms. AF2 refolding code is completely removed from the Evaluator, while Protenix is integrated as a lightweight alternative that runs comfortably on 24 GB VRAM.

Key Changes

Evaluator — AF2 removal:

  • Deleted refold_af2.py, refold_Version6.py, and af2_runner.py (all AF2 refolding logic)
  • Removed binder-eval-af2 conda environment definition
  • Removed refold-af2 CLI subcommand and all AF2-specific argument parsing
  • Stripped AF2 columns from schema, merger, scoring, and visualization code
  • Updated report generation to reference only Boltz-2 (AF3 will be added in Part K)

Evaluator — Protenix integration:

  • Added refold_protenix.py (standalone batch refolder for Protenix v0.5.0)
    • Emits engine-neutral CSV with metrics: iptm, ptm, ranking_score, plddt_*, pae_*
    • Rescales pLDDT from Protenix native 0–100 to 0–1 for cross-engine comparison
    • Selects top-ranked sample per binder; saves PAE as .npy sidecar
  • Added protenix_runner.py (CLI wrapper for batch refolding)
  • Added refold-protenix CLI subcommand
  • Updated merger.py to prefix Protenix columns with protenix_ and join on sequence
  • Updated scoring.py to compute ipSAE from Protenix PAE files (uniform 10 Å cutoff across engines)
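
The per-binder selection and pLDDT rescaling described above can be sketched roughly as follows. This is a minimal illustration, not the code in refold_protenix.py: the function names and the `binder_id` key are hypothetical, and each sample is assumed to be a plain dict whose pLDDT value is on the 0–100 scale.

```python
from typing import Dict, List


def select_top_sample(samples: List[Dict[str, float]]) -> Dict[str, float]:
    """Return the sample with the highest ranking_score."""
    return max(samples, key=lambda s: s["ranking_score"])


def to_engine_neutral_row(binder_id: str, samples: List[Dict[str, float]]) -> Dict[str, object]:
    """Build one CSV row for a binder from its top-ranked sample.

    Assumption: plddt_binder_mean arrives on a 0-100 scale and is divided
    by 100 so it is comparable with engines that report 0-1.
    """
    best = select_top_sample(samples)
    return {
        "binder_id": binder_id,  # hypothetical column name
        "iptm": best["iptm"],
        "ptm": best["ptm"],
        "ranking_score": best["ranking_score"],
        "plddt_binder_mean": best["plddt_binder_mean"] / 100.0,
    }
```

Keeping the rescaling in the refolder, rather than in the merger, means every downstream consumer sees one scale regardless of engine.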

Installer & tool support:

  • Updated install.sh to add RFD3 (foundry) and Protein-Hunter as new tools
  • Marked RFAA as legacy (not shown in interactive menu; opt-in via --tool rfaa)
  • Added docs/rfaa_manual_reinstall.md with instructions for manual RFAA re-installation
  • Updated tool status checks and environment detection

New extractors:

  • Added rfd3.py extractor (defensive CSV/FASTA scanning for RFD3 / foundry outputs)
  • Added protein_hunter.py extractor (reads summary_high_iptm.csv or summary_all_runs.csv)

Visualization & reporting:

  • Updated color scheme: added rfd3 (deep-orange) and protein_hunter (teal-cyan)
  • Removed AF2 vs. Boltz-2 scatter plot; simplified PAE heatmap to Boltz-2 only
  • Updated methodology text to reflect Boltz-2 as primary, Protenix/AF3 as optional engines
  • Removed AF2-specific metric columns from plots and tables

Workflow updates:

  • evaluate.sh now requires only --target-seq (no --target-pdb)
  • Added --protenix-env flag to specify Protenix conda environment (default: bindmaster_pxdesign)
  • Added --skip-protenix flag to skip Protenix refolding
  • Updated run.py orchestrator to call refold-protenix instead of refold-af2

Implementation Details

  • Protenix environment: Reuses existing bindmaster_pxdesign conda env shipped by PXDesign installer (v0.5.0 pinned)
  • PAE handling: Protenix outputs token-pair PAE matrices in full_data.json; extracted and saved as .npy for ipSAE computation
  • **Metric

https://claude.ai/code/session_01KMBQ6cJe46ZNuDkNpkRrbE

damborik22 and others added 6 commits April 23, 2026 12:32
Describes the 6 parts of a large refactor:
  I — Remove AF2 refolding from Evaluator
  J — Protenix refolder (universal, via bindmaster_pxdesign env)
  K — AF3 v3.0.2 refolder (aarch64 / DGX Spark only)
  L — Protein Hunter with all 6 modalities (protein / cyclic / ligand / DNA / RNA)
  M — RFD3 (RosettaCommons/foundry) replaces RFAA, which is hard-deleted
  N — Distributed-workflow docs (design on x86, evaluate on Spark)

Key architectural choices captured:
  - AF3 is aarch64-only because the 80 GB VRAM target exceeds our 3090 (24 GB)
  - x86 keeps Boltz-2 + Protenix as two independent refolding engines
  - AF3 weights require a manual Google Form request (2–3 business day wait)
  - RFAA deleted (not deprecated); docs/rfaa_manual_reinstall.md retains
    commit SHAs + patch list for ad-hoc recreation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Evaluator AF2 refolding is removed. Parts J (Protenix, universal) and K (AF3,
aarch64/DGX Spark) will restore a multi-engine agreement_count.

Note: BindCraft's internal AF2 design, PXDesign's internal AF2 eval, and
Proteina-Complexa's AF2 cross-val all stay — only Evaluator AF2 refolding
is removed.

Deleted:
- Evaluator/scripts/refold_af2.py, refold_Version6.py
- Evaluator/binder_comparison/refolding/af2_runner.py
- Evaluator/binder_comparison/cli/refold_af2.py
- Evaluator/envs/binder-eval-af2.yml

Schema (Evaluator/binder_comparison/core/schema.py):
- dropped 8 af2_* fields from StandardisedMetrics, 2 from PerResidueData
- pruned af2_* entries from LOWER_IS_BETTER and ZSCORE_METRICS
- model_weights default: {"af2": 0.6, "boltz2": 0.4} → {"boltz2": 1.0}

Scoring (comparison/scoring.py):
- deleted add_af2_ipsae_from_files
- compute_agreement engine list now [boltz_pae_ipsae_min, protenix_ipsae_min,
  af3_ipsae_min]; Protenix/AF3 columns arrive in Parts J & K
- _best_ipsae_col + rank_by_adaptyv_method no longer consider af2_*

Merger (comparison/merger.py):
- rewritten to Boltz-2 only: merge_refold_results(boltz2_csv, sequences_fasta)
- _load_af2 + _AF2_DROP_COLS gone

CLI:
- binder-compare refold-af2 subcommand removed
- binder-compare run is now a 3-step pipeline (extract → refold-boltz2 → report)
- binder-compare report no longer accepts --af2-results / --af2-pae-dir

Visualization:
- plots.py: METRICS_DISPLAY af2_* entries pruned; plot_af2_vs_boltz2_scatter
  deleted; plot_pae_heatmaps / load_pae_data_from_df simplified to Boltz-2 only
- report.py: _compute_af2_boltz2_r + _correlation_callout_html deleted;
  all af2_* tooltips and display columns removed; methodology text updated

Installers:
- install/install.sh + install/install_aarch.sh: binder-eval-af2 env is no
  longer created; uninstall path still cleans legacy envs
- Evaluator/install.sh: single-env install (binder-eval only)
- Evaluator/evaluate.sh: 2-step pipeline (Boltz-2 + report)
- Evaluator/pyproject.toml: af2 optional deps group dropped

Configurator (configurator/configurator.py):
- evaluator env-detection now checks binder-eval, not the removed
  binder-eval-af2
- Prompts + status text updated to "Boltz-2 refolding + ranked report"

Smoke tests pass: all modules import, `binder-compare --help` lists 6 subcommands
(no refold-af2), ruff check + format green, shellcheck clean on updated scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a second refolding engine to the Evaluator without creating a new env —
Protenix v0.5.0 (ByteDance's open-source AlphaFold 3 re-implementation) rides
the existing bindmaster_pxdesign conda env that PXDesign already installs.

New CLI subcommand:
  conda run -n bindmaster_pxdesign binder-compare refold-protenix \
      --sequences seqs.fasta --target-seq SEQ -o protenix_results.csv
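
For orchestration from Python, the same invocation could be assembled as an argument list. This is a sketch only: `build_refold_protenix_cmd` is a hypothetical helper, and it uses just the flags shown in the command above.

```python
import shlex
from typing import List


def build_refold_protenix_cmd(
    sequences: str,
    target_seq: str,
    output_csv: str,
    env: str = "bindmaster_pxdesign",
) -> List[str]:
    """Assemble the conda-run command for the refold-protenix subcommand."""
    return [
        "conda", "run", "-n", env,
        "binder-compare", "refold-protenix",
        "--sequences", sequences,
        "--target-seq", target_seq,
        "-o", output_csv,
    ]


if __name__ == "__main__":
    cmd = build_refold_protenix_cmd("seqs.fasta", "SEQ", "protenix_results.csv")
    # Print a shell-safe rendering; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting issues with long target sequences.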

Schema additions in core/schema.py:
  - StandardisedMetrics gains protenix_* fields (iptm, ptm, ranking_score,
    plddt_binder_mean/min, plddt_target_mean, pae_bt/tb/bb, bt_ipsae,
    tb_ipsae, ipsae_min). pLDDT rescaled 0-100 → 0-1 on ingest.
  - af3_* counterparts reserved for Part K (aarch64 / DGX Spark only).
  - PerResidueData gains protenix_pae, af3_pae.
  - LOWER_IS_BETTER + ZSCORE_METRICS extended for both engines.

Merger (comparison/merger.py): multi-engine, outer-joins on sequence.
  merge_refold_results(boltz2_csv, ..., protenix_csv=..., af3_csv=...)
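
A rough, dependency-free sketch of the outer join on sequence follows. It is not the merger.py implementation: the helper names are hypothetical, rows are assumed to be dicts carrying a `sequence` key, and a binder missing from one engine gets `None` in that engine's columns.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def prefix_columns(row: Row, prefix: str, key: str = "sequence") -> Row:
    """Prefix every column except the join key (e.g. iptm -> protenix_iptm)."""
    return {
        (k if k == key or k.startswith(prefix) else prefix + k): v
        for k, v in row.items()
    }


def outer_join_on_sequence(*tables: List[Row]) -> List[Row]:
    """Outer-join any number of per-engine tables on the 'sequence' column."""
    merged: Dict[str, Row] = {}
    columns: List[str] = ["sequence"]
    for table in tables:
        for row in table:
            merged.setdefault(str(row["sequence"]), {}).update(row)
            for col in row:
                if col not in columns:
                    columns.append(col)
    # Fill columns an engine did not provide with None.
    return [{col: rec.get(col) for col in columns} for rec in merged.values()]
```

Joining on sequence rather than on a design ID lets results from independently run engines line up even when file names differ.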

Scoring (comparison/scoring.py): new generic
  add_ipsae_from_pae_files(df, prefix=...)
that computes DunbrackLab d0res ipSAE from any engine's saved PAE .npy.
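
The d0res ipSAE calculation can be sketched in pure Python as below. This follows the definition as understood here (per-residue d0 from the number of partner residues under the PAE cutoff, mean TM-like term per aligned residue, maximum over aligned residues) and should be checked against the DunbrackLab reference script before use. The function names, the list-of-lists PAE layout, and the mapping of `bt`/`tb` to binder-aligned/target-aligned directions are assumptions.

```python
from typing import Dict, List, Sequence


def calc_d0(n_residues: int) -> float:
    """TM-score style d0, with the length floored at 27 and d0 floored at 1.0."""
    n = max(float(n_residues), 27.0)
    return max(1.0, 1.24 * (n - 15.0) ** (1.0 / 3.0) - 1.8)


def directional_ipsae(
    pae: Sequence[Sequence[float]],
    aligned: Sequence[int],
    scored: Sequence[int],
    cutoff: float = 10.0,
) -> float:
    """ipSAE (d0res) for one direction: max over aligned residues."""
    best = 0.0
    for i in aligned:
        vals = [pae[i][j] for j in scored if pae[i][j] < cutoff]
        if not vals:
            continue  # no confident inter-chain pairs for this residue
        d0 = calc_d0(len(vals))
        score = sum(1.0 / (1.0 + (v / d0) ** 2) for v in vals) / len(vals)
        best = max(best, score)
    return best


def ipsae_summary(
    pae: Sequence[Sequence[float]],
    target_idx: List[int],
    binder_idx: List[int],
    cutoff: float = 10.0,
) -> Dict[str, float]:
    """Both directions plus their minimum, mirroring the *_ipsae columns."""
    bt = directional_ipsae(pae, binder_idx, target_idx, cutoff)
    tb = directional_ipsae(pae, target_idx, binder_idx, cutoff)
    return {"bt_ipsae": bt, "tb_ipsae": tb, "ipsae_min": min(bt, tb)}
```

Taking the minimum of the two directions is the conservative choice: a design only scores well if both chains agree on the interface.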

compute_agreement now counts how many of
  {boltz_pae_ipsae_min, protenix_ipsae_min, af3_ipsae_min}
pass the 0.61 threshold: the count ranges 0–2 on x86 and 0–3 on Spark once AF3 is wired in Part K.
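
As an illustration, the agreement count could look like the sketch below. It assumes "passing" means greater than or equal to 0.61 and that a missing engine is stored as `None`; neither detail is confirmed by the commit message, and `agreement_count` here is a stand-in name rather than the actual compute_agreement signature.

```python
from typing import Mapping, Optional

ENGINE_COLUMNS = ("boltz_pae_ipsae_min", "protenix_ipsae_min", "af3_ipsae_min")
IPSAE_THRESHOLD = 0.61


def agreement_count(
    row: Mapping[str, Optional[float]],
    threshold: float = IPSAE_THRESHOLD,
) -> int:
    """Count engines whose ipSAE_min meets the threshold.

    Engines that did not run (column absent or None) contribute 0,
    so the maximum is 2 without AF3 and 3 with it.
    """
    count = 0
    for col in ENGINE_COLUMNS:
        value = row.get(col)
        if value is not None and value >= threshold:  # assumption: inclusive
            count += 1
    return count
```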

Orchestration:
  - Evaluator/evaluate.sh auto-detects the bindmaster_pxdesign env and runs
    Protenix as step 2 of 3 unless --skip-protenix.
  - binder-compare run --protenix-env bindmaster_pxdesign enables Protenix.
  - binder-compare report gains --protenix-results and --af3-results flags.
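
The env auto-detection could be approximated by parsing the output of `conda env list --json`, which lists environment paths under an `envs` key. The sketch below takes that JSON as a string so it runs without conda; the helper names and step labels are hypothetical, and evaluate.sh itself is a shell script that may detect the env differently.

```python
import json
from pathlib import PurePosixPath
from typing import List


def conda_env_exists(env_list_json: str, name: str) -> bool:
    """True if an env directory with the given name appears in the JSON listing."""
    envs = json.loads(env_list_json).get("envs", [])
    return any(PurePosixPath(path).name == name for path in envs)


def plan_steps(protenix_env_found: bool, skip_protenix: bool) -> List[str]:
    """Boltz-2 and report always run; Protenix is inserted as step 2 when enabled."""
    steps = ["refold-boltz2"]
    if protenix_env_found and not skip_protenix:
        steps.append("refold-protenix")
    steps.append("report")
    return steps
```

In practice the JSON string would come from something like `subprocess.run(["conda", "env", "list", "--json"], capture_output=True, text=True).stdout`.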

Installer (install/install.sh + install/install_aarch.sh): the PXDesign step
now pip-installs binder-compare[report] into bindmaster_pxdesign so the
refolder is callable from there immediately after `bindmaster install --tool
pxdesign`.

Runtime details:
  - Protenix weights (~3-4 GB) auto-download from ByteDance TOS on first use.
  - need_atom_confidence=True forced at call time so the token-pair PAE is
    written to the full_data JSON (required for DunbrackLab ipSAE).
  - use_msa=False by default — MSA-free inference, no internet needed after
    the initial weight pull.
  - chain_plddt[0] = target, chain_plddt[1] = binder (already 0-1 scale).
  - Binder atom pLDDT min derived from atom_plddt + atom_to_token_idx +
    token_asym_id.
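
The binder atom pLDDT minimum can be derived from the three arrays roughly as sketched here. The binder asym ID of 1 follows the chain order given above (index 0 = target, index 1 = binder); the division by 100 assumes atom_plddt is on a 0–100 scale, which should be verified against actual Protenix output. The function name is hypothetical.

```python
from typing import Sequence


def binder_atom_plddt_min(
    atom_plddt: Sequence[float],
    atom_to_token_idx: Sequence[int],
    token_asym_id: Sequence[int],
    binder_asym_id: int = 1,   # assumption: chain 1 is the binder
    divisor: float = 100.0,    # assumption: atom_plddt is 0-100
) -> float:
    """Minimum per-atom pLDDT over binder atoms, rescaled to 0-1."""
    binder_values = [
        plddt
        for plddt, token in zip(atom_plddt, atom_to_token_idx)
        if token_asym_id[token] == binder_asym_id
    ]
    if not binder_values:
        raise ValueError("no atoms found for the binder chain")
    return min(binder_values) / divisor
```

The atom-to-token indirection is needed because chain membership is recorded per token, while confidence is recorded per atom.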

Live smoke test (2 × 43aa random binders vs 76aa ubiquitin):
  - Weight download + CUDA init: ~6 min first run; ~10 s warm.
  - Inference: ~12 s/design on RTX 3090 at n_cycle=3 n_step=50 n_sample=1.
  - CSV + *_pae.npy + CIF all populated; PAE shape (119, 119) matches
    target+binder tokens.
  - Report pipeline (merger → add_ipsae_from_pae_files → agreement_count →
    rank_by_adaptyv_method) green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

New design tool: Protein-Hunter (Cho et al. 2025, bioRxiv 10.1101/2025.10.10.681530)
— Boltz-2 / Chai-1 multi-cycle structure hallucination for protein, cyclic
peptide, small-molecule (CCD/SMILES), DNA, and RNA binders (all 6 modalities
supported natively by upstream design.py).

Installer (install/install.sh only — aarch64 deferred):
  - PROTEIN_HUNTER_{REPO,COMMIT,DIR} constants
  - DO_PROTEIN_HUNTER flag wired into --tool parsing, interactive menu
    (8-tool selector), `all` meta-tool, uninstall, and run order
  - install_protein_hunter(): clone @ pinned commit d4bd9515..., conda env
    bindmaster_protein_hunter (Py 3.10), PyTorch 2.2+ CUDA 12.1, vendored
    boltz_ph pip-installed, pyrosetta-installer, chai-lab (sokrypton fork),
    LigandMPNN weights symlinked from RFAA install when present
  - _write_protein_hunter_shortcut(): bin/protein-hunter opens env shell and
    prints the 6-modality flag cheat sheet
  - Uninstall case removes env + cloned dir + shortcut

aarch64: Protein-Hunter is x86-only in this release — pyrosetta has no
aarch64 wheel, and the chai-lab fork is untested on ARM. install_aarch.sh
is deliberately untouched.

Evaluator integration:
  - extractors/protein_hunter.py: ProteinHunterExtractor reads
    summary_high_iptm.csv by default (high-ipTM + %X filter, analogous to
    Mosaic is_top=1). Pass all_runs=True → reads summary_all_runs.csv and
    extracts best_seq per run.
  - Exported from extractors/__init__.py
  - cli/extract.py: new --protein-hunter DIR and
    --all-protein-hunter-designs flags
  - core/schema.py: SourceTool literal gains "protein_hunter" (and the
    previously-missing "proteina_complexa")
  - visualization/plots.py: TOOL_COLOURS + _TOOL_DISPLAY entries
  - visualization/report.py: _TOOL_COLOURS_NGL + _TOOL_DISPLAY + CSS class
    .tool-protein_hunter (#00838F teal-cyan)
  - cli/report.py: PyMOL color + display name entries
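
The file-selection behavior of the extractor could be sketched as follows. This is not the ProteinHunterExtractor code: the function name is hypothetical, and using `best_seq` as the sequence column for both summary files is an assumption (the commit message only names it for summary_all_runs.csv).

```python
import csv
from pathlib import Path
from typing import List


def read_protein_hunter_sequences(
    run_dir: str,
    all_runs: bool = False,
    seq_column: str = "best_seq",  # assumption for summary_high_iptm.csv
) -> List[str]:
    """Read binder sequences from a Protein-Hunter output directory.

    Defaults to the filtered summary; all_runs=True switches to the
    unfiltered per-run summary.
    """
    name = "summary_all_runs.csv" if all_runs else "summary_high_iptm.csv"
    path = Path(run_dir) / name
    with path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        sequences = [
            (row.get(seq_column) or "").strip()
            for row in reader
        ]
    # Drop empty cells and de-duplicate while preserving order.
    return list(dict.fromkeys(seq for seq in sequences if seq))
```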

Deferred to follow-up commits:
  - Configurator page + modality-specific run-script templates
  - aarch64 installer support (pyrosetta blocker)
  - Live install smoke test (~30 min + several GB of weights)

Lint clean (ruff + shellcheck); all binder_comparison imports green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

RFAA is deprecated in favor of RFD3 (RosettaCommons/foundry v0.1.9) but
remains installable to reproduce existing runs. Decision: soft-delete via
menu removal + deprecation banner rather than ripping out 22 files of tests
/ tui / bindmaster/tools/rfaa code that would also need updating.

Installer (install/install.sh):
  - FOUNDRY_{REPO,COMMIT,DIR,WEIGHTS_DIR} constants (v0.1.9, weights under
    BindMaster/weights/foundry)
  - DO_RFD3 flag, --tool rfd3|foundry parsing, uninstall case
  - install_rfd3(): conda env bindmaster_rfd3 (Py 3.12), PyTorch 2.2+ CUDA
    12.1 wheels, rc-foundry[rfd3,mpnn] from PyPI, foundry install rfd3
    (weights), smoke test on `rfd3 --help`
  - _write_rfd3_shortcut(): bin/rfd3 runs `rfd3 design ...` passthrough or
    opens an env shell with FOUNDRY_CHECKPOINT_DIR exported
  - is_rfd3_installed() status check
  - install_rfaa() now prints a deprecation banner pointing at
    docs/rfaa_manual_reinstall.md
  - --tool all no longer includes RFAA (dropped from meta-tool + interactive
    menu; opt in via --tool rfaa on the CLI)
  - RFD3 replaces RFAA slot #5 in the 8-tool interactive menu
  - Tool summary lists RFAA as (legacy) in yellow

aarch64 installer untouched in this commit (install_aarch.sh) — RFD3 will
wire in during Part K (DGX Spark) work; main benefit of RFD3 there is that
it has no DGL, unblocking the Grace-Hopper path that RFAA could never reach.

Evaluator:
  - extractors/rfd3.py — defensive CSV/FASTA parser (foundry output schema
    isn't 100% locked in v0.1.9; refine during Part N1 end-to-end test)
  - Registered in extractors/__init__.py + cli/extract.py (--rfd3 DIR flag)
  - core/schema.py SourceTool literal gains "rfd3"
  - Tool colors/displays added in visualization/plots.py, report.py
    (_TOOL_COLOURS_NGL, _TOOL_DISPLAY, CSS class .tool-rfd3), cli/report.py
    (PyMOL colors)
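
One way to read "defensive CSV/FASTA scanning" is sketched below: walk the output directory, take sequences from any FASTA file, fall back to a sequence-like column in any CSV, and keep only strings made of the 20 standard amino acids. The candidate column names and file suffixes are guesses, not the foundry output schema, and the function names are hypothetical.

```python
import csv
from pathlib import Path
from typing import List, Tuple

SEQ_COLUMNS = ("sequence", "seq", "binder_sequence")  # hypothetical names
FASTA_SUFFIXES = {".fasta", ".fa"}
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def scan_rfd3_outputs(out_dir: str) -> List[str]:
    """Collect unique, valid protein sequences from FASTA and CSV files."""
    found: List[str] = []
    for path in sorted(Path(out_dir).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in FASTA_SUFFIXES:
            found += [seq for _, seq in parse_fasta(path.read_text())]
        elif suffix == ".csv":
            with path.open(newline="") as handle:
                reader = csv.DictReader(handle)
                fields = reader.fieldnames or []
                col = next((c for c in fields if c.lower() in SEQ_COLUMNS), None)
                if col is not None:
                    found += [(row.get(col) or "") for row in reader]
    cleaned = [seq.strip().upper() for seq in found]
    valid = [seq for seq in cleaned if seq and set(seq) <= AMINO_ACIDS]
    return list(dict.fromkeys(valid))  # de-duplicate, keep first-seen order
```

Silently skipping unrecognized files and invalid strings keeps the extractor usable while the upstream schema is still moving.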

Docs:
  - docs/rfaa_manual_reinstall.md — commit SHAs (f913a19 RFAA, 26ec57a
    LigandMPNN), post-install patches, manual install recipe, notable open
    upstream PRs (#21 TRP fix, #26 dir-portability, #37 ContigMap), migration
    notes to RFD3 including what's compatible / what isn't (config schema,
    contig syntax, AtomWorks-handled post-processing).

Lint clean (ruff + shellcheck); all binder_comparison imports green.
Live smoke test for RFD3 install deferred — will happen alongside Part K's
DGX Spark deployment to cover both aarch64 and x86 simultaneously.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Single line-wrapping nit left over from the AF2-removal edits in Part I.
Re-running ruff format collapses the evaluator detection expression back
to a single line now that the condition is short enough.

No logic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
damborik22 merged commit 34ab18a into master on Apr 24, 2026
3 checks passed
damborik22 deleted the refactor/af3-rfd3-ph branch on April 24, 2026 10:01