diff --git a/CHANGELOG.md b/CHANGELOG.md
index 945f8d7..747f507 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -16,6 +16,38 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 ### Changed
 - **BindCraft pin** `828fd9f` → `7cd4ace` (3 upstream bugfixes): graylab→west.rosettacommons.org PyRosetta wheels (x86_64), `range(11,15)→(11,16)` model-selection fix, stage-3 `onehot_plddt` init + `align_pdbs` crash guard
 
+### Added (Parts L + M — Protein-Hunter & RFD3, on `refactor/af3-rfd3-ph`)
+- **Part L — Protein-Hunter** (Cho et al. 2025) installable via `bindmaster install --tool protein-hunter` (x86 only; aarch64 blocked by pyrosetta). Conda env `bindmaster_protein_hunter` (Py 3.10), vendored Boltz-2 + LigandMPNN + Chai-1 (sokrypton fork), shortcut `bin/protein-hunter`. New Evaluator extractor reads `summary_high_iptm.csv` by default (`--all-protein-hunter-designs` for all runs). Supports all 6 modalities via upstream `design.py` flags (protein / cyclic / ligand-CCD / ligand-SMILES / DNA / RNA). `SourceTool` Literal + tool colors/displays extended.
+- **Part M — RFD3 (RosettaCommons/foundry v0.1.9)** installable via `bindmaster install --tool rfd3`. Conda env `bindmaster_rfd3` (Py 3.12), `rc-foundry[rfd3,mpnn]` from PyPI, weights at `BindMaster/weights/foundry/`. BSD-3-Clause, commercial-use OK, works on aarch64 (no DGL). Shortcut `bin/rfd3` runs `rfd3 design ...` or opens an env shell. New `RFD3Extractor` with defensive CSV/FASTA parsing. Tool colors/displays added.
+- **RFAA deprecated (not deleted)**. Dropped from interactive menu and from the `--tool all` meta-tool. Still installable via `bindmaster install --tool rfaa` for reproducing existing runs. `install_rfaa()` now prints a deprecation banner pointing at RFD3 and `docs/rfaa_manual_reinstall.md`.
+- **New doc** `docs/rfaa_manual_reinstall.md` captures commit SHAs, post-install patches, and manual-reproducibility steps for long-term RFAA maintenance.
+
+### Added (Part J — Protenix refolder, on `refactor/af3-rfd3-ph`)
+- **Protenix v0.5.0 as universal 2nd refolding engine** — ByteDance's open-source AlphaFold 3 reimplementation (~3-4 GB weights auto-downloaded from ByteDance TOS, runs comfortably on 24 GB GPUs).
+- New CLI: `binder-compare refold-protenix` — runs inside the existing `bindmaster_pxdesign` conda env (no new env needed).
+- New files: `Evaluator/scripts/refold_protenix.py`, `Evaluator/binder_comparison/refolding/protenix_runner.py`, `Evaluator/binder_comparison/cli/refold_protenix.py`.
+- Schema: `protenix_*` columns in `StandardisedMetrics` (iptm, ptm, ranking_score, plddt_binder_mean/min, plddt_target_mean, pae_bt/tb/bb, bt_ipsae, tb_ipsae, ipsae_min). `af3_*` counterparts also reserved for Part K. pLDDT rescaled 0-100 → 0-1 on ingest.
+- Scoring: new generic `add_ipsae_from_pae_files(df, prefix=...)` for any engine's saved PAE matrix.
+- Merger: multi-engine support — `merge_refold_results(boltz2_csv, ..., protenix_csv=..., af3_csv=...)`. Accepts any combination; outer-joins on `sequence`.
+- `compute_agreement` now counts how many of {boltz_pae_ipsae_min, protenix_ipsae_min, af3_ipsae_min} pass the 0.61 threshold (0–3 on Spark, 0–2 on x86).
+- Orchestration:
+  - `Evaluator/evaluate.sh` auto-detects `bindmaster_pxdesign`; the Protenix step runs between Boltz-2 and report unless `--skip-protenix` is passed or the env is missing.
+  - `binder-compare run --protenix-env bindmaster_pxdesign` enables Protenix; omit to skip.
+  - `binder-compare report` gains `--protenix-results` and `--af3-results`.
+- Installer: PXDesign step now pip-installs `binder-compare` into `bindmaster_pxdesign` env so Protenix refolding is available after `bindmaster install --tool pxdesign`.
+- **Live smoke test passed** — 2 × 43aa random binders against 76aa ubiquitin target: inference ~12 s/design on RTX 3090, CSV + `*_pae.npy` populated, token-pair PAE extracted via `need_atom_confidence=True`, DunbrackLab ipSAE computed downstream in the report.
+
+### Removed (Part I — AF2 refolding removal, on `refactor/af3-rfd3-ph`)
+- Evaluator AF2 refolding is gone. This is step 1 of the AF3/Protenix refactor; AF3 (aarch64-only, DGX Spark) and Protenix (universal) will provide the second engine in Parts J & K.
+- Deleted files: `Evaluator/scripts/refold_af2.py`, `Evaluator/scripts/refold_Version6.py`, `Evaluator/binder_comparison/refolding/af2_runner.py`, `Evaluator/binder_comparison/cli/refold_af2.py`, `Evaluator/envs/binder-eval-af2.yml`
+- Installer no longer creates `binder-eval-af2` conda env (uninstall path still cleans legacy installs)
+- Schema: removed 8 `af2_*` fields from `StandardisedMetrics`, 2 from `PerResidueData`; pruned `af2_*` entries from `LOWER_IS_BETTER`, `ZSCORE_METRICS`; `model_weights` default now `{"boltz2": 1.0}`
+- Scoring: deleted `add_af2_ipsae_from_files`; `compute_agreement` engine list now `[boltz_pae_ipsae_min, protenix_ipsae_min, af3_ipsae_min]` (Protenix/AF3 columns land in Parts J & K)
+- Merger: `merge_refold_results(boltz2_csv, sequences_fasta)` (dropped `af2_csv` param)
+- Report & plots: removed `_compute_af2_boltz2_r`, `_correlation_callout_html`, `plot_af2_vs_boltz2_scatter`; pruned all `af2_*` columns from display lists and tooltip map
+- Evaluator orchestration: `evaluate.sh` is now 2-step (Boltz-2 + report); `binder-compare run` is 3-step (extract + refold-boltz2 + report)
+- BindCraft's internal AF2 design path, PXDesign's internal AF2 eval, and Proteina-Complexa's AF2 cross-val **all stay** — only Evaluator AF2 refolding was removed
+
 ### Fixed
 - Configurator `ask_choice()` return value destructuring for PXDesign mode and preset selection
 - RFAA template: Python 3.12 f-string syntax replaced with 3.10-compatible `ligand_line` variable
diff --git a/CLAUDE.md b/CLAUDE.md
index edb18d6..690658a 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -26,8 +26,9 @@ Target structure (.pdb / .mmcif)
     → Evaluator:
        1. Extract sequences from all tool outputs
        2. Refold with Boltz-2 (Mosaic venv)
-       3. Refold with AF2 (ColabDesign)
-       4. Rank, score, and generate HTML report
+       3. (x86) Refold with Protenix (bindmaster_pxdesign env)  [Part J]
+       4. (aarch64 / DGX Spark) Refold with AlphaFold 3 v3.0.2 (binder-eval-af3)  [Part K, in progress]
+       5. Rank, score, and generate HTML report
 ```
 
 ### Directory layout
@@ -46,9 +47,9 @@ BindMaster/
 │   └── evaluator.py           ← lightweight evaluator (Mosaic venv, ~780 lines)
 ├── Evaluator/                 ← bundled full evaluation pipeline package
 │   ├── binder_comparison/     ← core Python package (extractors, refolding, scoring, viz)
-│   ├── scripts/               ← standalone refold scripts (refold_boltz2.py, refold_af2.py)
+│   ├── scripts/               ← standalone refold scripts (refold_boltz2.py, refold_protenix.py, refold_af3.py [todo])
 │   ├── evaluate.sh            ← shell orchestrator for full 4-step pipeline
-│   ├── envs/                  ← conda env specs (binder-eval.yml, binder-eval-af2.yml)
+│   ├── envs/                  ← conda env specs (binder-eval.yml, binder-eval-af3.yml [aarch64 only, todo])
 │   ├── docs/                  ← pipeline_reference.md (metrics, known issues)
 │   └── pyproject.toml         ← package: "binder-comparison" v0.1.0
 ├── bindmaster_examples/
@@ -85,7 +86,6 @@ Each tool runs in its own isolated environment. **Never mix packages across envi
 | `bindmaster_pxdesign` | PXDesign | 3.11 | conda | Protenix binder design + eval |
 | `Proteina-Complexa/.venv` | Proteina-Complexa | 3.12 | uv | Flow matching + test-time compute binder design |
 | `binder-eval` | Evaluator | 3.10 | conda | Sequence extraction + reporting |
-| `binder-eval-af2` | Evaluator | 3.10 | conda | AF2 refolding via ColabDesign |
 
 The `bindmaster.py` CLI dispatcher uses `os.execv()` to launch sub-commands in their correct environment — `install` runs in bash, `configure` runs in system Python, `evaluate` runs in the Mosaic `.venv` Python.
 
@@ -110,7 +110,7 @@ In **standalone mode** (`--standalone` or auto-detected), all conda environments
 - **stdlib-only CLI:** `bindmaster.py` uses only stdlib so it works on any Python 3.10+ without pip installs.
 - **uv for Mosaic:** Mosaic uses `uv` instead of conda because it needs JAX with CUDA, and uv resolves this faster and more reliably.
 - **Pinned commits:** Tool repos are cloned at pinned commits (`BINDCRAFT_COMMIT`, `BOLTZGEN_COMMIT`, `MOSAIC_COMMIT`) for reproducible installs.
-- **Separate evaluator envs:** Boltz-2 refolding needs JAX (Mosaic venv), AF2 refolding needs ColabDesign (conda). These conflict, so they run in separate environments orchestrated by `evaluate.sh`.
+- **Separate evaluator envs:** Boltz-2 refolding runs in the Mosaic venv (JAX). The new Protenix refolder (Part J) rides the existing `bindmaster_pxdesign` conda env. AF3 (Part K) on DGX Spark gets its own `binder-eval-af3` env. `evaluate.sh` orchestrates all three.
 
 ---
 
@@ -147,7 +147,7 @@ In **standalone mode** (`--standalone` or auto-detected), all conda environments
 - Python classes: PascalCase
 - Python variables/functions: snake_case
 - Bash constants: UPPER_CASE
-- Conda envs: BindCraft, BoltzGen, binder-eval, binder-eval-af2
+- Conda envs: BindCraft, BoltzGen, binder-eval, bindmaster_pxdesign, bindmaster_rfaa (legacy — being replaced by bindmaster_rfd3)
 
 ### Git and branching
 
@@ -190,7 +190,7 @@ In **standalone mode** (`--standalone` or auto-detected), all conda environments
 
 ### Evaluation metrics and ranking
 
-**Primary metric: `ipsae_min`** — the minimum of binder→target and target→binder iPSAE scores. Computed from PAE arrays using the DunbrackLab 2025 formula: `max_i[mean_j(1/(1+(PAE_ij/d0)²))]` (d0_res variant, uniform 10 Å PAE cutoff for both Boltz-2 and AF2). Ranking uses agreement_count (how many engines agree ipsae_min > 0.61) as primary sort, then ipsae_min desc.
+**Primary metric: `ipsae_min`** — the minimum of binder→target and target→binder iPSAE scores. Computed from PAE arrays using the DunbrackLab 2025 formula: `max_i[mean_j(1/(1+(PAE_ij/d0)²))]` (d0_res variant, uniform 10 Å PAE cutoff across all engines). Ranking uses agreement_count (how many engines agree ipsae_min > 0.61) as primary sort, then ipsae_min desc.
 
 **Direction guide:**
 - **Higher is better:** `iptm`, `bt_ipsae`, `tb_ipsae`, `ipsae_min`, `plddt_binder_mean`, `binder_ptm`
@@ -208,13 +208,13 @@ In **standalone mode** (`--standalone` or auto-detected), all conda environments
 ### Critical domain facts
 
 - **iptm is gameable** — AF2-designed sequences (BindCraft) tend to score high on ipTM by construction. Use `ipsae_min` as the primary ranking metric instead.
-- **AF2 vs Boltz-2 disagreement** — For short binders (~60aa), Boltz-2 may score high while AF2 scores low. This is meaningful signal, not noise. The `agreement_count` column reflects how many engines agree above the 0.61 threshold.
+- **Engine disagreement is signal, not noise** — For short binders (~60aa), different refolding engines often disagree on interface quality. The `agreement_count` column reflects how many engines pass the 0.61 threshold; higher = stronger candidate.
 - **Binder length is a main driver** — Longer binders tend to score lower on `ipsae_min` (r ≈ -0.78).
 - **Mosaic designs.csv format** — Can mix column formats between workers (old 11-col / new 13-col). The parser must handle this carefully or columns misalign. The `is_top` column marks the ~40 refolded designs out of ~800 total; extractors filter to `is_top=1` by default.
 - **Mosaic `target_sequence` placeholder** — The Mosaic template (`hallucinate_bindmaster.py`) writes `"REPLACE_ME"` as `target_sequence` when not configured. The legacy evaluator guards against using this as a real target sequence.
-- **AF2 pLDDT scale** — ColabDesign `get_plddt()` returns values in [0,1], not [0,100].
-- **PAE ordering** — Boltz-2: [binder|target]; AF2: [target|binder]. Column prefixes distinguish them (`boltz_pae_*` vs `af2_*`).
-- **Append-mode CSVs** — Both `refold_boltz2.py` and `refold_af2.py` append to CSV. If rerun after partial failure, check for duplicate `run_id` entries.
+- **pLDDT scale** — Boltz-2 returns [0,1]; AF3 native is [0,100] and is rescaled to [0,1] on ingest by the refold runner so report columns are directly comparable.
+- **PAE ordering** — Boltz-2 is native [binder|target]. AF3 is token-order so we always put target first in the input JSON, giving [target|binder] — the evaluator transposes internally. Column prefixes distinguish engines (`boltz_pae_*`, `protenix_*`, `af3_*`).
+- **Append-mode CSVs** — `refold_boltz2.py` appends to CSV. If rerun after partial failure, check for duplicate `run_id` entries.
 
 ### Lab-specific information
 
@@ -350,12 +350,8 @@ conda run -n binder-eval binder-compare extract \
 Mosaic/.venv/bin/binder-compare refold-boltz2 \
     --sequences seqs.fasta --target-seq SEQ -o boltz2.csv
 
-# Refold with AF2
-conda run -n binder-eval-af2 binder-compare refold-af2 \
-    --sequences seqs.fasta --target-pdb PDB -o af2.csv
-
-# Generate report
+# Generate report (Boltz-2 only here; pass --protenix-results / --af3-results to add engines)
 conda run -n binder-eval binder-compare report \
-    --boltz2-results boltz2.csv --af2-results af2.csv \
+    --boltz2-results boltz2.csv \
     --sequences seqs.fasta -o ./report
 ```
diff --git a/Evaluator/binder_comparison/__init__.py b/Evaluator/binder_comparison/__init__.py
index d0384be..686dad1 100644
--- a/Evaluator/binder_comparison/__init__.py
+++ b/Evaluator/binder_comparison/__init__.py
@@ -1,7 +1,8 @@
 """Binder Design Comparison Tool.
 
-Compare binder sequences from BindCraft, BoltzGen, and Mosaic using
-standardised refolding with both AF2 and Boltz2, then ensemble the results.
+Compare binder sequences from BindCraft, BoltzGen, Mosaic, PXDesign,
+Proteina-Complexa, and Protein-Hunter using Boltz-2 standardised refolding
+(plus Protenix on x86 and AF3 on aarch64/DGX Spark).
 """
 
 __version__ = "0.1.0"
diff --git a/Evaluator/binder_comparison/cli/__init__.py b/Evaluator/binder_comparison/cli/__init__.py
index 394a089..5b84928 100644
--- a/Evaluator/binder_comparison/cli/__init__.py
+++ b/Evaluator/binder_comparison/cli/__init__.py
@@ -1,3 +1,3 @@
-from . import extract, refold_af2, refold_boltz2, report, run, validate
+from . import extract, parse_seqs, refold_boltz2, refold_protenix, report, run, validate
 
-__all__ = ["extract", "refold_af2", "refold_boltz2", "report", "run", "validate"]
+__all__ = ["extract", "parse_seqs", "refold_boltz2", "refold_protenix", "report", "run", "validate"]
diff --git a/Evaluator/binder_comparison/cli/extract.py b/Evaluator/binder_comparison/cli/extract.py
index fef91f3..065ebe2 100644
--- a/Evaluator/binder_comparison/cli/extract.py
+++ b/Evaluator/binder_comparison/cli/extract.py
@@ -20,8 +20,10 @@
     BoltzGenExtractor,
     MosaicExtractor,
     ProteinaComplexaExtractor,
+    ProteinHunterExtractor,
     PXDesignExtractor,
     RFAAExtractor,
+    RFD3Extractor,
 )
 from ..io.write import write_fasta
 
@@ -60,12 +62,25 @@ def run(args: argparse.Namespace) -> None:
         print(f"  → {len(extracted)} sequences")
         all_binders.extend(extracted)
 
+    if args.rfd3:
+        print(f"[extract] RFD3: {args.rfd3}")
+        extracted = RFD3Extractor().extract(args.rfd3)
+        print(f"  → {len(extracted)} sequences")
+        all_binders.extend(extracted)
+
     if args.proteina_complexa:
         print(f"[extract] Proteina-Complexa: {args.proteina_complexa}")
         extracted = ProteinaComplexaExtractor().extract(args.proteina_complexa)
         print(f"  → {len(extracted)} sequences")
         all_binders.extend(extracted)
 
+    if args.protein_hunter:
+        print(f"[extract] Protein-Hunter: {args.protein_hunter}")
+        all_runs = getattr(args, "all_protein_hunter_designs", False)
+        extracted = ProteinHunterExtractor(all_runs=all_runs).extract(args.protein_hunter)
+        print(f"  → {len(extracted)} sequences")
+        all_binders.extend(extracted)
+
     if not all_binders:
         print("[extract] ERROR: no binders found. Check input directories.", file=sys.stderr)
         sys.exit(1)
@@ -107,13 +122,20 @@ def add_parser(subparsers) -> None:
     p.add_argument("--boltzgen", metavar="DIR", help="BoltzGen output directory")
     p.add_argument("--mosaic", metavar="DIR", help="Mosaic output directory (containing designs.csv)")
     p.add_argument("--pxdesign", metavar="DIR", help="PXDesign output directory (containing summary.csv)")
-    p.add_argument("--rfaa", metavar="DIR", help="RFAA output directory (containing sequences.csv)")
+    p.add_argument("--rfaa", metavar="DIR", help="RFAA output directory (legacy — RFD3 preferred)")
+    p.add_argument("--rfd3", metavar="DIR", help="RFD3 / foundry output directory (replaces RFAA)")
     p.add_argument(
         "--proteina-complexa",
         metavar="DIR",
         dest="proteina_complexa",
         help="Proteina-Complexa output directory (containing sequences.csv)",
     )
+    p.add_argument(
+        "--protein-hunter",
+        metavar="DIR",
+        dest="protein_hunter",
+        help="Protein-Hunter output directory (containing summary_high_iptm.csv)",
+    )
     p.add_argument("--output", "-o", required=True, metavar="FILE", help="Output FASTA path (e.g. sequences.fasta)")
     p.add_argument("--keep-duplicates", action="store_true", help="Do not deduplicate identical sequences across tools")
     p.add_argument(
@@ -121,4 +143,9 @@ def add_parser(subparsers) -> None:
         action="store_true",
         help="Include all Mosaic designs (default: only is_top=1 refolded designs)",
     )
+    p.add_argument(
+        "--all-protein-hunter-designs",
+        action="store_true",
+        help="Include all Protein-Hunter designs (default: only summary_high_iptm.csv rows)",
+    )
     p.set_defaults(func=run)
diff --git a/Evaluator/binder_comparison/cli/refold_af2.py b/Evaluator/binder_comparison/cli/refold_af2.py
deleted file mode 100644
index 329f66b..0000000
--- a/Evaluator/binder_comparison/cli/refold_af2.py
+++ /dev/null
@@ -1,73 +0,0 @@
-"""CLI subcommand: binder-compare refold-af2
-
-Refold sequences from a FASTA file using AlphaFold2 (ColabDesign).
-Run this in the 'bindcraft_pr' conda environment.
-
-Usage:
-    conda run -n bindcraft_pr binder-compare refold-af2 \\
-        --sequences sequences.fasta \\
-        --target-pdb target.pdb \\
-        --output af2_results.csv \\
-        --output-dir ./refold_af2/
-"""
-
-from __future__ import annotations
-
-import argparse
-import sys
-
-from ..io.read import read_fasta
-from ..refolding.af2_runner import run_af2_refold
-
-
-def run(args: argparse.Namespace) -> None:
-    entries = read_fasta(args.sequences)
-    if not entries:
-        print(f"[refold-af2] ERROR: no sequences found in {args.sequences}", file=sys.stderr)
-        sys.exit(1)
-
-    sequences = [seq for _, seq in entries]
-    print(f"[refold-af2] Loaded {len(sequences)} sequences from {args.sequences}")
-    print(f"[refold-af2] Target PDB: {args.target_pdb}")
-
-    models = [int(m) for m in args.models.split(",") if m.strip()]
-
-    run_af2_refold(
-        sequences=sequences,
-        target_pdb_path=args.target_pdb,
-        output_dir=args.output_dir,
-        output_csv=args.output,
-        models=models,
-        num_recycles=args.num_recycles,
-        mosaic_path=args.mosaic_path,
-        resume=args.resume,
-    )
-
-
-def add_parser(subparsers) -> None:
-    p = subparsers.add_parser(
-        "refold-af2",
-        help="Refold sequences with AF2 (run in 'bindcraft_pr' conda env).",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        description=__doc__,
-    )
-    p.add_argument(
-        "--sequences", "-s", required=True, metavar="FASTA", help="Input FASTA (e.g. from 'binder-compare extract')"
-    )
-    p.add_argument("--target-pdb", required=True, metavar="PDB", help="Target PDB file path (chain A is used)")
-    p.add_argument(
-        "--output", "-o", required=True, metavar="CSV", help="Output CSV path for metrics (e.g. af2_results.csv)"
-    )
-    p.add_argument(
-        "--output-dir",
-        default="./refold_af2",
-        metavar="DIR",
-        help="Directory for structure PDB files (default: ./refold_af2)",
-    )
-    p.add_argument("--models", default="0", metavar="N[,N]", help="Comma-separated AF2 model indices (default: 0)")
-    p.add_argument("--num-recycles", type=int, default=3, metavar="N", help="AF2 recycling iterations (default: 3)")
-    p.add_argument(
-        "--mosaic-path", default=None, metavar="DIR", help="Path to Mosaic repo root (auto-detected if not set)"
-    )
-    p.add_argument("--resume", action="store_true", help="Skip binders already present in existing output CSV")
-    p.set_defaults(func=run)
diff --git a/Evaluator/binder_comparison/cli/refold_protenix.py b/Evaluator/binder_comparison/cli/refold_protenix.py
new file mode 100644
index 0000000..c54cf3f
--- /dev/null
+++ b/Evaluator/binder_comparison/cli/refold_protenix.py
@@ -0,0 +1,120 @@
+"""CLI subcommand: binder-compare refold-protenix
+
+Refold sequences from a FASTA file using Protenix v0.5.0.
+Run this inside the ``bindmaster_pxdesign`` conda env.
+
+Usage:
+    conda run -n bindmaster_pxdesign binder-compare refold-protenix \\
+        --sequences sequences.fasta \\
+        --target-seq "MKTAYIAKQRQ..." \\
+        --output protenix_results.csv \\
+        --output-dir ./refold_protenix/
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+
+from ..io.read import read_fasta
+from ..refolding.protenix_runner import run_protenix_refold
+
+
+def run(args: argparse.Namespace) -> None:
+    entries = read_fasta(args.sequences)
+    if not entries:
+        print(f"[refold-protenix] ERROR: no sequences found in {args.sequences}", file=sys.stderr)
+        sys.exit(1)
+
+    sequences = [seq for _, seq in entries]
+    print(f"[refold-protenix] Loaded {len(sequences)} sequences from {args.sequences}")
+    print(f"[refold-protenix] Target length: {len(args.target_seq)} aa")
+    print(
+        f"[refold-protenix] {args.num_samples} samples × {args.num_seeds} seed(s), "
+        f"use_msa={args.use_msa}, n_cycle={args.n_cycle}, n_step={args.n_step}"
+    )
+
+    run_protenix_refold(
+        sequences=sequences,
+        target_sequence=args.target_seq,
+        output_dir=args.output_dir,
+        output_csv=args.output,
+        num_samples=args.num_samples,
+        num_seeds=args.num_seeds,
+        use_msa=args.use_msa,
+        n_cycle=args.n_cycle,
+        n_step=args.n_step,
+        scripts_path=args.scripts_path,
+        resume=args.resume,
+    )
+
+
+def add_parser(subparsers) -> None:
+    p = subparsers.add_parser(
+        "refold-protenix",
+        help="Refold sequences with Protenix v0.5.0 (run in 'bindmaster_pxdesign' conda env).",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        description=__doc__,
+    )
+    p.add_argument(
+        "--sequences",
+        "-s",
+        required=True,
+        metavar="FASTA",
+        help="Input FASTA (e.g. from 'binder-compare parse-seqs')",
+    )
+    p.add_argument("--target-seq", required=True, metavar="SEQ", help="Target protein sequence (amino acid string)")
+    p.add_argument(
+        "--output",
+        "-o",
+        required=True,
+        metavar="CSV",
+        help="Output CSV path for metrics (e.g. protenix_results.csv)",
+    )
+    p.add_argument(
+        "--output-dir",
+        default="./refold_protenix",
+        metavar="DIR",
+        help="Directory for structure files (default: ./refold_protenix)",
+    )
+    p.add_argument(
+        "--num-samples",
+        type=int,
+        default=5,
+        metavar="N",
+        help="Protenix diffusion samples per seed (default: 5)",
+    )
+    p.add_argument(
+        "--num-seeds",
+        type=int,
+        default=1,
+        metavar="N",
+        help="Number of random seeds (starts at 101; default: 1)",
+    )
+    p.add_argument(
+        "--use-msa",
+        action="store_true",
+        help="Run ColabFold MMseqs2 MSA pipeline (slower; default is MSA-free)",
+    )
+    p.add_argument(
+        "--n-cycle",
+        type=int,
+        default=10,
+        metavar="N",
+        help="Recycling iterations (default: 10)",
+    )
+    p.add_argument(
+        "--n-step",
+        type=int,
+        default=200,
+        metavar="N",
+        help="Diffusion steps per sample (default: 200)",
+    )
+    p.add_argument(
+        "--scripts-path",
+        default=None,
+        metavar="DIR",
+        help="Path to scripts/ directory (auto-detected if not set)",
+    )
+    p.add_argument("--resume", action="store_true", help="Skip binders already present in existing output CSV")
+    p.set_defaults(func=run)
diff --git a/Evaluator/binder_comparison/cli/report.py b/Evaluator/binder_comparison/cli/report.py
index 15522c1..f5910e7 100644
--- a/Evaluator/binder_comparison/cli/report.py
+++ b/Evaluator/binder_comparison/cli/report.py
@@ -1,12 +1,12 @@
 """CLI subcommand: binder-compare report
 
-Merge Boltz2 and AF2 refolding results, promote Boltz-2 as the primary
-predictor, z-score normalise, and generate the comparison report.
+Load Boltz-2 refolding results (plus Protenix/AF3 results when supplied via
+``--protenix-results`` / ``--af3-results``), promote Boltz-2 as the primary
+predictor, z-score normalise, and generate the comparison report.
 
 Usage:
     binder-compare report \\
         --boltz2-results boltz2_results.csv \\
-        --af2-results    af2_results.csv \\
         --sequences      sequences.fasta \\
         --output         ./comparison_report
 """
@@ -21,8 +21,8 @@
 from ..comparison.ensemble import compute_ensemble_metrics
 from ..comparison.merger import merge_refold_results
 from ..comparison.scoring import (
-    add_af2_ipsae_from_files,
     add_boltz_ipsae_from_files,
+    add_ipsae_from_pae_files,
     add_iptm_from_pae_files,
     apply_screening_thresholds,
     compute_agreement,
@@ -38,12 +38,13 @@ def run(args: argparse.Namespace) -> None:
     output_dir = Path(args.output)
     output_dir.mkdir(parents=True, exist_ok=True)
 
-    # Step 1: Merge
-    print("[report] Merging refolding results…")
+    # Step 1: Load refolding results
+    print("[report] Loading refolding results…")
     df = merge_refold_results(
         boltz2_csv=args.boltz2_results,
-        af2_csv=args.af2_results,
         sequences_fasta=args.sequences,
+        protenix_csv=args.protenix_results,
+        af3_csv=args.af3_results,
     )
 
     # Attach native metrics from BindCraft CSV if provided
@@ -55,11 +56,10 @@ def run(args: argparse.Namespace) -> None:
     df = compute_ensemble_metrics(df)
 
     # Step 2b: Compute ipSAE from PAE files using DunbrackLab formula.
-    # Uniform 10 Å cutoff for both engines so scores are directly comparable.
+    # Uniform 10 Å cutoff across engines so scores are directly comparable.
     # base_dir helps resolve relative PAE paths in older CSVs where the runner
     # didn't write absolute paths.  The CSV's parent dir is the best guess.
     boltz_base = Path(args.boltz2_results).resolve().parent if args.boltz2_results else None
-    af2_base = Path(args.af2_results).resolve().parent if args.af2_results else None
 
     if "boltz_pae_file" in df.columns:
         print("[report] Computing Boltz-2 ipSAE from PAE files (DunbrackLab, cutoff=10 Å)…")
@@ -69,22 +69,36 @@ def run(args: argparse.Namespace) -> None:
             df, pae_file_col="boltz_pae_file", ordering="binder_target", prefix="boltz", base_dir=boltz_base
         )
 
-    if "af2_pae_file" in df.columns:
-        print("[report] Computing AF2 ipSAE from PAE files (DunbrackLab, cutoff=10 Å)…")
-        df = add_af2_ipsae_from_files(df, pae_file_col="af2_pae_file", base_dir=af2_base)
-        print("[report] Computing AF2 ipTM from PAE files…")
+    # Protenix: DunbrackLab ipSAE + independent ipTM from the saved PAE matrix
+    protenix_base = Path(args.protenix_results).resolve().parent if args.protenix_results else None
+    if "protenix_pae_file" in df.columns:
+        print("[report] Computing Protenix ipSAE from PAE files (DunbrackLab, cutoff=10 Å)…")
+        df = add_ipsae_from_pae_files(
+            df,
+            pae_file_col="protenix_pae_file",
+            prefix="protenix",
+            ordering="target_binder",
+            base_dir=protenix_base,
+        )
+        df = add_iptm_from_pae_files(
+            df, pae_file_col="protenix_pae_file", ordering="target_binder", prefix="protenix", base_dir=protenix_base
+        )
+
+    # AF3 (aarch64 / DGX Spark): identical treatment — Part K wires this up end-to-end.
+    af3_base = Path(args.af3_results).resolve().parent if args.af3_results else None
+    if "af3_pae_file" in df.columns:
+        print("[report] Computing AF3 ipSAE from PAE files (DunbrackLab, cutoff=10 Å)…")
+        df = add_ipsae_from_pae_files(
+            df, pae_file_col="af3_pae_file", prefix="af3", ordering="target_binder", base_dir=af3_base
+        )
         df = add_iptm_from_pae_files(
-            df, pae_file_col="af2_pae_file", ordering="target_binder", prefix="af2", base_dir=af2_base
+            df, pae_file_col="af3_pae_file", ordering="target_binder", prefix="af3", base_dir=af3_base
         )
 
-    # Promote DunbrackLab PAE-based ipsae_min as the primary ranking column.
-    # Prefer Boltz-2 PAE-based; fall back to AF2 PAE-based.
+    # Promote Boltz-2 DunbrackLab PAE-based ipsae_min as the primary ranking column.
     if "boltz_pae_ipsae_min" in df.columns:
         df["ipsae_min"] = df["boltz_pae_ipsae_min"]
         print("[report] Using boltz_pae_ipsae_min as primary ipsae_min for ranking")
-    elif "af2_ipsae_min" in df.columns:
-        df["ipsae_min"] = df["af2_ipsae_min"]
-        print("[report] Using af2_ipsae_min as primary ipsae_min for ranking")
 
     # Step 3: Statistics + z-scores
     print("[report] Computing statistics…")
@@ -120,8 +134,6 @@ def run(args: argparse.Namespace) -> None:
         "boltz_pae_iptm",
         "boltz_pae_bt_ipsae",
         "boltz_pae_tb_ipsae",
-        "af2_ipsae_min",
-        "af2_pae_iptm",
         "plddt_binder_mean",
         "plddt_binder_min",
         "binder_ptm",
@@ -200,6 +212,8 @@ def run(args: argparse.Namespace) -> None:
     "bindcraft": "blue",
     "proteina_complexa": "teal",
     "rfaa": "firebrick",
+    "rfd3": "tv_orange",
+    "protein_hunter": "cyan",
 }
 
 _TOOL_DISPLAY_PYMOL = {
@@ -209,6 +223,8 @@ def run(args: argparse.Namespace) -> None:
     "bindcraft": "BindCraft",
     "proteina_complexa": "Proteina-Complexa",
     "rfaa": "RFAA",
+    "rfd3": "RFD3",
+    "protein_hunter": "Protein-Hunter",
 }
 
 
@@ -327,18 +343,23 @@ def add_parser(subparsers) -> None:
         description=__doc__,
     )
     p.add_argument("--boltz2-results", metavar="CSV", help="Output from 'refold-boltz2' (boltz2_results.csv)")
-    p.add_argument("--af2-results", metavar="CSV", help="Output from 'refold-af2' (af2_results.csv)")
     p.add_argument(
-        "--sequences", metavar="FASTA", help="FASTA from 'extract' step (for binder_id and source_tool tags)"
+        "--protenix-results",
+        metavar="CSV",
+        help="Optional: output from 'refold-protenix' (protenix_results.csv). Adds a "
+        "second engine to the agreement_count.",
     )
     p.add_argument(
-        "--native-metrics", metavar="CSV", help="BindCraft final_design_stats.csv to attach dG/dSASA/ShapeComp"
+        "--af3-results",
+        metavar="CSV",
+        help="Optional: output from 'refold-af3' (af3_results.csv; aarch64 / DGX "
+        "Spark only). Adds a third engine to the agreement_count.",
     )
     p.add_argument(
-        "--af2-pae-dir",
-        metavar="DIR",
-        help="Deprecated — PAE file paths are now recorded in af2_results.csv "
-        "automatically by refold_Version6. This flag is ignored.",
+        "--sequences", metavar="FASTA", help="FASTA from 'extract' step (for binder_id and source_tool tags)"
+    )
+    p.add_argument(
+        "--native-metrics", metavar="CSV", help="BindCraft final_design_stats.csv to attach dG/dSASA/ShapeComp"
     )
     p.add_argument("--output", "-o", required=True, metavar="DIR", help="Output directory for all report files")
     p.add_argument(
diff --git a/Evaluator/binder_comparison/cli/run.py b/Evaluator/binder_comparison/cli/run.py
index 7d6e320..620a5dc 100644
--- a/Evaluator/binder_comparison/cli/run.py
+++ b/Evaluator/binder_comparison/cli/run.py
@@ -1,7 +1,8 @@
 """CLI subcommand: binder-compare run
 
-Full orchestrator: runs all four steps (extract → refold-boltz2 →
-refold-af2 → report) by spawning subprocesses in the correct environments.
+Full orchestrator: extract → refold-boltz2 → refold-protenix (optional) → report.
+AF3 refolding (aarch64 / DGX Spark only, Part K) is wired separately via the
+``--af3-results`` flag on the ``report`` subcommand.
 
 Usage:
     binder-compare run \\
@@ -9,16 +10,16 @@
         --boltzgen   ./boltzgen_results \\
         --mosaic     ./mosaic_results \\
         --target-seq "MKTAYIAKQRQ..." \\
-        --target-pdb target.pdb \\
-        --output     ./comparison_report
+        --output     ./comparison_report \\
+        --protenix-env bindmaster_pxdesign  # omit or pass "" to skip Protenix
 
 Environment requirements:
-    Boltz2 refolding: uv venv at ~/BindMaster/mosaic/.venv  (preferred)
-                      OR conda env 'mosaic' if populated
-    AF2 refolding:    conda env 'bindcraft_pr'
-    Other steps:      any env with binder_comparison installed
+    Boltz-2 refolding:  uv venv at ~/BindMaster/Mosaic/.venv (preferred)
+                        OR conda env 'mosaic' if populated
+    Protenix refolding: conda env 'bindmaster_pxdesign' (shipped by PXDesign installer)
+    Other steps:        any env with binder_comparison installed
 
-Boltz2 environment selection (in order of precedence):
+Boltz-2 environment selection (in order of precedence):
     --boltz2-python /path/to/.venv/bin/python   (direct Python; skips conda)
     --boltz2-env    mosaic                       (conda run -n <env>)
 """
@@ -37,13 +38,13 @@ def run(args: argparse.Namespace) -> None:
 
     sequences_fasta = output_dir / "sequences.fasta"
     boltz2_csv = output_dir / "boltz2_results.csv"
-    af2_csv = output_dir / "af2_results.csv"
+    protenix_csv = output_dir / "protenix_results.csv"
 
     # ------------------------------------------------------------------
     # Step 1: Extract sequences
     # ------------------------------------------------------------------
     print("\n" + "=" * 60)
-    print("STEP 1/4 — Extracting sequences")
+    print(f"STEP 1/{4 if args.protenix_env else 3} — Extracting sequences")
     print("=" * 60)
 
     extract_cmd = [sys.executable, "-m", "binder_comparison", "extract"]
@@ -64,11 +65,11 @@ def run(args: argparse.Namespace) -> None:
     _run_step(extract_cmd, "extract")
 
     # ------------------------------------------------------------------
-    # Step 2: Refold with Boltz2  (mosaic uv venv or conda env)
+    # Step 2: Refold with Boltz-2  (Mosaic uv venv or conda env)
     # ------------------------------------------------------------------
     boltz2_label = f"[python: {args.boltz2_python}]" if args.boltz2_python else f"[conda: {args.boltz2_env}]"
     print("\n" + "=" * 60)
-    print(f"STEP 2/4 — Boltz2 refolding  {boltz2_label}")
+    print(f"STEP 2/{4 if args.protenix_env else 3} — Boltz-2 refolding  {boltz2_label}")
     print("=" * 60)
 
     _boltz2_inner = [
@@ -85,7 +86,6 @@ def run(args: argparse.Namespace) -> None:
     ] + (["--mosaic-path", args.mosaic_path] if args.mosaic_path else [])
 
     if args.boltz2_python:
-        # Direct Python path — used for uv venv (e.g. ~/BindMaster/mosaic/.venv/bin/python)
         boltz2_cmd = [args.boltz2_python, "-m", "binder_comparison"] + _boltz2_inner[1:]
     else:
         boltz2_cmd = _conda_cmd(args.boltz2_env, ["python", "-m", "binder_comparison"] + _boltz2_inner[1:])
@@ -93,41 +93,44 @@ def run(args: argparse.Namespace) -> None:
     _run_step(boltz2_cmd, "refold-boltz2")
 
     # ------------------------------------------------------------------
-    # Step 3: Refold with AF2  (bindcraft_pr env)
+    # Step 3: Refold with Protenix (optional — requires bindmaster_pxdesign env)
     # ------------------------------------------------------------------
-    print("\n" + "=" * 60)
-    print("STEP 3/4 — AF2 refolding  [conda: bindcraft_pr]")
-    print("=" * 60)
-
-    af2_cmd = _conda_cmd(
-        args.af2_env,
-        [
-            "python",
-            "-m",
-            "binder_comparison",
-            "refold-af2",
-            "--sequences",
-            str(sequences_fasta),
-            "--target-pdb",
-            args.target_pdb,
-            "--output",
-            str(af2_csv),
-            "--output-dir",
-            str(output_dir / "refold_af2"),
-            "--models",
-            args.af2_models,
-            "--num-recycles",
-            str(args.num_recycles),
-        ]
-        + (["--mosaic-path", args.mosaic_path] if args.mosaic_path else []),
-    )
-    _run_step(af2_cmd, "refold-af2")
+    run_protenix = bool(args.protenix_env)
+    if run_protenix:
+        print("\n" + "=" * 60)
+        print(f"STEP 3/4 — Protenix refolding  [conda: {args.protenix_env}]")
+        print("=" * 60)
+        protenix_cmd = _conda_cmd(
+            args.protenix_env,
+            [
+                "python",
+                "-m",
+                "binder_comparison",
+                "refold-protenix",
+                "--sequences",
+                str(sequences_fasta),
+                "--target-seq",
+                args.target_seq,
+                "--output",
+                str(protenix_csv),
+                "--output-dir",
+                str(output_dir / "refold_protenix"),
+                "--num-samples",
+                str(args.protenix_num_samples),
+                "--num-seeds",
+                str(args.protenix_num_seeds),
+            ],
+        )
+        _run_step(protenix_cmd, "refold-protenix")
+    else:
+        print("\n[run] Protenix refolding skipped (pass --protenix-env to enable).")
 
     # ------------------------------------------------------------------
     # Step 4: Report
     # ------------------------------------------------------------------
+    n_steps = 4 if run_protenix else 3
     print("\n" + "=" * 60)
-    print("STEP 4/4 — Generating comparison report")
+    print(f"STEP {n_steps}/{n_steps} — Generating comparison report")
     print("=" * 60)
 
     report_cmd = [
@@ -137,13 +140,13 @@ def run(args: argparse.Namespace) -> None:
         "report",
         "--boltz2-results",
         str(boltz2_csv),
-        "--af2-results",
-        str(af2_csv),
         "--sequences",
         str(sequences_fasta),
         "--output",
         str(output_dir / "report"),
     ]
+    if run_protenix and protenix_csv.exists():
+        report_cmd += ["--protenix-results", str(protenix_csv)]
     if args.bindcraft:
         bindcraft_dir = Path(args.bindcraft)
         final_csv = next(bindcraft_dir.glob("final_design_stats.csv"), None)
@@ -173,7 +176,7 @@ def _run_step(cmd: list[str], name: str) -> None:
 def add_parser(subparsers) -> None:
     p = subparsers.add_parser(
         "run",
-        help="Full pipeline: extract → refold-boltz2 → refold-af2 → report.",
+        help="Full pipeline: extract → refold-boltz2 → refold-protenix (optional) → report.",
         formatter_class=argparse.RawDescriptionHelpFormatter,
         description=__doc__,
     )
@@ -189,19 +192,18 @@ def add_parser(subparsers) -> None:
         help="Proteina-Complexa output directory (containing sequences.csv)",
     )
     # Refolding targets
-    p.add_argument("--target-seq", required=True, metavar="SEQ", help="Target protein sequence (for Boltz2 refolding)")
-    p.add_argument("--target-pdb", required=True, metavar="PDB", help="Target PDB path (for AF2 refolding)")
+    p.add_argument("--target-seq", required=True, metavar="SEQ", help="Target protein sequence (for Boltz-2 refolding)")
     # Output
     p.add_argument("--output", "-o", required=True, metavar="DIR", help="Output directory for all results")
-    # Environment selection for Boltz2 refolding
+    # Environment selection for Boltz-2 refolding
     boltz2_grp = p.add_mutually_exclusive_group()
     boltz2_grp.add_argument(
         "--boltz2-python",
         default=None,
         metavar="PYTHON",
         help=(
-            "Direct Python executable for Boltz2 refolding — use this for the "
-            "uv venv (e.g. ~/BindMaster/mosaic/.venv/bin/python). "
+            "Direct Python executable for Boltz-2 refolding — use this for the "
+            "uv venv (e.g. ~/BindMaster/Mosaic/.venv/bin/python). "
             "Takes precedence over --boltz2-env."
         ),
     )
@@ -209,15 +211,7 @@ def add_parser(subparsers) -> None:
         "--boltz2-env",
         default="mosaic",
         metavar="ENV",
-        help="Conda env for Boltz2 refolding (default: mosaic)",
-    )
-    p.add_argument(
-        "--af2-env", default="bindcraft_pr", metavar="ENV", help="Conda env for AF2 refolding (default: bindcraft_pr)"
-    )
-    # Refolding options
-    p.add_argument("--af2-models", default="1", metavar="N[,N]", help="AF2 model indices (default: 1)")
-    p.add_argument(
-        "--num-recycles", type=int, default=3, metavar="N", help="Recycling iterations for both engines (default: 3)"
+        help="Conda env for Boltz-2 refolding (default: mosaic)",
     )
     p.add_argument("--mosaic-path", default=None, metavar="DIR", help="Mosaic repo path (auto-detected if not set)")
     p.add_argument(
@@ -225,4 +219,18 @@ def add_parser(subparsers) -> None:
         action="store_true",
         help="Include all Mosaic designs (default: only is_top=1 refolded designs)",
     )
+    # Protenix refolding (optional)
+    p.add_argument(
+        "--protenix-env",
+        default="",
+        metavar="ENV",
+        help=(
+            "Conda env for Protenix refolding (typically 'bindmaster_pxdesign'). "
+            "Omit or pass empty string to skip Protenix."
+        ),
+    )
+    p.add_argument(
+        "--protenix-num-samples", type=int, default=5, metavar="N", help="Protenix samples per seed (default: 5)"
+    )
+    p.add_argument("--protenix-num-seeds", type=int, default=1, metavar="N", help="Protenix random seeds (default: 1)")
     p.set_defaults(func=run)
diff --git a/Evaluator/binder_comparison/comparison/__init__.py b/Evaluator/binder_comparison/comparison/__init__.py
index e174ce4..7f6427a 100644
--- a/Evaluator/binder_comparison/comparison/__init__.py
+++ b/Evaluator/binder_comparison/comparison/__init__.py
@@ -1,7 +1,8 @@
 from .ensemble import compute_ensemble_metrics
 from .merger import merge_refold_results
 from .scoring import (
-    add_af2_ipsae_from_files,
+    add_boltz_ipsae_from_files,
+    add_ipsae_from_pae_files,
     add_iptm_from_pae_files,
     apply_screening_thresholds,
     compute_agreement,
@@ -13,7 +14,8 @@
 from .statistics import compute_statistics
 
 __all__ = [
-    "add_af2_ipsae_from_files",
+    "add_boltz_ipsae_from_files",
+    "add_ipsae_from_pae_files",
     "add_iptm_from_pae_files",
     "apply_screening_thresholds",
     "compute_agreement",
diff --git a/Evaluator/binder_comparison/comparison/ensemble.py b/Evaluator/binder_comparison/comparison/ensemble.py
index 3d3a449..2cd5b50 100644
--- a/Evaluator/binder_comparison/comparison/ensemble.py
+++ b/Evaluator/binder_comparison/comparison/ensemble.py
@@ -2,13 +2,11 @@
 
 For each of the 8 standardised metrics, the canonical column (e.g. ``iptm``,
 ``ipae``) is a direct copy of the corresponding ``boltz_*`` source column.
-AF2 columns (``af2_iptm``, ``af2_ipae``, etc.) remain in the DataFrame for
-cross-validation but are not used for ranking or scoring.
 
 Boltz2-exclusive metrics (IPSAE family, binder_ptm, pTMEnergy, etc.) are
 passed through unchanged under their boltz_* column names.
 
-Note: despite the module name, no ensemble averaging takes place.  This module
+Note: despite the module name, no ensemble averaging takes place. This module
 exists for historical reasons and may be renamed in a future refactor.
 """
 
diff --git a/Evaluator/binder_comparison/comparison/merger.py b/Evaluator/binder_comparison/comparison/merger.py
index cf45446..20443ab 100644
--- a/Evaluator/binder_comparison/comparison/merger.py
+++ b/Evaluator/binder_comparison/comparison/merger.py
@@ -1,9 +1,10 @@
-"""Merge Boltz2 and AF2 refolding results into a single DataFrame.
+"""Merge refolding results from multiple engines into a single DataFrame.
 
 Column naming convention after merge:
-  - Boltz2 columns are prefixed with 'boltz_' (e.g. boltz_iptm, boltz_ipae)
-  - AF2 columns already carry the 'af2_' prefix from refold_Version6
-  - 'sequence' is the join key, present in both CSVs
+  - Boltz-2 columns are prefixed with ``boltz_``
+  - Protenix columns are prefixed with ``protenix_``
+  - AF3 columns (Part K, aarch64) are prefixed with ``af3_``
+  - ``sequence`` is the join key, present in every engine's CSV
 
 The FASTA extracted by 'binder-compare extract' is used to join the source
 tool tag back onto the merged results.
@@ -18,80 +19,82 @@
 
 from ..io.read import read_csv_safe, read_fasta
 
-# Columns from Boltz2 CSV (refold_Version5) that should NOT be prefixed
-_BOLTZ2_PASSTHROUGH_COLS = {"sequence", "target_sequence", "binder_length", "run_id"}
-
-# Identity columns from AF2 CSV (refold_Version6) to drop after merge (redundant)
-_AF2_DROP_COLS = {"run_id", "idx", "target_pdb", "binder_length"}
+# Columns that should NOT be prefixed — passthrough identifiers shared across engines
+_PASSTHROUGH_COLS = {"sequence", "target_sequence", "binder_length", "run_id"}
 
 
 def merge_refold_results(
     boltz2_csv: str | Path | None,
-    af2_csv: str | Path | None,
     sequences_fasta: str | Path | None = None,
+    *,
+    protenix_csv: str | Path | None = None,
+    af3_csv: str | Path | None = None,
 ) -> pd.DataFrame:
-    """Join Boltz2 and AF2 results on the 'sequence' column.
+    """Outer-join refolding-engine CSVs on the ``sequence`` column.
 
     Args:
-        boltz2_csv:       Path to boltz2_results.csv from 'refold-boltz2'.
-        af2_csv:          Path to af2_results.csv from 'refold-af2'.
-        sequences_fasta:  Optional FASTA from 'extract' step; used to attach
-                          binder_id and source_tool columns.
+        boltz2_csv:       Path to boltz2_results.csv from 'refold-boltz2'. May be None,
+                          but at least one engine CSV must be supplied and non-empty.
+        sequences_fasta:  Optional FASTA from 'extract'; attaches binder_id and
+                          source_tool columns.
+        protenix_csv:     Optional Protenix results (refold-protenix output).
+        af3_csv:          Optional AF3 results (refold-af3 output, aarch64 only).
 
     Returns:
-        DataFrame with all metrics. Sequences present in only one model's CSV
-        get NaN for the missing model's columns (outer join).
+        DataFrame with per-engine prefixed columns + passthrough identifiers.
     """
-    boltz_df = _load_boltz2(boltz2_csv) if boltz2_csv else pd.DataFrame()
-    af2_df = _load_af2(af2_csv) if af2_csv else pd.DataFrame()
-
-    if boltz_df.empty and af2_df.empty:
-        raise ValueError("Both boltz2_csv and af2_csv are absent or empty.")
-
-    if boltz_df.empty:
-        warnings.warn("No Boltz2 results — AF2 metrics only.")
-        merged = af2_df.copy()
-    elif af2_df.empty:
-        warnings.warn("No AF2 results — Boltz2 metrics only.")
-        merged = boltz_df.copy()
-    else:
-        merged = pd.merge(boltz_df, af2_df, on="sequence", how="outer")
-        n_both = merged[["boltz_iptm", "af2_iptm"]].notna().all(axis=1).sum()
-        n_total = len(merged)
-        print(
-            f"[merger] {n_total} unique sequences: {n_both} have both models, "
-            f"{(merged['boltz_iptm'].isna()).sum()} Boltz2-only, "
-            f"{(merged['af2_iptm'].isna()).sum()} AF2-only"
-        )
+    engine_dfs: dict[str, pd.DataFrame] = {}
+    if boltz2_csv:
+        engine_dfs["boltz"] = _load_engine(boltz2_csv, "boltz")
+    if protenix_csv:
+        engine_dfs["protenix"] = _load_engine(protenix_csv, "protenix")
+    if af3_csv:
+        engine_dfs["af3"] = _load_engine(af3_csv, "af3")
+
+    engine_dfs = {k: v for k, v in engine_dfs.items() if not v.empty}
+    if not engine_dfs:
+        raise ValueError("No non-empty refolding CSVs supplied — nothing to report on.")
+
+    merged: pd.DataFrame | None = None
+    for name, df in engine_dfs.items():
+        if merged is None:
+            merged = df.copy()
+        else:
+            merged = pd.merge(merged, df, on="sequence", how="outer", suffixes=("", f"_{name}_dup"))
+            # Drop accidental duplicate passthrough columns (prefer the first engine's values)
+            for col in list(merged.columns):
+                if col.endswith(f"_{name}_dup"):
+                    merged.drop(columns=[col], inplace=True)
+
+    assert merged is not None  # engine_dfs non-empty guarantee
+
+    if len(engine_dfs) > 1:
+        iptm_cols = {name: f"{name}_iptm" for name in engine_dfs}
+        present = [c for c in iptm_cols.values() if c in merged.columns]
+        if present:
+            n_total = len(merged)
+            both_mask = merged[present].notna().all(axis=1)
+            print(
+                f"[merger] {n_total} unique sequences — "
+                f"{int(both_mask.sum())} have all {len(present)} engines, "
+                + ", ".join(f"{int(merged[c].isna().sum())} missing {c.split('_')[0]}" for c in present)
+            )
 
-    # Attach binder_id and source_tool from the FASTA if provided
     if sequences_fasta:
         merged = _attach_fasta_metadata(merged, sequences_fasta)
-
     return merged
 
 
-def _load_boltz2(path: str | Path) -> pd.DataFrame:
-    """Load and prefix Boltz2 CSV columns with 'boltz_'."""
+def _load_engine(path: str | Path, prefix: str) -> pd.DataFrame:
+    """Load a refolding-engine CSV and prefix non-passthrough columns with ``{prefix}_``."""
     df = read_csv_safe(path)
     if df.empty:
+        warnings.warn(f"[merger] {prefix} CSV is empty: {path}")
         return df
-
-    rename = {col: f"boltz_{col}" for col in df.columns if col not in _BOLTZ2_PASSTHROUGH_COLS}
+    rename = {col: f"{prefix}_{col}" for col in df.columns if col not in _PASSTHROUGH_COLS}
     return df.rename(columns=rename)
 
 
-def _load_af2(path: str | Path) -> pd.DataFrame:
-    """Load AF2 CSV; columns already have 'af2_' prefix in refold_Version6."""
-    df = read_csv_safe(path)
-    if df.empty:
-        return df
-
-    # Drop identity columns that would create conflicts in the merge
-    drop = [c for c in _AF2_DROP_COLS if c in df.columns]
-    return df.drop(columns=drop)
-
-
 def _attach_fasta_metadata(df: pd.DataFrame, fasta_path: str | Path) -> pd.DataFrame:
     """Add binder_id and source_tool columns from the extract FASTA."""
     entries = read_fasta(fasta_path)
@@ -103,7 +106,6 @@ def _attach_fasta_metadata(df: pd.DataFrame, fasta_path: str | Path) -> pd.DataF
             if "=" in token:
                 k, v = token.split("=", 1)
                 parts[k] = v
-        # First token before whitespace/tags is the binder_id
         binder_id = tokens[0] if tokens else header
         meta_rows.append(
             {
@@ -117,6 +119,4 @@ def _attach_fasta_metadata(df: pd.DataFrame, fasta_path: str | Path) -> pd.DataF
         return df
 
     meta_df = pd.DataFrame(meta_rows)
-    # Left-join so we keep all sequences even if the FASTA is stale
-    merged = pd.merge(df, meta_df, on="sequence", how="left")
-    return merged
+    return pd.merge(df, meta_df, on="sequence", how="left")
diff --git a/Evaluator/binder_comparison/comparison/scoring.py b/Evaluator/binder_comparison/comparison/scoring.py
index cf4a501..df55319 100644
--- a/Evaluator/binder_comparison/comparison/scoring.py
+++ b/Evaluator/binder_comparison/comparison/scoring.py
@@ -32,8 +32,9 @@
 
 # PAE cutoff for Dunbrack ipSAE formula (Å).
 # Uniform 10 Å for all engines so that ipSAE scores are directly comparable
-# across Boltz-2 and AF2.  Overath et al. (2025) thresholds (0.61, 0.80) were
-# calibrated with a 10 Å cutoff on AF3 data.
+# across Boltz-2 and other refolders (Protenix on x86, AF3 on aarch64).
+# Overath et al. (2025) thresholds (0.61, 0.80) were calibrated with a 10 Å
+# cutoff on AF3 data.
 IPSAE_PAE_CUTOFF = 10.0
 
 
@@ -93,10 +94,10 @@ def compute_ipsae_from_pae(
         pae:           PAE matrix in Ångströms, shape [L_total, L_total].
         binder_length: Number of binder residues (L_b).
         pae_cutoff:    PAE filter threshold in Å (default: 10 Å, uniform
-                       for both Boltz-2 and AF2).
-        ordering:      'binder_target' (Boltz2 native, [binder|target]) or
-                       'target_binder' (AF2 native, [target|binder]).
-                       AF2 arrays are transposed internally to [binder|target].
+                       across engines).
+        ordering:      'binder_target' (Boltz-2 native, [binder|target]) or
+                       'target_binder' (arrays transposed internally to
+                       [binder|target]).
 
     Returns:
         dict with keys: bt_ipsae, tb_ipsae, ipsae_min, ipsae_max, ipsae_valid
@@ -113,7 +114,7 @@ def compute_ipsae_from_pae(
 
     # Normalise to [binder | target] ordering
     if ordering == "target_binder":
-        # AF2 native: [target | binder] → swap to [binder | target]
+        # [target | binder] input → swap to [binder | target]
         pae = np.block(
             [
                 [pae[L_t:, L_t:], pae[L_t:, :L_t]],  # binder-binder, binder-target
@@ -171,26 +172,27 @@ def _resolve_pae_path(
 
 
 # ---------------------------------------------------------------------------
-# AF2 ipSAE from saved PAE files
+# Boltz-2 ipSAE from saved PAE files
 # ---------------------------------------------------------------------------
 
 
-def add_af2_ipsae_from_files(
+def add_boltz_ipsae_from_files(
     df: pd.DataFrame,
-    pae_file_col: str = "af2_pae_file",
+    pae_file_col: str = "boltz_pae_file",
     binder_length_col: str = "binder_length",
     pae_cutoff: float = IPSAE_PAE_CUTOFF,
     base_dir: str | Path | None = None,
 ) -> pd.DataFrame:
-    """Load AF2 PAE .npy files and compute ipSAE scores, adding them to df.
+    """Load Boltz-2 PAE .npy files and compute DunbrackLab ipSAE scores.
 
-    Adds columns: af2_bt_ipsae, af2_tb_ipsae, af2_ipsae_min, af2_ipsae_max.
+    Adds columns: boltz_pae_bt_ipsae, boltz_pae_tb_ipsae, boltz_pae_ipsae_min,
+                  boltz_pae_ipsae_max.
 
-    AF2 PAE arrays are stored in [target | binder] ordering by refold_Version6,
-    so 'target_binder' ordering is used.
+    Boltz-2 PAE arrays are in [binder | target] ordering (native), so
+    'binder_target' ordering is used.
 
     Args:
-        df:              DataFrame with AF2 refolding results.
+        df:              DataFrame with Boltz-2 refolding results.
         pae_file_col:    Column containing paths to PAE .npy files.
         binder_length_col: Column with binder sequence length.
         pae_cutoff:      PAE cutoff in Å (default 10 Å, uniform across engines).
@@ -217,7 +219,7 @@ def add_af2_ipsae_from_files(
 
         try:
             pae = np.load(str(resolved))
-            scores = compute_ipsae_from_pae(pae, int(L_b), pae_cutoff, ordering="target_binder")
+            scores = compute_ipsae_from_pae(pae, int(L_b), pae_cutoff, ordering="binder_target")
             bt_ipsae_vals.append(scores["bt_ipsae"])
             tb_ipsae_vals.append(scores["tb_ipsae"])
             min_vals.append(scores["ipsae_min"])
@@ -225,46 +227,52 @@ def add_af2_ipsae_from_files(
         except Exception as e:
             import warnings
 
-            warnings.warn(f"Failed to compute AF2 ipSAE for {pae_path}: {e}")
+            warnings.warn(f"Failed to compute Boltz-2 ipSAE for {pae_path}: {e}")
             bt_ipsae_vals.append(np.nan)
             tb_ipsae_vals.append(np.nan)
             min_vals.append(np.nan)
             max_vals.append(np.nan)
 
-    result["af2_bt_ipsae"] = bt_ipsae_vals
-    result["af2_tb_ipsae"] = tb_ipsae_vals
-    result["af2_ipsae_min"] = min_vals
-    result["af2_ipsae_max"] = max_vals
+    result["boltz_pae_bt_ipsae"] = bt_ipsae_vals
+    result["boltz_pae_tb_ipsae"] = tb_ipsae_vals
+    result["boltz_pae_ipsae_min"] = min_vals
+    result["boltz_pae_ipsae_max"] = max_vals
 
     return result
 
 
 # ---------------------------------------------------------------------------
-# Boltz-2 ipSAE from saved PAE files
+# Generic PAE → DunbrackLab ipSAE loader (for Protenix, AF3, and future engines)
 # ---------------------------------------------------------------------------
 
 
-def add_boltz_ipsae_from_files(
+def add_ipsae_from_pae_files(
     df: pd.DataFrame,
-    pae_file_col: str = "boltz_pae_file",
+    pae_file_col: str,
     binder_length_col: str = "binder_length",
     pae_cutoff: float = IPSAE_PAE_CUTOFF,
     base_dir: str | Path | None = None,
+    *,
+    prefix: str,
+    ordering: str = "target_binder",
 ) -> pd.DataFrame:
-    """Load Boltz-2 PAE .npy files and compute DunbrackLab ipSAE scores.
+    """Load saved PAE .npy files and add DunbrackLab ipSAE columns to *df*.
 
-    Adds columns: boltz_pae_bt_ipsae, boltz_pae_tb_ipsae, boltz_pae_ipsae_min,
-                  boltz_pae_ipsae_max.
-
-    Boltz-2 PAE arrays are in [binder | target] ordering (native), so
-    'binder_target' ordering is used.
+    Engine-agnostic version of ``add_boltz_ipsae_from_files``. Adds columns
+    ``{prefix}_bt_ipsae``, ``{prefix}_tb_ipsae``, ``{prefix}_ipsae_min``,
+    ``{prefix}_ipsae_max`` where ``prefix`` is e.g. "protenix" or "af3".
 
     Args:
-        df:              DataFrame with Boltz-2 refolding results.
+        df:              DataFrame with engine refolding results.
         pae_file_col:    Column containing paths to PAE .npy files.
         binder_length_col: Column with binder sequence length.
         pae_cutoff:      PAE cutoff in Å (default 10 Å, uniform across engines).
         base_dir:        Base directory for resolving relative PAE file paths.
+        prefix:          Output column prefix (e.g. 'protenix', 'af3').
+        ordering:        'binder_target' or 'target_binder' — how the PAE matrix
+                         is laid out. Protenix and AF3 both default to
+                         'target_binder' because we always put target first in
+                         the input JSON.
     """
     result = df.copy()
 
@@ -287,7 +295,7 @@ def add_boltz_ipsae_from_files(
 
         try:
             pae = np.load(str(resolved))
-            scores = compute_ipsae_from_pae(pae, int(L_b), pae_cutoff, ordering="binder_target")
+            scores = compute_ipsae_from_pae(pae, int(L_b), pae_cutoff, ordering=ordering)
             bt_ipsae_vals.append(scores["bt_ipsae"])
             tb_ipsae_vals.append(scores["tb_ipsae"])
             min_vals.append(scores["ipsae_min"])
@@ -295,16 +303,16 @@ def add_boltz_ipsae_from_files(
         except Exception as e:
             import warnings
 
-            warnings.warn(f"Failed to compute Boltz-2 ipSAE for {pae_path}: {e}")
+            warnings.warn(f"Failed to compute {prefix} ipSAE for {pae_path}: {e}")
             bt_ipsae_vals.append(np.nan)
             tb_ipsae_vals.append(np.nan)
             min_vals.append(np.nan)
             max_vals.append(np.nan)
 
-    result["boltz_pae_bt_ipsae"] = bt_ipsae_vals
-    result["boltz_pae_tb_ipsae"] = tb_ipsae_vals
-    result["boltz_pae_ipsae_min"] = min_vals
-    result["boltz_pae_ipsae_max"] = max_vals
+    result[f"{prefix}_bt_ipsae"] = bt_ipsae_vals
+    result[f"{prefix}_tb_ipsae"] = tb_ipsae_vals
+    result[f"{prefix}_ipsae_min"] = min_vals
+    result[f"{prefix}_ipsae_max"] = max_vals
 
     return result
 
@@ -331,7 +339,7 @@ def add_iptm_from_pae_files(
         pae_file_col:    Column containing paths to PAE .npy files.
         binder_length_col: Column with binder sequence length.
         ordering:        PAE matrix ordering ('binder_target' or 'target_binder').
-        prefix:          Column name prefix ('boltz' or 'af2').
+        prefix:          Column name prefix ('boltz', 'protenix', or 'af3').
         base_dir:        Base directory for resolving relative PAE file paths.
     """
     result = df.copy()
@@ -545,14 +553,18 @@ def compute_iptm_from_pae(
 def compute_agreement(df: pd.DataFrame, threshold: float = IPSAE_PASS_THRESHOLD) -> pd.DataFrame:
     """Count how many independent refolding engines agree a design passes.
 
-    Checks each of boltz_pae_ipsae_min, af2_ipsae_min (and ipsae_min as
-    promoted column) against *threshold*.  Designs with higher agreement
-    are stronger candidates.
+    Checks each available engine column against *threshold*. Designs with
+    higher agreement are stronger candidates.
+
+    Supported engine columns (present when the corresponding refolder has run):
+      - boltz_pae_ipsae_min  (Boltz-2; always present when the Evaluator runs)
+      - protenix_ipsae_min   (Protenix; x86 and aarch64, when the bindmaster_pxdesign env is present)
+      - af3_ipsae_min        (AlphaFold 3; aarch64 / DGX Spark, when weights are configured)
 
     Adds column 'agreement_count' (0–N, where N = number of available engines).
     """
     result = df.copy()
-    engine_cols = ["boltz_pae_ipsae_min", "af2_ipsae_min"]
+    engine_cols = ["boltz_pae_ipsae_min", "protenix_ipsae_min", "af3_ipsae_min"]
     count = pd.Series(0, index=df.index)
     for col in engine_cols:
         if col in result.columns:
@@ -610,14 +622,14 @@ def rank_by_adaptyv_method(df: pd.DataFrame) -> pd.DataFrame:
         ascending.append(False)
 
     # Tertiary: iptm
-    for col in ["iptm", "boltz_iptm", "af2_iptm"]:
+    for col in ["iptm", "boltz_iptm"]:
         if col in result.columns:
             sort_keys.append(col)
             ascending.append(False)
             break
 
     # Quaternary: pLDDT
-    for col in ["plddt_binder_mean", "boltz_plddt_binder_mean", "af2_plddt_binder_mean"]:
+    for col in ["plddt_binder_mean", "boltz_plddt_binder_mean"]:
         if col in result.columns:
             sort_keys.append(col)
             ascending.append(False)
@@ -650,7 +662,6 @@ def _best_ipsae_col(df: pd.DataFrame) -> str | None:
     for col in [
         "ipsae_min",  # promoted DunbrackLab PAE-based (10 Å cutoff)
         "boltz_pae_ipsae_min",  # Boltz-2 PAE-based (DunbrackLab, 10 Å cutoff)
-        "af2_ipsae_min",  # AF2 PAE-based (DunbrackLab, 10 Å cutoff)
         "ipsae_min_aux",  # Mosaic aux (max aggregation, renamed)
         "boltz_ipsae_min",  # Mosaic aux (pre-rename)
     ]:
diff --git a/Evaluator/binder_comparison/core/schema.py b/Evaluator/binder_comparison/core/schema.py
index 06b3c23..d29bdc7 100644
--- a/Evaluator/binder_comparison/core/schema.py
+++ b/Evaluator/binder_comparison/core/schema.py
@@ -7,7 +7,17 @@
 
 import numpy as np
 
-SourceTool = Literal["bindcraft", "boltzgen", "mosaic", "pxdesign", "rfaa", "unknown"]
+SourceTool = Literal[
+    "bindcraft",
+    "boltzgen",
+    "mosaic",
+    "pxdesign",
+    "rfaa",
+    "rfd3",
+    "proteina_complexa",
+    "protein_hunter",
+    "unknown",
+]
 
 
 @dataclass
@@ -29,17 +39,16 @@ class NativeMetrics:
 
 @dataclass
 class StandardisedMetrics:
-    """Metrics produced by running both refolding engines on every binder.
+    """Metrics produced by refolding every extracted binder with Boltz-2.
 
     Every binder gets these, regardless of which tool designed it.
 
     Canonical columns (iptm, ipae, etc.) are direct copies of Boltz-2 values.
-    AF2 columns remain available for cross-validation.
-    Boltz2-exclusive metrics (IPSAE family) have no AF2 equivalent.
+    Boltz2-exclusive metrics (IPSAE family) are pass-through.
 
     Scale notes:
-    - pLDDT: expected [0, 1] for both engines
-    - PAE: expected in Ångströms for both engines
+    - pLDDT: expected [0, 1]
+    - PAE: expected in Ångströms
     - IPSAE: Boltz2-specific score, higher = better interface contact
     """
 
@@ -53,16 +62,6 @@ class StandardisedMetrics:
     plddt_binder_min: float | None = None
     plddt_target_mean: float | None = None
 
-    # ---- AF2-specific values (pre-ensemble) ----
-    af2_iptm: float | None = None
-    af2_ipae: float | None = None
-    af2_pae_bt: float | None = None
-    af2_pae_tb: float | None = None
-    af2_pae_bb: float | None = None
-    af2_plddt_binder_mean: float | None = None
-    af2_plddt_binder_min: float | None = None
-    af2_plddt_target_mean: float | None = None
-
     # ---- Boltz2-specific values (pre-ensemble) ----
     boltz_iptm: float | None = None
     boltz_ipae: float | None = None
@@ -73,7 +72,7 @@ class StandardisedMetrics:
     boltz_plddt_binder_min: float | None = None
     boltz_plddt_target_mean: float | None = None
 
-    # ---- Boltz2-exclusive (no AF2 equivalent) ----
+    # ---- Boltz2-exclusive ----
     bt_ipsae: float | None = None  # Binder→target IPSAE, 6-sample avg
     tb_ipsae: float | None = None  # Target→binder IPSAE
     ipsae_min: float | None = None  # min(bt, tb) — worst-case interface contact
@@ -84,24 +83,52 @@ class StandardisedMetrics:
     target_contact: float | None = None  # Binder-target contacts
     pTMEnergy: float | None = None  # Boltz2 energy proxy (lower better)
 
+    # ---- Protenix v0.5.0 values (universal 2nd engine; runs in the bindmaster_pxdesign env) ----
+    # pLDDT is rescaled 0-100 → 0-1 on ingest so it's directly comparable to Boltz-2.
+    protenix_iptm: float | None = None
+    protenix_ptm: float | None = None
+    protenix_ranking_score: float | None = None  # 0.8*iptm + 0.2*ptm + 0.5*disorder - 100*has_clash
+    protenix_plddt_binder_mean: float | None = None
+    protenix_plddt_binder_min: float | None = None
+    protenix_plddt_target_mean: float | None = None
+    protenix_pae_bt: float | None = None
+    protenix_pae_tb: float | None = None
+    protenix_pae_bb: float | None = None
+    # DunbrackLab PAE-based ipSAE (added by report.py post-merge)
+    protenix_bt_ipsae: float | None = None
+    protenix_tb_ipsae: float | None = None
+    protenix_ipsae_min: float | None = None
+
+    # ---- AlphaFold 3 v3.0.2 values (aarch64 / DGX Spark only; wired in Part K) ----
+    af3_iptm: float | None = None
+    af3_ptm: float | None = None
+    af3_ranking_score: float | None = None
+    af3_plddt_binder_mean: float | None = None
+    af3_plddt_binder_min: float | None = None
+    af3_plddt_target_mean: float | None = None
+    af3_pae_bt: float | None = None
+    af3_pae_tb: float | None = None
+    af3_pae_bb: float | None = None
+    af3_bt_ipsae: float | None = None
+    af3_tb_ipsae: float | None = None
+    af3_ipsae_min: float | None = None
+
 
 @dataclass
 class PerResidueData:
-    """Raw per-residue arrays from refolding engines.
+    """Raw per-residue arrays from the refolding engines (Boltz-2, Protenix, AF3).
 
-    Both are normalised to [binder | target] ordering:
-    - Boltz2 native: [binder | target] — no change needed
-    - AF2 native: [target | binder] — must be swapped on load
+    Normalised to [binder | target] ordering (Boltz-2 native).
 
     pLDDT shape: [L_b + L_t]
     PAE shape:   [L_b + L_t, L_b + L_t]
     """
 
     binder_length: int | None = None
-    af2_plddt: np.ndarray | None = None
-    af2_pae: np.ndarray | None = None
     boltz_plddt: np.ndarray | None = None
     boltz_pae: np.ndarray | None = None
+    protenix_pae: np.ndarray | None = None
+    af3_pae: np.ndarray | None = None
 
 
 @dataclass
@@ -124,7 +151,7 @@ class MetricResult:
     standardised: StandardisedMetrics = field(default_factory=StandardisedMetrics)
     native: NativeMetrics = field(default_factory=NativeMetrics)
     per_residue: PerResidueData = field(default_factory=PerResidueData)
-    model_weights: dict[str, float] = field(default_factory=lambda: {"af2": 0.6, "boltz2": 0.4})
+    model_weights: dict[str, float] = field(default_factory=lambda: {"boltz2": 1.0})
 
     def to_flat_dict(self) -> dict:
         """Flatten all metrics into a single dict for CSV export."""
@@ -150,7 +177,7 @@ class ComparisonReport:
     summary_statistics: dict[str, dict[str, float]]  # metric → {mean, std, min, max}
     z_scores: dict[str, dict[str, float]]  # binder_id → metric → z_score
     rankings: dict[str, list[str]]  # metric → ordered binder_ids
-    model_weights: dict[str, float] = field(default_factory=lambda: {"af2": 0.6, "boltz2": 0.4})
+    model_weights: dict[str, float] = field(default_factory=lambda: {"boltz2": 1.0})
 
 
 # Metrics where lower is better (for correct ranking direction)
@@ -160,14 +187,16 @@ class ComparisonReport:
         "pae_bt",
         "pae_tb",
         "pae_bb",
-        "af2_ipae",
-        "af2_pae_bt",
-        "af2_pae_tb",
-        "af2_pae_bb",
         "boltz_ipae",
         "boltz_pae_bt",
         "boltz_pae_tb",
         "boltz_pae_bb",
+        "protenix_pae_bt",
+        "protenix_pae_tb",
+        "protenix_pae_bb",
+        "af3_pae_bt",
+        "af3_pae_tb",
+        "af3_pae_bb",
         "pTMEnergy",
     }
 )
@@ -217,11 +246,24 @@ class ComparisonReport:
     "boltz_pae_bt_ipsae",
     "boltz_pae_tb_ipsae",
     "boltz_pae_ipsae_min",
-    # AF2 PAE-based ipSAE (DunbrackLab formula, 10 Å cutoff)
-    "af2_bt_ipsae",
-    "af2_tb_ipsae",
-    "af2_ipsae_min",
+    # Protenix DunbrackLab ipSAE + summary metrics
+    "protenix_iptm",
+    "protenix_ptm",
+    "protenix_ranking_score",
+    "protenix_plddt_binder_mean",
+    "protenix_bt_ipsae",
+    "protenix_tb_ipsae",
+    "protenix_ipsae_min",
+    "protenix_pae_iptm",
+    # AF3 DunbrackLab ipSAE + summary metrics (aarch64 / DGX Spark only)
+    "af3_iptm",
+    "af3_ptm",
+    "af3_ranking_score",
+    "af3_plddt_binder_mean",
+    "af3_bt_ipsae",
+    "af3_tb_ipsae",
+    "af3_ipsae_min",
+    "af3_pae_iptm",
     # ipTM computed independently from PAE matrices
     "boltz_pae_iptm",
-    "af2_pae_iptm",
 ]
diff --git a/Evaluator/binder_comparison/extractors/__init__.py b/Evaluator/binder_comparison/extractors/__init__.py
index 43dffba..9209bb9 100644
--- a/Evaluator/binder_comparison/extractors/__init__.py
+++ b/Evaluator/binder_comparison/extractors/__init__.py
@@ -2,16 +2,20 @@
 from .bindcraft import BindCraftExtractor
 from .boltzgen import BoltzGenExtractor
 from .mosaic import MosaicExtractor
+from .protein_hunter import ProteinHunterExtractor
 from .proteina_complexa import ProteinaComplexaExtractor
 from .pxdesign import PXDesignExtractor
 from .rfaa import RFAAExtractor
+from .rfd3 import RFD3Extractor
 
 __all__ = [
     "BindCraftExtractor",
     "BoltzGenExtractor",
     "MosaicExtractor",
     "PXDesignExtractor",
+    "ProteinHunterExtractor",
     "ProteinaComplexaExtractor",
     "RFAAExtractor",
+    "RFD3Extractor",
     "SequenceExtractor",
 ]
diff --git a/Evaluator/binder_comparison/extractors/protein_hunter.py b/Evaluator/binder_comparison/extractors/protein_hunter.py
new file mode 100644
index 0000000..e605ef2
--- /dev/null
+++ b/Evaluator/binder_comparison/extractors/protein_hunter.py
@@ -0,0 +1,126 @@
+"""Protein-Hunter sequence extractor.
+
+Protein-Hunter (Cho et al. 2025, bioRxiv 10.1101/2025.10.10.681530) runs
+Boltz-2 / Chai-1 multi-cycle hallucination and writes per-job outputs into
+``results_boltz/<name>/`` or ``results_chai/<name>/``:
+
+  - summary_high_iptm.csv  — successes with iptm > threshold & %X filter
+  - summary_all_runs.csv   — every run, all cycles (best_* columns)
+  - high_iptm_pdb/*.pdb    — PDBs of passing designs
+  - high_iptm_yaml/*.yaml  — Boltz-formatted YAMLs for AF3 rerun
+
+Default: read ``summary_high_iptm.csv`` (already filtered by ipTM + %X — tracks
+the Mosaic ``is_top=1`` pattern). Pass ``all_runs=True`` to return every
+``best_seq`` from ``summary_all_runs.csv`` instead.
+
+Note: Protein-Hunter has no PyRosetta interface metrics; NativeMetrics is empty.
+      Cross-validation metrics come from the standardised refolding pipeline
+      (Boltz-2 / Protenix / AF3).
+"""
+
+from __future__ import annotations
+
+import warnings
+from pathlib import Path
+
+import pandas as pd
+
+from ..core.schema import ExtractedBinder, NativeMetrics
+from .base import SequenceExtractor
+
+_HIGH_IPTM_CSV = "summary_high_iptm.csv"
+_ALL_RUNS_CSV = "summary_all_runs.csv"
+
+
+class ProteinHunterExtractor(SequenceExtractor):
+    """Extract binder sequences from a Protein-Hunter results directory."""
+
+    def __init__(self, *, all_runs: bool = False):
+        self.all_runs = all_runs
+
+    @property
+    def tool_name(self) -> str:
+        return "protein_hunter"
+
+    def extract(self, input_dir: str | Path) -> list[ExtractedBinder]:
+        input_dir = Path(input_dir)
+        if self.all_runs:
+            return self._extract_all_runs(input_dir)
+        return self._extract_high_iptm(input_dir)
+
+    def _extract_high_iptm(self, input_dir: Path) -> list[ExtractedBinder]:
+        csv_path = self._find_csv(input_dir, _HIGH_IPTM_CSV)
+        if csv_path is None:
+            warnings.warn(
+                f"Protein-Hunter: no {_HIGH_IPTM_CSV} found in {input_dir}. "
+                f"Fall back to --all-protein-hunter-designs to read {_ALL_RUNS_CSV}."
+            )
+            return []
+
+        df = pd.read_csv(csv_path)
+        if "sequence" not in df.columns:
+            raise ValueError(
+                f"Protein-Hunter CSV {csv_path} missing 'sequence' column. Available: {list(df.columns[:10])}"
+            )
+
+        results: list[ExtractedBinder] = []
+        for idx, row in df.iterrows():
+            seq = str(row["sequence"]).strip().upper()
+            if not self._validate_sequence(seq):
+                warnings.warn(f"Protein-Hunter row {idx}: invalid sequence — skipping")
+                continue
+            results.append(
+                ExtractedBinder(
+                    binder_id=self._make_id(row, int(idx)),
+                    sequence=seq,
+                    source_tool="protein_hunter",
+                    native=NativeMetrics(),
+                )
+            )
+        return results
+
+    def _extract_all_runs(self, input_dir: Path) -> list[ExtractedBinder]:
+        """Read summary_all_runs.csv and return the best sequence per run."""
+        csv_path = self._find_csv(input_dir, _ALL_RUNS_CSV)
+        if csv_path is None:
+            warnings.warn(f"Protein-Hunter: no {_ALL_RUNS_CSV} found in {input_dir}.")
+            return []
+
+        df = pd.read_csv(csv_path)
+        seq_col = "best_seq" if "best_seq" in df.columns else "sequence"
+        if seq_col not in df.columns:
+            raise ValueError(
+                f"Protein-Hunter CSV {csv_path} missing 'best_seq' or 'sequence' column. "
+                f"Available: {list(df.columns[:10])}"
+            )
+
+        results: list[ExtractedBinder] = []
+        for idx, row in df.iterrows():
+            seq = str(row[seq_col]).strip().upper()
+            if not self._validate_sequence(seq):
+                continue
+            results.append(
+                ExtractedBinder(
+                    binder_id=self._make_id(row, int(idx)),
+                    sequence=seq,
+                    source_tool="protein_hunter",
+                    native=NativeMetrics(),
+                )
+            )
+        return results
+
+    def _find_csv(self, input_dir: Path, name: str) -> Path | None:
+        direct = input_dir / name
+        if direct.exists():
+            return direct
+        matches = list(input_dir.rglob(name))
+        return matches[0] if matches else None
+
+    def _make_id(self, row: pd.Series, fallback_idx: int) -> str:
+        run_id = row.get("run_id")
+        cycle = row.get("cycle")
+        if pd.notna(run_id) and pd.notna(cycle):
+            return f"protein_hunter_{run_id}_c{int(cycle)}"
+        if pd.notna(run_id):
+            return f"protein_hunter_{run_id}"
+        return f"protein_hunter_{fallback_idx}"
diff --git a/Evaluator/binder_comparison/extractors/proteina_complexa.py b/Evaluator/binder_comparison/extractors/proteina_complexa.py
index 29851b9..c8a4655 100644
--- a/Evaluator/binder_comparison/extractors/proteina_complexa.py
+++ b/Evaluator/binder_comparison/extractors/proteina_complexa.py
@@ -11,9 +11,10 @@
   - 'self_complex_pLDDT'     — AF2 pLDDT from internal eval
   - 'self_binder_scRMSD'     — binder self-consistency RMSD
 
-Note: Complexa's internal AF2 scores are used as a reward signal during
-generation. For cross-tool comparison, we re-fold everything with our
-standardised Boltz-2 and AF2 pipeline.
+Note: Complexa's internal AF2 scores (tool-internal, not Evaluator refolding)
+are used as a reward signal during generation. For cross-tool comparison, we
+re-fold every binder sequence with our standardised Boltz-2 pipeline (plus
+Protenix on x86 and AF3 on aarch64 where configured).
 """
 
 from __future__ import annotations
diff --git a/Evaluator/binder_comparison/extractors/pxdesign.py b/Evaluator/binder_comparison/extractors/pxdesign.py
index b5c4e81..e2fb230 100644
--- a/Evaluator/binder_comparison/extractors/pxdesign.py
+++ b/Evaluator/binder_comparison/extractors/pxdesign.py
@@ -63,8 +63,9 @@ def extract(self, input_dir: str | Path) -> list[ExtractedBinder]:
             binder_id = self._make_id(row, idx)
 
             # NativeMetrics left empty: PXDesign's own af2_iptm/ptx_iptm are
-            # biased (optimised against).  Our refolding provides independent
-            # assessment via standardised Boltz-2 and AF2 metrics.
+            # biased (optimised against). Our refolding provides independent
+            # assessment via standardised Boltz-2 metrics (plus Protenix/AF3
+            # where configured).
             results.append(
                 ExtractedBinder(
                     binder_id=binder_id,
diff --git a/Evaluator/binder_comparison/extractors/rfd3.py b/Evaluator/binder_comparison/extractors/rfd3.py
new file mode 100644
index 0000000..2ed7e77
--- /dev/null
+++ b/Evaluator/binder_comparison/extractors/rfd3.py
@@ -0,0 +1,124 @@
+"""RFD3 (foundry) sequence extractor.
+
+RFD3 (RosettaCommons/foundry, Butcher et al. 2025) replaces RFAA. The Hydra-
+driven `rfd3 design` CLI writes per-trajectory outputs beneath the out_dir,
+typically including PDB files and a results manifest.
+
+This extractor is defensive about the exact layout (the foundry output schema
+may tighten up in future releases):
+
+  1. Prefer a top-level CSV with a ``sequence`` column (common naming:
+     ``results.csv`` / ``designs.csv`` / ``rfd3_designs.csv``).
+  2. Fall back to a recursive scan for ``*.fasta`` / ``*.fa`` sequence files
+     (PDB files are not parsed).
+  3. Emit a warning and return ``[]`` when neither pattern matches — this
+     lets the caller inform the user without raising.
+
+Sequences designed post-diffusion by RFD3's integrated ``foundry/models/mpnn``
+pass (ProteinMPNN / LigandMPNN) live in the same directory.
+"""
+
+from __future__ import annotations
+
+import warnings
+from pathlib import Path
+
+import pandas as pd
+
+from ..core.schema import ExtractedBinder, NativeMetrics
+from .base import SequenceExtractor
+
+_CSV_CANDIDATES = ["results.csv", "designs.csv", "rfd3_designs.csv", "summary.csv"]
+_SEQUENCE_COLS = ("sequence", "Sequence", "designed_sequence", "binder_sequence")
+
+
+class RFD3Extractor(SequenceExtractor):
+    """Extract binder sequences from an RFD3 / foundry output directory."""
+
+    @property
+    def tool_name(self) -> str:
+        return "rfd3"
+
+    def extract(self, input_dir: str | Path) -> list[ExtractedBinder]:
+        input_dir = Path(input_dir)
+        csv_results = self._extract_from_csv(input_dir)
+        if csv_results:
+            return csv_results
+        fasta_results = self._extract_from_fasta(input_dir)
+        if fasta_results:
+            return fasta_results
+        warnings.warn(f"RFD3: no CSV (tried {_CSV_CANDIDATES}) or *.fasta with sequences found under {input_dir}.")
+        return []
+
+    def _extract_from_csv(self, input_dir: Path) -> list[ExtractedBinder]:
+        csv_path = self._find_csv(input_dir)
+        if csv_path is None:
+            return []
+        df = pd.read_csv(csv_path)
+        seq_col = next((c for c in _SEQUENCE_COLS if c in df.columns), None)
+        if seq_col is None:
+            warnings.warn(
+                f"RFD3 CSV {csv_path} missing sequence column. Tried {_SEQUENCE_COLS}. "
+                f"Available: {list(df.columns[:10])}"
+            )
+            return []
+
+        results: list[ExtractedBinder] = []
+        for idx, row in df.iterrows():
+            seq = str(row[seq_col]).strip().upper()
+            if not self._validate_sequence(seq):
+                continue
+            results.append(
+                ExtractedBinder(
+                    binder_id=self._make_id(row, int(idx)),
+                    sequence=seq,
+                    source_tool="rfd3",
+                    native=NativeMetrics(),
+                )
+            )
+        return results
+
+    def _extract_from_fasta(self, input_dir: Path) -> list[ExtractedBinder]:
+        from ..io.read import read_fasta
+
+        fastas = list(input_dir.rglob("*.fasta")) + list(input_dir.rglob("*.fa"))
+        if not fastas:
+            return []
+
+        results: list[ExtractedBinder] = []
+        for fp in fastas:
+            try:
+                entries = read_fasta(fp)
+            except Exception:
+                continue
+            for idx, (header, seq) in enumerate(entries):
+                seq = seq.strip().upper()
+                if not self._validate_sequence(seq):
+                    continue
+                binder_id = header.split()[0] if header else f"{fp.stem}_{idx}"
+                results.append(
+                    ExtractedBinder(
+                        binder_id=f"rfd3_{binder_id}",
+                        sequence=seq,
+                        source_tool="rfd3",
+                        native=NativeMetrics(),
+                    )
+                )
+        return results
+
+    def _find_csv(self, input_dir: Path) -> Path | None:
+        for name in _CSV_CANDIDATES:
+            direct = input_dir / name
+            if direct.exists():
+                return direct
+        for name in _CSV_CANDIDATES:
+            hits = list(input_dir.rglob(name))
+            if hits:
+                return hits[0]
+        return None
+
+    def _make_id(self, row: pd.Series, fallback_idx: int) -> str:
+        for key in ("design_id", "name", "run_id", "trajectory", "id"):
+            if key in row.index and pd.notna(row[key]):
+                return f"rfd3_{row[key]}"
+        return f"rfd3_{fallback_idx}"
diff --git a/Evaluator/binder_comparison/io/read.py b/Evaluator/binder_comparison/io/read.py
index 37b0be5..4e01102 100644
--- a/Evaluator/binder_comparison/io/read.py
+++ b/Evaluator/binder_comparison/io/read.py
@@ -388,7 +388,6 @@ def _cif_atom_site_seq(text: str, path: Path) -> str:
 def convert_cif_to_pdb(cif_path: str | Path, pdb_path: str | Path) -> Path:
     """Convert an mmCIF file to PDB format using BioPython.
 
-    BioPython is available in the binder-eval-af2 environment (via colabdesign).
     Raises ImportError if BioPython is not installed.
     """
     from Bio.PDB import PDBIO, MMCIFParser  # type: ignore
diff --git a/Evaluator/binder_comparison/main.py b/Evaluator/binder_comparison/main.py
index 84ecf6c..9e14b07 100644
--- a/Evaluator/binder_comparison/main.py
+++ b/Evaluator/binder_comparison/main.py
@@ -1,25 +1,27 @@
 """Entry point for the binder-compare CLI.
 
 Subcommands:
-    extract       — pull sequences from tool outputs
-    refold-boltz2 — refold with Boltz2 (run in 'mosaic' env)
-    refold-af2    — refold with AF2 (run in 'bindcraft_pr' env)
-    report        — merge, ensemble, normalise, generate HTML report
-    run           — full pipeline orchestrator
-    validate      — sanity-check input sequences before refolding
+    extract         — pull sequences from tool outputs
+    parse-seqs      — convert sequences from any format to FASTA
+    refold-boltz2   — refold with Boltz-2 (run in Mosaic venv)
+    refold-protenix — refold with Protenix v0.5.0 (run in bindmaster_pxdesign env)
+    report          — merge, normalise, generate HTML report
+    run             — full pipeline orchestrator
+    validate        — sanity-check input sequences before refolding
 """
 
 from __future__ import annotations
 
 import argparse
 
-from .cli import extract, parse_seqs, refold_af2, refold_boltz2, report, run, validate
+from .cli import extract, parse_seqs, refold_boltz2, refold_protenix, report, run, validate
 
 
 def main(argv=None) -> None:
     parser = argparse.ArgumentParser(
         prog="binder-compare",
-        description="Compare binder designs from BindCraft, BoltzGen, and Mosaic.",
+        description="Compare binder designs from BindCraft, BoltzGen, Mosaic, "
+        "PXDesign, Proteina-Complexa, and Protein-Hunter.",
         formatter_class=argparse.RawDescriptionHelpFormatter,
     )
     parser.add_argument("--version", action="version", version="%(prog)s 0.1.0")
@@ -30,7 +32,7 @@ def main(argv=None) -> None:
     extract.add_parser(subparsers)
     parse_seqs.add_parser(subparsers)
     refold_boltz2.add_parser(subparsers)
-    refold_af2.add_parser(subparsers)
+    refold_protenix.add_parser(subparsers)
     report.add_parser(subparsers)
     run.add_parser(subparsers)
     validate.add_parser(subparsers)
diff --git a/Evaluator/binder_comparison/refolding/__init__.py b/Evaluator/binder_comparison/refolding/__init__.py
index 7dbb0ab..7e8cbe0 100644
--- a/Evaluator/binder_comparison/refolding/__init__.py
+++ b/Evaluator/binder_comparison/refolding/__init__.py
@@ -1,4 +1,4 @@
-from .af2_runner import run_af2_refold
 from .boltz2_runner import run_boltz2_refold
+from .protenix_runner import run_protenix_refold
 
-__all__ = ["run_af2_refold", "run_boltz2_refold"]
+__all__ = ["run_boltz2_refold", "run_protenix_refold"]
diff --git a/Evaluator/binder_comparison/refolding/af2_runner.py b/Evaluator/binder_comparison/refolding/af2_runner.py
deleted file mode 100644
index e2b4d1a..0000000
--- a/Evaluator/binder_comparison/refolding/af2_runner.py
+++ /dev/null
@@ -1,138 +0,0 @@
-"""AF2 refolding runner.
-
-Wraps Mosaic/refold_Version6.refold_batch_af2() to evaluate a batch of
-binder sequences against a target using AlphaFold2 (ColabDesign).
-
-Must be run in the 'bindcraft_pr' conda environment:
-    conda run -n bindcraft_pr binder-compare refold-af2 ...
-
-Output CSV columns (from refold_Version6):
-    run_id, idx, sequence, target_pdb, binder_length,
-    af2_iptm,
-    af2_plddt_binder_mean, af2_plddt_binder_min, af2_plddt_binder_max,
-    af2_plddt_target_mean,
-    af2_pae_bt_mean, af2_pae_tb_mean, af2_ipae,
-    af2_pae_bb_mean, af2_pae_tt_mean, af2_pae_overall_mean, af2_pae_max,
-    pdb,
-    af2_pae_file   ← path to PAE .npy; enables AF2 ipSAE via Dunbrack formula
-
-Array ordering: refold_Version6 uses [target | binder] ordering.
-PAE .npy files: same [target | binder] ordering — scoring.py reads with
-    ordering="target_binder" to normalise before ipSAE computation.
-"""
-
-from __future__ import annotations
-
-import sys
-from pathlib import Path
-
-
-def run_af2_refold(
-    sequences: list[str],
-    target_pdb_path: str | Path,
-    output_dir: str | Path,
-    output_csv: str | Path,
-    *,
-    models: list[int] | None = None,
-    num_recycles: int = 3,
-    mosaic_path: str | Path | None = None,
-    resume: bool = False,
-) -> None:
-    """Refold *sequences* against *target_pdb_path* using AF2 (ColabDesign).
-
-    Args:
-        sequences:       List of binder amino acid strings.
-        target_pdb_path: Path to the target PDB file (chain A used).
-        output_dir:      Directory where structure PDB files are written.
-        output_csv:      Path for the output CSV of metrics.
-        models:          AF2 model indices to use (default: [1]).
-        num_recycles:    Number of recycling iterations (default: 3).
-        mosaic_path:     Path to the Mosaic repo root. Auto-detected if None.
-        resume:          If True, skip binders already present in existing output CSV.
-    """
-    output_dir = Path(output_dir).resolve()
-    output_csv = Path(output_csv).resolve()
-    target_pdb_path = Path(target_pdb_path).resolve()
-    output_dir.mkdir(parents=True, exist_ok=True)
-    output_csv.parent.mkdir(parents=True, exist_ok=True)
-
-    if not target_pdb_path.exists():
-        raise FileNotFoundError(f"Target PDB not found: {target_pdb_path}")
-
-    skip_indices: set[int] = set()
-    if resume:
-        skip_indices = _load_completed_indices(output_csv)
-        if skip_indices:
-            print(f"[af2] Resuming — skipping {len(skip_indices)} already-completed binders")
-
-    # ColabDesign 1.1.1 uses jax.tree_* APIs removed in JAX 0.6.0.
-    # Restore them as shims so older code works transparently.
-    import jax
-
-    _tree_shims = {
-        "tree_map": "map",
-        "tree_leaves": "leaves",
-        "tree_flatten": "flatten",
-        "tree_unflatten": "unflatten",
-        "tree_structure": "structure",
-    }
-    for old_name, new_name in _tree_shims.items():
-        if not hasattr(jax, old_name):
-            setattr(jax, old_name, getattr(jax.tree, new_name))
-
-    # Import refold_Version6: try Mosaic root first, fall back to bundled copy
-    # in Evaluator/scripts/ (version-controlled).
-    mosaic_root = _resolve_mosaic_path(mosaic_path)
-    scripts_dir = str(Path(__file__).resolve().parent.parent.parent / "scripts")
-    sys.path.insert(0, str(mosaic_root))
-    sys.path.insert(1, scripts_dir)
-    from refold_Version6 import refold_batch_af2
-
-    # Filter out already-completed binders if resuming.
-    # refold_Version6 uses 1-based indexing so binder i maps to sequences[i-1].
-    if skip_indices:
-        filtered = [seq for i, seq in enumerate(sequences, 1) if i not in skip_indices]
-        print(f"[af2] After resume filter: {len(filtered)} of {len(sequences)} to process")
-        sequences = filtered
-
-    refold_batch_af2(
-        binder_sequences=sequences,
-        target_pdb_path=str(target_pdb_path),
-        output_dir=str(output_dir),
-        csv_path=str(output_csv),
-        models=models if models is not None else [0],
-        num_recycles=num_recycles,
-    )
-    print(f"[af2] Results → {output_csv}")
-
-
-def _load_completed_indices(csv_path: Path) -> set[int]:
-    """Read existing CSV and return a set of completed 1-based binder indices."""
-    if not csv_path.exists():
-        return set()
-    try:
-        import csv
-
-        indices: set[int] = set()
-        with open(csv_path) as f:
-            reader = csv.DictReader(f)
-            for row in reader:
-                idx_val = row.get("idx")
-                if idx_val is not None:
-                    indices.add(int(idx_val))
-        return indices
-    except Exception:
-        return set()
-
-
-def _resolve_mosaic_path(override: str | Path | None) -> Path:
-    if override is not None:
-        p = Path(override)
-        if not p.exists():
-            raise FileNotFoundError(f"Mosaic path not found: {p}")
-        return p
-    repo_root = Path(__file__).parent.parent.parent
-    candidate = repo_root / "Mosaic"
-    if candidate.exists():
-        return candidate
-    raise FileNotFoundError(f"Could not locate Mosaic directory at {candidate}. Pass --mosaic-path explicitly.")
diff --git a/Evaluator/binder_comparison/refolding/protenix_runner.py b/Evaluator/binder_comparison/refolding/protenix_runner.py
new file mode 100644
index 0000000..79dabb3
--- /dev/null
+++ b/Evaluator/binder_comparison/refolding/protenix_runner.py
@@ -0,0 +1,165 @@
+"""Protenix refolding runner.
+
+Wraps scripts/refold_protenix.refold_batch() to evaluate a batch of
+binder sequences against a target using Protenix v0.5.0 (ByteDance's open-
+source AlphaFold3 reimplementation).
+
+Must be run inside the ``bindmaster_pxdesign`` conda env, which ships
+Protenix pinned by the PXDesign installer:
+
+    conda run -n bindmaster_pxdesign binder-compare refold-protenix ...
+
+Output CSV columns (from refold_protenix, pLDDT rescaled to 0–1):
+    run_id, idx, sequence, target_sequence, binder_length,
+    iptm, ptm, ranking_score,
+    plddt_binder_mean, plddt_binder_min, plddt_target_mean,
+    pae_bt_mean, pae_tb_mean, pae_bb_mean, pae_overall_mean, pae_max,
+    cif, pdb, pae_file
+"""
+
+from __future__ import annotations
+
+import os
+import shutil
+import sys
+from pathlib import Path
+
+
+def run_protenix_refold(
+    sequences: list[str],
+    target_sequence: str,
+    output_dir: str | Path,
+    output_csv: str | Path,
+    *,
+    num_samples: int = 5,
+    num_seeds: int = 1,
+    use_msa: bool = False,
+    n_cycle: int = 10,
+    n_step: int = 200,
+    scripts_path: str | Path | None = None,
+    resume: bool = False,
+) -> None:
+    """Refold *sequences* against *target_sequence* using Protenix v0.5.0.
+
+    Args:
+        sequences:       Binder amino acid strings.
+        target_sequence: Target protein sequence.
+        output_dir:      Directory where Protenix output (predictions/, *.npy)
+                         is written.
+        output_csv:      Path for the metrics CSV.
+        num_samples:     Protenix diffusion samples per seed (default: 5).
+        num_seeds:       Number of random seeds, starting at 101 (default: 1).
+        use_msa:         Request ColabFold MMseqs2 MSAs? Default False — MSA-free
+                         inference is much faster and needs no internet access.
+        n_cycle:         Pairformer recycling iterations (Protenix default: 10).
+        n_step:          Diffusion steps (Protenix default: 200).
+        scripts_path:    Override path to scripts/ (auto-detected).
+        resume:          If True, skip binders with rows already in output_csv.
+    """
+    output_dir = Path(output_dir).resolve()
+    output_csv = Path(output_csv).resolve()
+    output_dir.mkdir(parents=True, exist_ok=True)
+
+    scripts_dir = _resolve_scripts_path(scripts_path)
+
+    skip_indices: set[int] = set()
+    if resume and output_csv.exists():
+        skip_indices = _load_completed_indices(output_csv)
+        if skip_indices:
+            print(f"[protenix] Resuming — skipping {len(skip_indices)} already-completed binders")
+
+    old_cwd = os.getcwd()
+    os.chdir(output_dir)
+    try:
+        sys.path.insert(0, str(scripts_dir))
+        from refold_protenix import refold_batch
+
+        refold_batch(
+            binder_sequences=sequences,
+            target_sequence=target_sequence,
+            output_dir=output_dir,
+            output_csv=output_csv,
+            num_samples=num_samples,
+            num_seeds=num_seeds,
+            use_msa=use_msa,
+            n_cycle=n_cycle,
+            n_step=n_step,
+            skip_indices=skip_indices,
+        )
+    finally:
+        os.chdir(old_cwd)
+
+    if not output_csv.exists():
+        raise FileNotFoundError(f"Expected refold_protenix to write {output_csv} but it was not found.")
+
+    # Absolutise CSV path columns so downstream tools (merger/report) can find artefacts.
+    _absolutize_csv_paths(output_csv, output_dir, ["cif", "pdb", "pae_file"])
+    print(f"[protenix] Results → {output_csv}")
+
+
+def _load_completed_indices(csv_path: Path) -> set[int]:
+    """Read existing CSV and return a set of already-populated 1-based indices."""
+    if not csv_path.exists():
+        return set()
+    try:
+        import csv
+
+        indices: set[int] = set()
+        with csv_path.open(newline="") as f:
+            reader = csv.DictReader(f)
+            for row in reader:
+                idx_val = row.get("idx")
+                if idx_val is not None:
+                    try:
+                        indices.add(int(idx_val))
+                    except ValueError:
+                        continue
+        return indices
+    except Exception:  # unreadable or malformed CSV: treat as nothing completed
+        return set()
+
+
+def _absolutize_csv_paths(csv_path: Path, base_dir: Path, path_cols: list[str]) -> None:
+    """Rewrite relative path columns in a CSV to absolute using *base_dir*."""
+    import csv as csv_mod
+
+    rows: list[dict[str, str]] = []
+    fieldnames: list[str] | None = None
+    with csv_path.open(newline="") as f:
+        reader = csv_mod.DictReader(f)
+        fieldnames = reader.fieldnames
+        for row in reader:
+            for col in path_cols:
+                val = row.get(col, "")
+                if val and not Path(val).is_absolute():
+                    row[col] = str((base_dir / val).resolve())
+            rows.append(row)
+
+    if fieldnames is None:
+        return
+    with csv_path.open("w", newline="") as f:
+        writer = csv_mod.DictWriter(f, fieldnames=fieldnames)
+        writer.writeheader()
+        writer.writerows(rows)
+
+
+def _resolve_scripts_path(override: str | Path | None) -> Path:
+    if override is not None:
+        p = Path(override)
+        if not p.exists():
+            raise FileNotFoundError(f"scripts path not found: {p}")
+        return p
+    # Default: <repo_root>/scripts (repo_root is two directories above this module's package dir)
+    repo_root = Path(__file__).parent.parent.parent
+    candidate = repo_root / "scripts"
+    if candidate.exists():
+        return candidate
+    raise FileNotFoundError(f"Could not locate scripts directory at {candidate}. Pass --scripts-path explicitly.")
+
+
+# Remove stale CIF/PAE outputs left over from previous runs
+def cleanup_stale_outputs(output_dir: str | Path) -> None:
+    """Remove Protenix's predictions/ tree to start from a clean state."""
+    pred = Path(output_dir) / "predictions"
+    if pred.exists():
+        shutil.rmtree(pred, ignore_errors=True)
diff --git a/Evaluator/binder_comparison/visualization/__init__.py b/Evaluator/binder_comparison/visualization/__init__.py
index 58d51fd..6f3dece 100644
--- a/Evaluator/binder_comparison/visualization/__init__.py
+++ b/Evaluator/binder_comparison/visualization/__init__.py
@@ -1,5 +1,4 @@
 from .plots import (
-    plot_af2_vs_boltz2_scatter,
     plot_metric_distributions,
     plot_pae_heatmaps,
     plot_plddt_curves,
@@ -9,7 +8,6 @@
 
 __all__ = [
     "generate_report",
-    "plot_af2_vs_boltz2_scatter",
     "plot_metric_distributions",
     "plot_pae_heatmaps",
     "plot_plddt_curves",
diff --git a/Evaluator/binder_comparison/visualization/plots.py b/Evaluator/binder_comparison/visualization/plots.py
index de4cb0b..b1026a0 100644
--- a/Evaluator/binder_comparison/visualization/plots.py
+++ b/Evaluator/binder_comparison/visualization/plots.py
@@ -21,7 +21,9 @@
     "mosaic": "#4CAF50",  # green
     "pxdesign": "#9C27B0",  # purple
     "proteina_complexa": "#795548",  # brown
-    "rfaa": "#C62828",  # red
+    "rfaa": "#C62828",  # red (legacy)
+    "rfd3": "#D84315",  # deep-orange (RFD3, current-gen RFAA replacement)
+    "protein_hunter": "#00838F",  # teal-cyan
     "unknown": "#9E9E9E",  # grey
 }
 
@@ -33,6 +35,8 @@
     "bindcraft": "BindCraft",
     "proteina_complexa": "Proteina-Complexa",
     "rfaa": "RFAA",
+    "rfd3": "RFD3",
+    "protein_hunter": "Protein-Hunter",
 }
 
 
@@ -51,19 +55,13 @@ def _tool_display(name: str) -> str:
     "boltz_pae_bt_ipsae": ("Boltz ipSAE B→T", "[0–1]", "↑"),
     "boltz_pae_tb_ipsae": ("Boltz ipSAE T→B", "[0–1]", "↑"),
     "boltz_pae_ipsae_min": ("Boltz ipSAE_min", "[0–1]", "↑"),
-    "af2_ipsae_min": ("AF2 ipSAE_min", "[0–1]", "↑"),
-    "af2_bt_ipsae": ("AF2 ipSAE (B→T)", "[0–1]", "↑"),
-    "af2_tb_ipsae": ("AF2 ipSAE (T→B)", "[0–1]", "↑"),
     "iptm": ("ipTM", "[0–1]", "↑"),
-    "af2_iptm": ("AF2 ipTM", "[0–1]", "↑"),
     "boltz_pae_iptm": ("Boltz ipTM (PAE)", "[0–1]", "↑"),
-    "af2_pae_iptm": ("AF2 ipTM (PAE)", "[0–1]", "↑"),
     "binder_ptm": ("Binder pTM", "[0–1]", "↑"),
     "plddt_binder_mean": ("pLDDT binder (mean)", "[0–1]", "↑"),
     "plddt_binder_min": ("pLDDT binder (min)", "[0–1]", "↑"),
     "plddt_target_mean": ("pLDDT target (mean)", "[0–1]", "↑"),
     "ipae": ("ipAE", "Å", "↓"),
-    "af2_ipae": ("AF2 ipAE", "Å", "↓"),
     "pae_bt": ("PAE (B→T)", "Å", "↓"),
     "pae_tb": ("PAE (T→B)", "Å", "↓"),
     "pae_bb": ("PAE (intra-B)", "Å", "↓"),
@@ -140,41 +138,37 @@ def plot_plddt_curves(
 
 def plot_pae_heatmaps(
     sequences: list[str],
-    af2_pae_data: dict[str, np.ndarray],
     boltz_pae_data: dict[str, np.ndarray],
     binder_lengths: dict[str, int],
     max_binders: int = 6,
 ) -> Figure:
-    """Side-by-side AF2 / Boltz2 PAE heatmaps for the top binders.
+    """Boltz-2 PAE heatmaps for the top binders.
 
-    Shows both models for each binder to make the comparison visual.
+    Additional engines (Protenix on x86, AF3 on aarch64) will be added as
+    extra columns in later refactor parts.
     """
-    seqs = [s for s in sequences if s in af2_pae_data or s in boltz_pae_data][:max_binders]
+    seqs = [s for s in sequences if s in boltz_pae_data][:max_binders]
     n = len(seqs)
     if n == 0:
         fig, ax = plt.subplots()
         ax.text(0.5, 0.5, "No PAE data available", ha="center", va="center")
         return fig
 
-    fig, axes = plt.subplots(n, 2, figsize=(10, 3 * n), squeeze=False)
+    fig, axes = plt.subplots(n, 1, figsize=(6, 3 * n), squeeze=False)
 
     for row_i, seq in enumerate(seqs):
         L_b = binder_lengths.get(seq, 0)
-        for col_i, (label, pae_dict) in enumerate([("AF2", af2_pae_data), ("Boltz2", boltz_pae_data)]):
-            ax = axes[row_i][col_i]
-            if seq in pae_dict:
-                pae = np.array(pae_dict[seq])
-                im = ax.imshow(pae, vmin=0, vmax=30, cmap="bwr", aspect="auto")
-                if L_b > 0 and L_b < pae.shape[0]:
-                    ax.axhline(L_b - 0.5, color="white", linewidth=1)
-                    ax.axvline(L_b - 0.5, color="white", linewidth=1)
-                plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04, label="PAE (Å)")
-            else:
-                ax.text(0.5, 0.5, "N/A", ha="center", va="center", transform=ax.transAxes)
-
-            ax.set_title(f"{label} — seq {row_i + 1}")
-            ax.set_xlabel("Residue j")
-            ax.set_ylabel("Residue i")
+        ax = axes[row_i][0]
+        pae = np.array(boltz_pae_data[seq])
+        im = ax.imshow(pae, vmin=0, vmax=30, cmap="bwr", aspect="auto")
+        if L_b > 0 and L_b < pae.shape[0]:
+            ax.axhline(L_b - 0.5, color="white", linewidth=1)
+            ax.axvline(L_b - 0.5, color="white", linewidth=1)
+        plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04, label="PAE (Å)")
+
+        ax.set_title(f"Boltz-2 — seq {row_i + 1}")
+        ax.set_xlabel("Residue j")
+        ax.set_ylabel("Residue i")
 
     fig.suptitle("PAE heatmaps (binder | target ordering)", y=1.01)
     fig.tight_layout()
@@ -184,12 +178,11 @@ def plot_pae_heatmaps(
 def load_pae_data_from_df(
     df: pd.DataFrame,
     max_binders: int = 5,
-) -> tuple[list[str], dict[str, np.ndarray], dict[str, np.ndarray], dict[str, int]]:
-    """Load PAE .npy files for top-ranked binders from DataFrame file paths.
+) -> tuple[list[str], dict[str, np.ndarray], dict[str, int]]:
+    """Load Boltz-2 PAE .npy files for top-ranked binders from DataFrame file paths.
 
-    Returns (sequences, af2_pae_data, boltz_pae_data, binder_lengths).
+    Returns (sequences, boltz_pae_data, binder_lengths).
     """
-    af2_pae_data: dict[str, np.ndarray] = {}
     boltz_pae_data: dict[str, np.ndarray] = {}
     binder_lengths: dict[str, int] = {}
     sequences: list[str] = []
@@ -199,7 +192,7 @@ def load_pae_data_from_df(
         if count >= max_binders:
             break
         seq = row.get("sequence", "")
-        if not seq or seq in af2_pae_data or seq in boltz_pae_data:
+        if not seq or seq in boltz_pae_data:
             continue
 
         L_b = row.get("binder_length")
@@ -207,8 +200,6 @@ def load_pae_data_from_df(
             L_b = len(seq)
         binder_lengths[seq] = int(L_b)
 
-        loaded_any = False
-
         # Load Boltz-2 PAE (binder|target ordering)
         boltz_path = row.get("boltz_pae_file")
         if boltz_path and not pd.isna(boltz_path):
@@ -216,36 +207,12 @@ def load_pae_data_from_df(
                 p = Path(str(boltz_path))
                 if p.exists():
                     boltz_pae_data[seq] = np.load(str(p))
-                    loaded_any = True
+                    sequences.append(seq)
+                    count += 1
             except Exception:
                 pass
 
-        # Load AF2 PAE (target|binder ordering → transpose to binder|target)
-        af2_path = row.get("af2_pae_file")
-        if af2_path and not pd.isna(af2_path):
-            try:
-                p = Path(str(af2_path))
-                if p.exists():
-                    pae = np.load(str(p))
-                    # Transpose from [target|binder] to [binder|target]
-                    L_t = pae.shape[0] - int(L_b)
-                    if L_t > 0:
-                        pae = np.block(
-                            [
-                                [pae[L_t:, L_t:], pae[L_t:, :L_t]],
-                                [pae[:L_t, L_t:], pae[:L_t, :L_t]],
-                            ]
-                        )
-                    af2_pae_data[seq] = pae
-                    loaded_any = True
-            except Exception:
-                pass
-
-        if loaded_any:
-            sequences.append(seq)
-            count += 1
-
-    return sequences, af2_pae_data, boltz_pae_data, binder_lengths
+    return sequences, boltz_pae_data, binder_lengths
 
 
 # ---------------------------------------------------------------------------
@@ -336,97 +303,6 @@ def plot_radar_chart(
     return fig
 
 
-# ---------------------------------------------------------------------------
-# AF2 vs Boltz2 scatter
-# ---------------------------------------------------------------------------
-
-
-def plot_af2_vs_boltz2_scatter(
-    df: pd.DataFrame,
-    metric_pairs: list[tuple[str, str]] | None = None,
-) -> Figure:
-    """Scatter plots of AF2 vs Boltz2 values for common metrics.
-
-    Args:
-        df:           Merged DataFrame with af2_* and boltz_* columns.
-        metric_pairs: List of (boltz_col, af2_col) pairs. Defaults to iptm and ipae.
-    """
-    if metric_pairs is None:
-        metric_pairs = [
-            ("ipsae_min", "af2_ipsae_min"),  # primary — most diagnostic
-            ("iptm", "af2_iptm"),
-        ]
-
-    n = len(metric_pairs)
-    fig, axes = plt.subplots(1, n, figsize=(5 * n, 5), squeeze=False)
-
-    for i, (b_col, a_col) in enumerate(metric_pairs):
-        ax = axes[0][i]
-        if b_col not in df.columns or a_col not in df.columns:
-            ax.text(0.5, 0.5, f"Missing:\n{b_col}\n{a_col}", ha="center", va="center", transform=ax.transAxes)
-            continue
-
-        b_vals = pd.to_numeric(df[b_col], errors="coerce")
-        a_vals = pd.to_numeric(df[a_col], errors="coerce")
-        mask = b_vals.notna() & a_vals.notna()
-
-        if "source_tool" in df.columns:
-            for tool, grp in df[mask].groupby("source_tool"):
-                colour = TOOL_COLOURS.get(tool, TOOL_COLOURS["unknown"])
-                ax.scatter(
-                    b_vals[grp.index], a_vals[grp.index], color=colour, alpha=0.7, s=30, label=_tool_display(tool)
-                )
-        else:
-            ax.scatter(b_vals[mask], a_vals[mask], alpha=0.7, s=30)
-
-        # Identity line
-        if mask.sum() == 0:
-            ax.text(0.5, 0.5, "No valid data points", ha="center", va="center", transform=ax.transAxes)
-            continue
-
-        lo = min(b_vals[mask].min(), a_vals[mask].min()) * 0.95
-        hi = max(b_vals[mask].max(), a_vals[mask].max()) * 1.05
-        ax.plot([lo, hi], [lo, hi], "k--", linewidth=1, alpha=0.4)
-
-        # Pearson r — prominent boxed annotation
-        if mask.sum() > 2:
-            r = np.corrcoef(b_vals[mask], a_vals[mask])[0, 1]
-            ax.text(
-                0.05,
-                0.95,
-                f"r = {r:.3f}",
-                transform=ax.transAxes,
-                va="top",
-                fontsize=11,
-                fontweight="bold",
-                bbox=dict(boxstyle="round,pad=0.3", facecolor="#fff9c4", edgecolor="#f57f17", linewidth=1.5),
-            )
-
-        # Axis labels: use METRIC_META if available
-        def _axis_label(col: str, role: str) -> str:
-            meta = METRIC_META.get(col)
-            if meta:
-                lbl, unit, arrow = meta
-                parts = [lbl]
-                if unit:
-                    parts.append(f"({unit})")
-                if arrow:
-                    parts.append(arrow)
-                return f"{role}: " + " ".join(parts)
-            return f"{role}: {col}"
-
-        ax.set_xlabel(_axis_label(b_col, "Boltz-2"), fontsize=9)
-        ax.set_ylabel(_axis_label(a_col, "AF2"), fontsize=9)
-        b_meta = METRIC_META.get(b_col)
-        title = b_meta[0] if b_meta else b_col.replace("_", " ")
-        ax.set_title(f"{title} — Boltz-2 vs AF2", fontsize=10)
-        if "source_tool" in df.columns:
-            ax.legend(fontsize=7)
-
-    fig.tight_layout()
-    return fig
-
-
 # ---------------------------------------------------------------------------
 # Metric distribution box plots
 # ---------------------------------------------------------------------------
diff --git a/Evaluator/binder_comparison/visualization/report.py b/Evaluator/binder_comparison/visualization/report.py
index 6857e24..3991bf1 100644
--- a/Evaluator/binder_comparison/visualization/report.py
+++ b/Evaluator/binder_comparison/visualization/report.py
@@ -3,7 +3,7 @@
 Produces a self-contained report.html with:
   - Summary table (top binders by composite score)
   - Per-tool summary statistics
-  - Embedded plots (metric distributions, radar chart, AF2 vs Boltz2 scatter)
+  - Embedded plots (metric distributions, radar chart)
   - Full metrics table (collapsible)
 """
 
@@ -30,6 +30,8 @@
     "bindcraft": "BindCraft",
     "proteina_complexa": "Proteina-Complexa",
     "rfaa": "RFAA",
+    "rfd3": "RFD3",
+    "protein_hunter": "Protein-Hunter",
 }
 
 
@@ -64,6 +66,8 @@ def _tool_display(name: str) -> str:
   .tool-pxdesign           {{ color: #7B1FA2; font-weight: bold; }}
   .tool-proteina_complexa  {{ color: #6D4C41; font-weight: bold; }}
   .tool-rfaa               {{ color: #C62828; font-weight: bold; }}
+  .tool-rfd3               {{ color: #D84315; font-weight: bold; }}
+  .tool-protein_hunter     {{ color: #00838F; font-weight: bold; }}
   .stat-table td {{ text-align: right; font-variant-numeric: tabular-nums; }}
   .stat-table th:first-child, .stat-table td:first-child {{ text-align: left; white-space: nowrap; }}
   img {{ max-width: 100%; margin: 1em 0; border: 1px solid #ccc; border-radius: 4px; }}
@@ -92,16 +96,17 @@ def _tool_display(name: str) -> str:
 
 <p style="font-size:0.85em;color:#555;line-height:1.6;">
   <b>Methodology.</b>
-  All designed binder sequences are independently refolded with <b>Boltz-2</b> (primary predictor)
-  and <b>AlphaFold2</b> (cross-validation) as target–binder complexes.
+  All designed binder sequences are independently refolded with <b>Boltz-2</b> as target–binder complexes.
+  Additional refolding engines (Protenix on x86, AlphaFold 3 on DGX Spark / aarch64) may contribute
+  to the agreement count when available.
   The primary ranking metric is <b>ipSAE_min</b> — the minimum of binder→target and target→binder
   interface Predicted Structural Alignment Error, computed using the
   <a href="https://github.com/DunbrackLab/IPSAE" target="_blank">DunbrackLab d0<sub>res</sub> formula</a>
-  (per-residue d0, uniform 10 Å PAE cutoff for both engines).
+  (per-residue d0, uniform 10 Å PAE cutoff for all engines).
   This metric showed 1.4× better average precision than ipAE across 3,766 experimentally tested
   designs in the <a href="https://doi.org/10.1101/2025.08.14.670059" target="_blank">Adaptyv/Overath et al. 2025</a>
   benchmark. Quality tiers and the 0.61 pass threshold follow their screening methodology.
-  <b>agreement_count</b> reports how many engines (0–2) score ipSAE_min above 0.61.
+  <b>agreement_count</b> reports how many refolding engines score ipSAE_min above 0.61.
   Ranking sorts by quality tier first, then agreement count, then ipSAE_min.
 </p>
 
@@ -127,7 +132,7 @@ def _tool_display(name: str) -> str:
     <div style="border-top:1px solid #ccc;padding-top:0.5em;margin-top:0.5em;
                 font-family:'Segoe UI',sans-serif;font-size:0.9em;color:#555;">
       <b>ipSAE_min</b> = min(ipSAE<sub>binder→target</sub>, ipSAE<sub>target→binder</sub>)
-      &nbsp;&nbsp;·&nbsp;&nbsp; cutoff = 10 Å (uniform for Boltz-2 and AF2)
+      &nbsp;&nbsp;·&nbsp;&nbsp; cutoff = 10 Å (uniform across all engines)
     </div>
   </div>
 </details>
@@ -152,11 +157,9 @@ def _tool_display(name: str) -> str:
   <tr><td style="padding:2px 12px 2px 0;"><b>quality_tier</b></td>
       <td>High (&gt;0.80), Medium (&gt;0.61), Low (&gt;0.40), Reject (≤0.40) based on ipSAE_min</td></tr>
   <tr><td style="padding:2px 12px 2px 0;"><b>agreement</b></td>
-      <td>Number of engines (0–2) with ipSAE_min &gt; 0.61</td></tr>
+      <td>Number of engines with ipSAE_min &gt; 0.61</td></tr>
   <tr><td style="padding:2px 12px 2px 0;"><b>ipSAE_min ↑</b></td>
       <td>Primary metric — min(binder→target, target→binder) iPSAE from Boltz-2 PAE [0–1]</td></tr>
-  <tr><td style="padding:2px 12px 2px 0;"><b>AF2 ipSAE_min ↑</b></td>
-      <td>Same metric from AlphaFold2 cross-validation [0–1]</td></tr>
   <tr><td style="padding:2px 12px 2px 0;"><b>ipTM ↑</b></td>
       <td>Interface predicted TM-score from Boltz-2 [0–1]</td></tr>
   <tr><td style="padding:2px 12px 2px 0;"><b>pLDDT binder ↑</b></td>
@@ -214,129 +217,15 @@ def _tool_display(name: str) -> str:
 """
 
 
-def _compute_af2_boltz2_r(df: pd.DataFrame) -> dict[str, float | int]:
-    """Compute Pearson r and systematic bias between Boltz-2 and AF2 metrics."""
-    import numpy as np
-
-    result: dict[str, float | int] = {}
-    for b_col, a_col in [("ipsae_min", "af2_ipsae_min"), ("iptm", "af2_iptm")]:
-        if b_col in df.columns and a_col in df.columns:
-            b = pd.to_numeric(df[b_col], errors="coerce")
-            a = pd.to_numeric(df[a_col], errors="coerce")
-            mask = b.notna() & a.notna()
-            n = int(mask.sum())
-            if n > 2:
-                bv, av = b[mask], a[mask]
-                r = float(np.corrcoef(bv, av)[0, 1])
-                result[f"{b_col}_vs_{a_col}"] = r
-                result[f"{b_col}_vs_{a_col}_n"] = n
-                # Systematic bias: mean difference and mean absolute error
-                result[f"{b_col}_mean"] = float(bv.mean())
-                result[f"{a_col}_mean"] = float(av.mean())
-                result[f"{b_col}_vs_{a_col}_mae"] = float((bv - av).abs().mean())
-    return result
-
-
-def _correlation_callout_html(corr: dict) -> str:
-    """Render a highlighted callout box for the AF2 vs Boltz-2 correlation."""
-    if not corr:
-        return ""
-
-    r_ipsae = corr.get("ipsae_min_vs_af2_ipsae_min")
-    n_ipsae = corr.get("ipsae_min_vs_af2_ipsae_min_n", 0)
-    r_iptm = corr.get("iptm_vs_af2_iptm")
-    n_iptm = corr.get("iptm_vs_af2_iptm_n", 0)
-
-    # Pick the most representative r value for the headline
-    headline_r = None
-    if r_ipsae is not None:
-        headline_r = r_ipsae
-    if r_iptm is not None and (headline_r is None or abs(r_iptm) > abs(headline_r)):
-        headline_r = r_iptm
-
-    if headline_r is None:
-        return ""
-
-    strength = "strong" if abs(headline_r) >= 0.7 else ("moderate" if abs(headline_r) >= 0.5 else "weak")
-
-    # Headline reflects the actual data
-    if strength == "strong":
-        headline_desc = "good rank-order agreement between the two predictors."
-    elif strength == "moderate":
-        headline_desc = "moderate rank-order correlation; absolute values may differ substantially."
-    else:
-        headline_desc = "weak correlation; the two predictors disagree on binder ranking."
-
-    # Detect systematic bias
-    bias_notes = []
-    for b_col, a_col, label in [
-        ("iptm", "af2_iptm", "ipTM"),
-        ("ipsae_min", "af2_ipsae_min", "ipSAE_min"),
-    ]:
-        b_mean = corr.get(f"{b_col}_mean")
-        a_mean = corr.get(f"{a_col}_mean")
-        mae = corr.get(f"{b_col}_vs_{a_col}_mae")
-        if b_mean is not None and a_mean is not None:
-            diff = b_mean - a_mean
-            if abs(diff) > 0.15:
-                higher = "Boltz-2" if diff > 0 else "AF2"
-                bias_notes.append(
-                    f"{label}: {higher} scores systematically higher "
-                    f"(Boltz-2 mean={b_mean:.3f}, AF2 mean={a_mean:.3f}, MAE={mae:.3f})"
-                )
-
-    # Correlation color: green for strong, amber for moderate, red for weak
-    r_colors = {"strong": "#1b5e20", "moderate": "#e65100", "weak": "#c62828"}
-    r_color = r_colors[strength]
-
-    lines = []
-    if r_iptm is not None:
-        s1 = "strong" if abs(r_iptm) >= 0.7 else ("moderate" if abs(r_iptm) >= 0.5 else "weak")
-        lines.append(
-            f"<strong>ipTM:</strong> Boltz-2 vs AF2 Pearson r = "
-            f"<strong style='font-size:1.25em; color:{r_color};'>{r_iptm:+.3f}</strong> "
-            f"(n = {n_iptm}) — {s1} rank-order correlation."
-        )
-    if r_ipsae is not None:
-        s2 = "strong" if abs(r_ipsae) >= 0.7 else ("moderate" if abs(r_ipsae) >= 0.5 else "weak")
-        lines.append(f"<strong>ipSAE_min:</strong> r = {r_ipsae:+.3f} (n = {n_ipsae}) — {s2}")
-
-    if bias_notes:
-        lines.append(
-            "<br><em style='color:#c62828;'>Systematic bias detected:</em> "
-            + "; ".join(bias_notes)
-            + ".<br><small>Note: AF2 (ColabDesign, single model, 3 recycles) is typically stricter "
-            "than Boltz-2 for de novo designs. Large absolute differences are common and do not "
-            "necessarily indicate errors.</small>"
-        )
-
-    # Callout style: green for strong, amber for moderate, muted for weak
-    if strength == "strong":
-        bg, border = "#e8f5e9", "#2e7d32"
-    elif strength == "moderate":
-        bg, border = "#fff8e1", "#f57f17"
-    else:
-        bg, border = "#fbe9e7", "#c62828"
-
-    inner = "<br>".join(lines)
-    return (
-        f'<div style="background:{bg}; border-left:5px solid {border}; '
-        f'padding:0.8em 1.2em; border-radius:4px; margin:1em 0; font-size:0.95em;">'
-        f'<p style="margin:0 0 0.4em 0; font-size:1.05em;">&#x1F4CA;&nbsp;'
-        f"<strong>Boltz-2 / AF2 Cross-Validation</strong>&nbsp;"
-        f"— {headline_desc}</p>"
-        f"<p style='margin:0;'>{inner}</p>"
-        f"</div>"
-    )
-
-
 _TOOL_COLOURS_NGL = {
     "mosaic": "#4CAF50",
     "pxdesign": "#9C27B0",
     "boltzgen": "#FF9800",
     "bindcraft": "#2196F3",
     "proteina_complexa": "#00897B",
-    "rfaa": "#D84315",
+    "rfaa": "#C62828",
+    "rfd3": "#D84315",
+    "protein_hunter": "#00838F",
     "unknown": "#9E9E9E",
 }
 
@@ -531,8 +420,8 @@ def _build_per_tool_pdb_viewer(
 
             for variant in name_variants:
                 candidates = list(tool_pdb_dir.rglob(pdb_pattern.format(name=variant)))
-                # Prefer PDB over CIF, and non-AF2 subdir over AF2
-                candidates = sorted(candidates, key=lambda p: ("AF2" in p.parts, p.suffix != ".pdb"))
+                # Prefer PDB over CIF
+                candidates = sorted(candidates, key=lambda p: p.suffix != ".pdb")
                 if candidates:
                     pdb_file = candidates[0]
                     break
@@ -852,11 +741,8 @@ def generate_report(
         "quality_tier",
         "agreement_count",
         "ipsae_min",
-        "af2_ipsae_min",
         "iptm",
-        "af2_iptm",
         "boltz_pae_iptm",
-        "af2_pae_iptm",
         "plddt_binder_mean",
         "plddt_binder_min",
         "binder_ptm",
@@ -917,14 +803,11 @@ def _select_display_cols(df: pd.DataFrame) -> tuple[list[str], list[str]]:
         "quality_tier",
         "agreement_count",
         "ipsae_min",
-        "af2_ipsae_min",
         "iptm",
         "plddt_binder_mean",
     ]
     secondary = [
-        "af2_iptm",
         "boltz_pae_iptm",
-        "af2_pae_iptm",
         "binder_ptm",
         "plddt_binder_min",
         "ipae",
@@ -1097,34 +980,14 @@ def fmt(col: str, val) -> str:
         "Primary ranking metric — ipSAE_min from independent Boltz-2 refolding (DunbrackLab formula, 10 Å cutoff). "
         "The sequence is refolded from scratch, so this is an unbiased assessment. Want >0.61."
     ),
-    "af2_ipsae_min": (
-        "ipSAE_min from independent AlphaFold2 refolding — cross-validation with a second prediction engine. "
-        "If both Boltz-2 and AF2 score >0.61, the design has dual-engine agreement (agreement_count = 2). "
-        "AF2 tends to score lower for computationally designed binders."
-    ),
-    "af2_bt_ipsae": (
-        "ipSAE binder→target from independent AF2 refolding. "
-        "Cross-validation metric — higher means AF2 also predicts binder contacts the target."
-    ),
-    "af2_tb_ipsae": (
-        "ipSAE target→binder from independent AF2 refolding. "
-        "Cross-validation metric — higher means AF2 also predicts target contacts the binder."
-    ),
     "iptm": (
         "Interface predicted TM-score from Boltz-2 — measures overall interface quality. "
-        "Higher = more confident complex prediction. Want >0.8. Note: can be inflated for AF2-designed sequences."
-    ),
-    "af2_iptm": (
-        "Interface predicted TM-score from AF2 refolding. "
-        "Low values are common for computationally designed binders — AF2 often struggles with de novo sequences."
+        "Higher = more confident complex prediction. Want >0.8."
     ),
     "boltz_pae_iptm": (
         "ipTM recomputed from Boltz-2 PAE matrix (rather than model-reported value). "
         "More consistent across runs. Higher = better."
     ),
-    "af2_pae_iptm": (
-        "ipTM recomputed from AF2 PAE matrix. More consistent than model-reported AF2 ipTM. Higher = better."
-    ),
     "binder_ptm": (
         "Predicted TM-score of the binder chain alone — does the binder fold into a stable structure by itself? "
         "Want >0.9. Low values suggest the binder may be disordered or misfolded."
@@ -1145,7 +1008,6 @@ def fmt(col: str, val) -> str:
         "Interface Predicted Aligned Error — mean PAE across the binder–target interface in Angstroms. "
         "Lower = more confident interface. Superseded by ipSAE as a ranking metric."
     ),
-    "af2_ipae": "Same as ipAE but from AF2 refolding. Lower = better.",
     "pae_bt": (
         "Mean Predicted Aligned Error from binder residues to target residues in Angstroms. "
         "Lower = Boltz-2 is more confident about binder→target spatial arrangement."
@@ -1159,8 +1021,9 @@ def fmt(col: str, val) -> str:
         "Reflects internal fold confidence. Lower = well-folded binder."
     ),
     "agreement_count": (
-        "Number of independent prediction engines (0–2) that score ipSAE_min above the 0.61 pass threshold. "
-        "0 = neither agrees, 1 = Boltz-2 only, 2 = both Boltz-2 and AF2 agree. Want 2."
+        "Number of independent prediction engines that score ipSAE_min above the 0.61 pass threshold. "
+        "Currently Boltz-2 only (0 or 1); Protenix (x86) and AlphaFold 3 (aarch64 / DGX Spark) "
+        "will be added as the refactor progresses. Higher = more engines agree = stronger candidate."
     ),
     "intra_contact": (
         "Binder internal contact score — measures how tightly the binder folds. "
diff --git a/Evaluator/envs/binder-eval-af2.yml b/Evaluator/envs/binder-eval-af2.yml
deleted file mode 100644
index b9e2d3c..0000000
--- a/Evaluator/envs/binder-eval-af2.yml
+++ /dev/null
@@ -1,11 +0,0 @@
-name: binder-eval-af2
-# AlphaFold2 refolding step.
-# Used for: binder-compare refold-af2
-# Requires Python 3.10 and ColabDesign — incompatible with Boltz-2 JAX version.
-# AF2 weights (~4 GB) must be downloaded separately; set AF2_DATA_DIR accordingly.
-channels:
-  - conda-forge
-  - defaults
-dependencies:
-  - python=3.10
-  - pip
diff --git a/Evaluator/evaluate.sh b/Evaluator/evaluate.sh
index e7dfa9f..ab5976e 100755
--- a/Evaluator/evaluate.sh
+++ b/Evaluator/evaluate.sh
@@ -4,18 +4,21 @@
 # Usage:
 #   bash evaluate.sh --sequences sequences.fasta \
 #                    --target-seq "MGFQKFSPF..." \
-#                    --target-pdb target.pdb \
 #                    --output ./results
 #
-# Both --target-seq and --target-pdb are required:
-#   --target-seq  full target amino acid sequence (for Boltz-2 complex assembly)
-#   --target-pdb  target PDB file (for AF2 multimer refolding)
+# Required:
+#   --sequences    path to FASTA (or CSV / one-per-line) with binder sequences
+#   --target-seq   full target amino acid sequence (for complex assembly)
+#   --output       output directory
 #
 # Optional:
-#   --skip-boltz2   skip Boltz-2 refolding (use existing boltz2_results.csv in output dir)
-#   --skip-af2      skip AF2 refolding     (use existing af2_results.csv in output dir)
-#   --resume        resume interrupted run — skip already-completed binders in both engines
-#   --mosaic-path   path to Mosaic repo root (for AF2's refold_Version6 module)
+#   --skip-boltz2       skip Boltz-2 refolding (use existing boltz2_results.csv)
+#   --skip-protenix     skip Protenix refolding (auto-skipped if the --protenix-env conda env is missing)
+#   --protenix-env ENV  conda env for Protenix (default: bindmaster_pxdesign)
+#   --resume            resume interrupted run (skip already-completed binders)
+#
+# AF3 refolding (aarch64 / DGX Spark only, Part K) is driven separately by
+# `binder-compare refold-af3` and wired into the report via --af3-results.
 
 set -euo pipefail
 
@@ -56,26 +59,24 @@ fi
 
 SEQUENCES=""
 TARGET_SEQ=""
-TARGET_PDB=""
 OUTPUT=""
 SKIP_BOLTZ2=0
-SKIP_AF2=0
+SKIP_PROTENIX=0
+PROTENIX_ENV="bindmaster_pxdesign"
 RESUME=0
-MOSAIC_PATH=""
 
 # --- parse arguments -------------------------------------------------------
 while [[ $# -gt 0 ]]; do
     case "$1" in
-        --sequences)   SEQUENCES="$2";   shift 2 ;;
-        --target-seq)  TARGET_SEQ="$2";  shift 2 ;;
-        --target-pdb)  TARGET_PDB="$2";  shift 2 ;;
-        --output|-o)   OUTPUT="$2";      shift 2 ;;
-        --skip-boltz2) SKIP_BOLTZ2=1;    shift ;;
-        --skip-af2)    SKIP_AF2=1;       shift ;;
-        --resume)      RESUME=1;         shift ;;
-        --mosaic-path) MOSAIC_PATH="$2"; shift 2 ;;
+        --sequences)     SEQUENCES="$2";    shift 2 ;;
+        --target-seq)    TARGET_SEQ="$2";   shift 2 ;;
+        --output|-o)     OUTPUT="$2";       shift 2 ;;
+        --skip-boltz2)   SKIP_BOLTZ2=1;     shift ;;
+        --skip-protenix) SKIP_PROTENIX=1;   shift ;;
+        --protenix-env)  PROTENIX_ENV="$2"; shift 2 ;;
+        --resume)        RESUME=1;          shift ;;
         -h|--help)
-            sed -n '2,20p' "$0" | grep '^#' | sed 's/^# \?//'
+            sed -n '2,22p' "$0" | grep '^#' | sed 's/^# \?//'
             exit 0 ;;
         *) echo "Unknown argument: $1"; exit 1 ;;
     esac
@@ -84,35 +85,28 @@ done
 # --- validate ---------------------------------------------------------------
 [[ -z "$SEQUENCES" ]]  && { echo "Error: --sequences required"; exit 1; }
 [[ -z "$TARGET_SEQ" ]] && { echo "Error: --target-seq required"; exit 1; }
-[[ -z "$TARGET_PDB" ]] && { echo "Error: --target-pdb required"; exit 1; }
 [[ -z "$OUTPUT" ]]     && { echo "Error: --output required"; exit 1; }
 [[ -f "$SEQUENCES" ]]  || { echo "Error: sequences file not found: $SEQUENCES"; exit 1; }
-[[ -f "$TARGET_PDB" ]] || { echo "Error: target PDB not found: $TARGET_PDB"; exit 1; }
 
 mkdir -p "$OUTPUT"
 SEQUENCES="$(realpath "$SEQUENCES")"
-TARGET_PDB="$(realpath "$TARGET_PDB")"
 OUTPUT="$(realpath "$OUTPUT")"
 
-# Convert CIF to PDB if needed (AF2 requires PDB format)
-EXT="${TARGET_PDB##*.}"
-if [[ "${EXT,,}" == "cif" || "${EXT,,}" == "mmcif" ]]; then
-    echo "[setup] Converting CIF → PDB..."
-    CONVERTED_PDB="$OUTPUT/target.pdb"
-    conda run -n binder-eval-af2 python3 -c "
-from binder_comparison.io.read import convert_cif_to_pdb
-convert_cif_to_pdb('${TARGET_PDB//\'/\'\\\'\'}', '${CONVERTED_PDB//\'/\'\\\'\'}')
-print('[setup] Converted to', '${CONVERTED_PDB//\'/\'\\\'\'}')
-"
-    TARGET_PDB="$CONVERTED_PDB"
-fi
-
 echo "=== BindMaster Evaluator ==="
 echo "Sequences : $SEQUENCES"
-echo "Target PDB: $TARGET_PDB"
 echo "Output    : $OUTPUT"
 echo ""
 
+# Auto-detect Protenix availability unless user skipped it
+if [[ $SKIP_PROTENIX -eq 0 ]]; then
+    if ! conda env list 2>/dev/null | awk '{print $1}' | grep -qxF -- "${PROTENIX_ENV}"; then
+        echo "[note] conda env '${PROTENIX_ENV}' not found — Protenix refolding will be skipped."
+        echo "        (install with: bindmaster install --tool pxdesign)"
+        echo ""
+        SKIP_PROTENIX=1
+    fi
+fi
+
 # --- Step 0: Normalise sequences to FASTA ----------------------------------
 FASTA="$OUTPUT/sequences.fasta"
 echo "[step 0] Parsing sequences → $FASTA"
@@ -123,12 +117,15 @@ SEQUENCES="$FASTA"
 
 # --- Step 1: Boltz-2 refolding ---------------------------------------------
 BOLTZ2_CSV="$OUTPUT/boltz2_results.csv"
+PROTENIX_CSV="$OUTPUT/protenix_results.csv"
+N_STEPS=2
+[[ $SKIP_PROTENIX -eq 0 ]] && N_STEPS=3
 
 if [[ $SKIP_BOLTZ2 -eq 1 ]]; then
-    echo "[step 1/3] Boltz-2 refolding — skipped (using existing $BOLTZ2_CSV)"
+    echo "[step 1/${N_STEPS}] Boltz-2 refolding — skipped (using existing $BOLTZ2_CSV)"
     [[ -f "$BOLTZ2_CSV" ]] || { echo "Error: $BOLTZ2_CSV not found"; exit 1; }
 else
-    echo "[step 1/3] Boltz-2 refolding  (Mosaic venv)..."
+    echo "[step 1/${N_STEPS}] Boltz-2 refolding  (Mosaic venv)..."
     BOLTZ2_RESUME_FLAG=""
     [[ $RESUME -eq 1 ]] && BOLTZ2_RESUME_FLAG="--resume"
     "$MOSAIC_VENV/bin/binder-compare" refold-boltz2 \
@@ -138,32 +135,30 @@ else
         $BOLTZ2_RESUME_FLAG
 fi
 
-# --- Step 2: AF2 refolding -------------------------------------------------
-AF2_CSV="$OUTPUT/af2_results.csv"
-
-if [[ $SKIP_AF2 -eq 1 ]]; then
-    echo "[step 2/3] AF2 refolding — skipped (using existing $AF2_CSV)"
-    [[ -f "$AF2_CSV" ]] || { echo "Error: $AF2_CSV not found"; exit 1; }
-else
-    echo "[step 2/3] AF2 refolding       (binder-eval-af2)..."
-    AF2_EXTRA_FLAGS=()
-    [[ $RESUME -eq 1 ]] && AF2_EXTRA_FLAGS+=(--resume)
-    [[ -n "$MOSAIC_PATH" ]] && AF2_EXTRA_FLAGS+=(--mosaic-path "$MOSAIC_PATH")
-    conda run -n binder-eval-af2 binder-compare refold-af2 \
+# --- Step 2: Protenix refolding (optional) ---------------------------------
+if [[ $SKIP_PROTENIX -eq 0 ]]; then
+    echo "[step 2/${N_STEPS}] Protenix refolding  (conda env: ${PROTENIX_ENV})..."
+    PROTENIX_RESUME_FLAG=""
+    [[ $RESUME -eq 1 ]] && PROTENIX_RESUME_FLAG="--resume"
+    conda run -n "${PROTENIX_ENV}" binder-compare refold-protenix \
         --sequences  "$SEQUENCES" \
-        --target-pdb "$TARGET_PDB" \
-        -o           "$AF2_CSV" \
-        --output-dir "$OUTPUT/refold_af2" \
-        "${AF2_EXTRA_FLAGS[@]}"
+        --target-seq "$TARGET_SEQ" \
+        -o           "$PROTENIX_CSV" \
+        --output-dir "$OUTPUT/refold_protenix" \
+        $PROTENIX_RESUME_FLAG
 fi
 
-# --- Step 3: Report --------------------------------------------------------
-echo "[step 3/3] Generating report   (binder-eval)..."
-conda run -n binder-eval binder-compare report \
-    --boltz2-results "$BOLTZ2_CSV" \
-    --af2-results    "$AF2_CSV" \
-    --sequences      "$SEQUENCES" \
+# --- Report ----------------------------------------------------------------
+echo "[step ${N_STEPS}/${N_STEPS}] Generating report   (binder-eval)..."
+REPORT_ARGS=(
+    --boltz2-results "$BOLTZ2_CSV"
+    --sequences      "$SEQUENCES"
     -o               "$OUTPUT/report"
+)
+if [[ $SKIP_PROTENIX -eq 0 && -f "$PROTENIX_CSV" ]]; then
+    REPORT_ARGS+=(--protenix-results "$PROTENIX_CSV")
+fi
+conda run -n binder-eval binder-compare report "${REPORT_ARGS[@]}"
 
 echo ""
 echo "=== Done ==="
diff --git a/Evaluator/install.sh b/Evaluator/install.sh
index 7dc37c1..0b2e319 100755
--- a/Evaluator/install.sh
+++ b/Evaluator/install.sh
@@ -3,19 +3,21 @@
 # Run once after cloning the repository:
 #   bash install.sh
 #
-# Creates two conda environments:
+# Creates one conda environment:
 #   binder-eval     extract + report  (lightweight, no ML)
-#   binder-eval-af2 AF2 refolding     (Python 3.10, ColabDesign)
 #
-# For Boltz-2 refolding the Mosaic environment from the BindMaster-installator
+# For Boltz-2 refolding the Mosaic environment from the BindMaster installer
 # is used. Mosaic must be installed first:
-#   cd /path/to/BindMaster-installator && bash install.sh --tool mosaic
+#   cd /path/to/BindMaster && bash install/install.sh --tool mosaic
+#
+# Additional refolding engines (Protenix on x86, AF3 on aarch64 / DGX Spark) are
+# installed by the main BindMaster installer (`--tool protenix`, `--tool af3`),
+# not by this script.
 #
 # Prerequisites:
 #   - conda (miniforge/miniconda)
-#   - Mosaic installed via BindMaster-installator (provides Boltz-2)
+#   - Mosaic installed via BindMaster installer (provides Boltz-2)
 #   - GPU with CUDA drivers (required for refold steps)
-#   - AF2 weights downloaded to $AF2_DATA_DIR (for refold-af2 only)
 
 set -euo pipefail
 
@@ -40,22 +42,20 @@ echo "Repo: $REPO_DIR"
 echo ""
 
 # ---------------------------------------------------------------------------
-# 0. Locate Mosaic venv (created by BindMaster-installator)
+# 0. Locate Mosaic venv (created by BindMaster installer)
 # ---------------------------------------------------------------------------
-echo "[0/3] Locating Mosaic venv (Boltz-2 environment)..."
+echo "[0/2] Locating Mosaic venv (Boltz-2 environment)..."
 
 MOSAIC_VENV=""
 
-# Check a user-supplied path first
 if [[ -n "${MOSAIC_DIR:-}" && -f "$MOSAIC_DIR/.venv/bin/python" ]]; then
     MOSAIC_VENV="$MOSAIC_DIR/.venv"
 fi
 
-# Auto-detect common locations
 if [[ -z "$MOSAIC_VENV" ]]; then
     for _candidate in \
-        "$(dirname "$REPO_DIR")/BindMaster-installator/Mosaic/.venv" \
-        "${HOME}/Documents/BindMaster/BindMaster-installator/Mosaic/.venv" \
+        "$(dirname "$REPO_DIR")/Mosaic/.venv" \
+        "${HOME}/Documents/BindMaster/Mosaic/.venv" \
         "${HOME}/BindMaster/Mosaic/.venv"; do
         if [[ -f "$_candidate/bin/python" ]]; then
             MOSAIC_VENV="$_candidate"
@@ -69,26 +69,19 @@ if [[ -z "$MOSAIC_VENV" ]]; then
     echo "  ERROR: Could not find the Mosaic virtual environment."
     echo ""
     echo "  The Boltz-2 refolding step uses the Mosaic environment"
-    echo "  created by the BindMaster-installator. Please install it first:"
+    echo "  created by the BindMaster installer. Please install it first:"
     echo ""
-    echo "    cd /path/to/BindMaster-installator"
-    echo "    bash install.sh --tool mosaic"
+    echo "    cd /path/to/BindMaster"
+    echo "    bash install/install.sh --tool mosaic"
     echo ""
     echo "  Then re-run this script, or set MOSAIC_DIR before running:"
-    echo "    MOSAIC_DIR=/path/to/BindMaster-installator/Mosaic bash install.sh"
+    echo "    MOSAIC_DIR=/path/to/BindMaster/Mosaic bash install.sh"
     echo ""
     exit 1
 fi
 
 echo "      Found Mosaic venv: $MOSAIC_VENV"
 
-# Copy refold_Version6.py (AF2 refolding module) into Mosaic root
-MOSAIC_ROOT="$(dirname "$MOSAIC_VENV")"
-if [[ -f "$REPO_DIR/scripts/refold_Version6.py" ]]; then
-    cp "$REPO_DIR/scripts/refold_Version6.py" "$MOSAIC_ROOT/refold_Version6.py"
-    echo "      Copied refold_Version6.py → $MOSAIC_ROOT/"
-fi
-
 # Install binder-compare into the Mosaic venv
 echo "      Installing binder-compare into Mosaic venv..."
 # shellcheck disable=SC1087
@@ -103,33 +96,24 @@ echo "      Saved venv path → $REPO_DIR/envs/mosaic_venv_path"
 # ---------------------------------------------------------------------------
 # 1. binder-eval  (extract + report)
 # ---------------------------------------------------------------------------
-echo "[1/3] Creating binder-eval..."
+echo "[1/2] Creating binder-eval..."
 conda env create -f "$REPO_DIR/envs/binder-eval.yml" --yes 2>/dev/null || \
     conda env update -f "$REPO_DIR/envs/binder-eval.yml" --prune
 # shellcheck disable=SC1087
 conda run -n binder-eval pip install -q -e "$REPO_DIR[report]"
 echo "      binder-compare version: $(conda run -n binder-eval binder-compare --version)"
 
-# ---------------------------------------------------------------------------
-# 2. binder-eval-af2  (AF2 refolding)
-# ---------------------------------------------------------------------------
-echo "[2/3] Creating binder-eval-af2..."
-conda env create -f "$REPO_DIR/envs/binder-eval-af2.yml" --yes 2>/dev/null || \
-    conda env update -f "$REPO_DIR/envs/binder-eval-af2.yml" --prune
-# shellcheck disable=SC1087
-conda run -n binder-eval-af2 pip install -q colabdesign==1.1.1 -e "$REPO_DIR[af2]"
-echo "      binder-compare version: $(conda run -n binder-eval-af2 binder-compare --version)"
-
 # ---------------------------------------------------------------------------
 echo ""
 echo "=== Installation complete ==="
 echo ""
 echo "  Boltz-2 (Mosaic venv): $MOSAIC_VENV"
 echo "  Extract/report:        conda env binder-eval"
-echo "  AF2 refolding:         conda env binder-eval-af2"
 echo ""
 echo "Usage:"
-echo "  bash evaluate.sh --sequences seqs.fasta --target-pdb target.pdb --output ./results"
+echo "  bash evaluate.sh --sequences seqs.fasta --target-seq SEQ --output ./results"
 echo ""
-echo "Note: AF2 weights (~4 GB) must be present at \$AF2_DATA_DIR."
-echo "      See docs/pipeline_reference.md for the expected path."
+echo "Note: additional refolding engines (Protenix on x86, AlphaFold 3 on"
+echo "      aarch64 / DGX Spark) are installed via the main BindMaster"
+echo "      installer (--tool protenix / --tool af3). evaluate.sh runs Protenix"
+echo "      when its conda env is found; AF3 uses binder-compare refold-af3."
diff --git a/Evaluator/pyproject.toml b/Evaluator/pyproject.toml
index 4b9c0c7..032ab1a 100644
--- a/Evaluator/pyproject.toml
+++ b/Evaluator/pyproject.toml
@@ -5,7 +5,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "binder-comparison"
 version = "0.1.0"
-description = "Evaluate protein binder sequences by independent refolding with Boltz-2 and AlphaFold2 — sequences in, ranked interface metrics out."
+description = "Evaluate protein binder sequences by independent refolding with Boltz-2 (plus Protenix on x86 and AlphaFold 3 on aarch64 / DGX Spark when configured) — sequences in, ranked interface metrics out."
 requires-python = ">=3.10"
 dependencies = [
     "numpy>=1.24",
@@ -14,15 +14,11 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
-# Install in the 'mosaic' conda env for Boltz2 refolding
+# Install in the Mosaic venv for Boltz-2 refolding
 boltz2 = [
     "jax",
     "equinox",
 ]
-# Install in the 'bindcraft_pr' conda env for AF2 refolding
-af2 = [
-    "colabdesign",
-]
 # Full installation (for the report step only — no heavy ML deps)
 report = [
     "jinja2>=3.0",
diff --git a/Evaluator/scripts/refold_Version6.py b/Evaluator/scripts/refold_Version6.py
deleted file mode 100644
index b824c07..0000000
--- a/Evaluator/scripts/refold_Version6.py
+++ /dev/null
@@ -1,212 +0,0 @@
-"""AF2 binder cross-evaluator using ColabDesign.
-
-Run with:  conda run -n bindcraft_pr python run_v6_protenix.py
-           conda run -n bindcraft_pr python run_v6_mosaic.py
-
-Binder protocol: target first (indices 0:L_t), binder second (indices L_t:L_t+L_b).
-"""
-
-import csv
-import os
-import uuid
-
-import numpy as np
-from colabdesign import mk_af_model
-
-OUTPUT_DIR = "af2_structures"
-
-
-def _find_af2_data_dir():
-    """Locate AF2 weights: $AF2_DATA_DIR → BindMaster/BindCraft/params → Evaluator/../BindCraft/params."""
-    env = os.environ.get("AF2_DATA_DIR")
-    if env and os.path.isdir(env):
-        return env
-    # Walk up from this script to find BindCraft/params
-    d = os.path.dirname(os.path.abspath(__file__))
-    for _ in range(5):
-        candidate = os.path.join(d, "BindCraft", "params")
-        if os.path.isdir(candidate):
-            return candidate
-        d = os.path.dirname(d)
-    return os.environ.get("AF2_DATA_DIR", "")
-
-
-AF2_DATA_DIR = _find_af2_data_dir()
-CSV_PATH = "af2_eval.csv"
-
-CSV_COLUMNS = [
-    "run_id",
-    "idx",
-    "sequence",
-    "target_pdb",
-    "binder_length",
-    "af2_iptm",
-    "af2_plddt_binder_mean",
-    "af2_plddt_binder_min",
-    "af2_plddt_binder_max",
-    "af2_plddt_target_mean",
-    "af2_pae_bt_mean",
-    "af2_pae_tb_mean",
-    "af2_ipae",
-    "af2_pae_bb_mean",
-    "af2_pae_tt_mean",
-    "af2_pae_overall_mean",
-    "af2_pae_max",
-    "pdb",
-    "af2_pae_file",  # path to saved PAE matrix (.npy); enables AF2 ipSAE computation
-]
-
-
-def refold_batch_af2(
-    binder_sequences: list,
-    target_pdb_path: str,
-    *,
-    models: list | None = None,
-    num_recycles: int = 3,
-    output_dir: str = OUTPUT_DIR,
-    csv_path: str = CSV_PATH,
-):
-    """Evaluate a batch of binder sequences with AF2 multimer (ColabDesign binder protocol).
-
-    For each binder:
-      - Runs AF2 multimer prediction (target first, binder second ordering)
-      - Extracts iptm, plddt, and pae metrics sliced into binder/target regions
-      - Saves PDB and writes a CSV row immediately (incremental)
-    """
-    if models is None:
-        models = [0]
-
-    run_id = str(uuid.uuid4())[:8]
-    os.makedirs(output_dir, exist_ok=True)
-
-    print("\n=== AF2 Binder Cross-Evaluator (Version 6) ===")
-    print(f"Run ID: {run_id}")
-    print(f"Target PDB: {target_pdb_path}")
-    print(f"Binders to evaluate: {len(binder_sequences)}")
-    print(f"AF2 models: {models}  num_recycles: {num_recycles}")
-    print(f"Output directory: {output_dir}\n")
-
-    # Load AF2 model weights once, reuse across all binders
-    print("Loading AF2 model weights...")
-    af = mk_af_model(protocol="binder", use_multimer=True, data_dir=AF2_DATA_DIR)
-    print("AF2 model ready.\n")
-
-    write_header = (not os.path.exists(csv_path)) or os.path.getsize(csv_path) == 0
-    csv_file = open(csv_path, "a", newline="")
-    csv_writer = csv.DictWriter(csv_file, fieldnames=CSV_COLUMNS)
-    if write_header:
-        csv_writer.writeheader()
-        csv_file.flush()
-
-    for idx, seq in enumerate(binder_sequences, start=1):
-        L_b = len(seq)
-        print(f"{'─' * 55}")
-        print(f"[{idx}/{len(binder_sequences)}] length={L_b} aa  seq={seq}")
-
-        af.prep_inputs(pdb_filename=target_pdb_path, chain="A", binder_len=L_b)
-        af.set_seq(seq)
-        af.predict(models=models, num_recycles=num_recycles)
-
-        # Diagnostic dump on first binder to catch API surprises early
-        if idx == 1:
-            print(f"  [debug] af.aux keys: {sorted(af.aux.keys())}")
-            log_val = af.aux.get("log", {})
-            log_keys = sorted(log_val.keys()) if isinstance(log_val, dict) else type(log_val)
-            print(f"  [debug] af.aux['log'] keys: {log_keys}")
-            print(f"  [debug] af._inputs keys: {sorted(af._inputs.keys())}")
-
-        # Determine target length — prefer explicit field, fall back to array arithmetic
-        L_t = af._inputs.get("target_length", None)
-        if L_t is None:
-            total_len = len(af.aux["plddt"])
-            L_t = total_len - L_b
-        L_t = int(L_t)
-
-        # Slice arrays: target [0:L_t], binder [L_t:L_t+L_b]
-        plddt = np.array(af.aux["plddt"])
-        pae = np.array(af.aux["pae"])
-
-        plddt_t = plddt[:L_t]
-        plddt_b = plddt[L_t:]
-        pae_tt = pae[:L_t, :L_t]
-        pae_bt = pae[L_t:, :L_t]  # binder rows → target cols
-        pae_tb = pae[:L_t, L_t:]  # target rows → binder cols
-        pae_bb = pae[L_t:, L_t:]
-
-        # Interface iptm — prefer top-level key, fall back to log dict
-        if "i_ptm" in af.aux:
-            af2_iptm = float(af.aux["i_ptm"])
-        else:
-            log = af.aux.get("log", {})
-            af2_iptm = float(log.get("i_ptm", log.get("iptm", float("nan")))) if isinstance(log, dict) else float("nan")
-
-        # pLDDT statistics
-        af2_plddt_binder_mean = float(plddt_b.mean()) if plddt_b.size > 0 else float("nan")
-        af2_plddt_binder_min = float(plddt_b.min()) if plddt_b.size > 0 else float("nan")
-        af2_plddt_binder_max = float(plddt_b.max()) if plddt_b.size > 0 else float("nan")
-        af2_plddt_target_mean = float(plddt_t.mean()) if plddt_t.size > 0 else float("nan")
-
-        # PAE statistics
-        af2_pae_bt_mean = float(pae_bt.mean()) if pae_bt.size > 0 else float("nan")
-        af2_pae_tb_mean = float(pae_tb.mean()) if pae_tb.size > 0 else float("nan")
-        af2_ipae = (af2_pae_bt_mean + af2_pae_tb_mean) / 2.0
-        af2_pae_bb_mean = float(pae_bb.mean()) if pae_bb.size > 0 else float("nan")
-        af2_pae_tt_mean = float(pae_tt.mean()) if pae_tt.size > 0 else float("nan")
-        af2_pae_overall_mean = float(pae.mean())
-        af2_pae_max = float(pae.max())
-
-        # Save structure
-        pdb_path = f"{output_dir}/af2_{idx}_{run_id}.pdb"
-        af.save_pdb(pdb_path)
-
-        # Save PAE matrix for downstream ipSAE computation.
-        # Array is in native AF2 ordering: [target | binder].
-        # binder_comparison/comparison/scoring.py reads this with ordering="target_binder".
-        pae_path = f"{output_dir}/af2_{idx}_{run_id}_pae.npy"
-        np.save(pae_path, pae)
-
-        # Console summary
-        print(
-            f"  Interface:   iptm={af2_iptm:.4f}  ipae={af2_ipae:.4f}  pae_bt={af2_pae_bt_mean:.4f}  pae_tb={af2_pae_tb_mean:.4f}"
-        )
-        print(
-            f"  Binder:      plddt_mean={af2_plddt_binder_mean:.4f}  plddt_min={af2_plddt_binder_min:.4f}  pae_bb={af2_pae_bb_mean:.4f}"
-        )
-        print(f"  Target:      plddt_mean={af2_plddt_target_mean:.4f}  pae_tt={af2_pae_tt_mean:.4f}")
-        print(f"  PAE overall: mean={af2_pae_overall_mean:.4f}  max={af2_pae_max:.4f}")
-        print(f"  PDB: {pdb_path}")
-        print(f"  PAE: {pae_path}")
-
-        row = {
-            "run_id": run_id,
-            "idx": idx,
-            "sequence": seq,
-            "target_pdb": target_pdb_path,
-            "binder_length": L_b,
-            "af2_iptm": af2_iptm,
-            "af2_plddt_binder_mean": af2_plddt_binder_mean,
-            "af2_plddt_binder_min": af2_plddt_binder_min,
-            "af2_plddt_binder_max": af2_plddt_binder_max,
-            "af2_plddt_target_mean": af2_plddt_target_mean,
-            "af2_pae_bt_mean": af2_pae_bt_mean,
-            "af2_pae_tb_mean": af2_pae_tb_mean,
-            "af2_ipae": af2_ipae,
-            "af2_pae_bb_mean": af2_pae_bb_mean,
-            "af2_pae_tt_mean": af2_pae_tt_mean,
-            "af2_pae_overall_mean": af2_pae_overall_mean,
-            "af2_pae_max": af2_pae_max,
-            "pdb": pdb_path,
-            "af2_pae_file": pae_path,
-        }
-        csv_writer.writerow(row)
-        csv_file.flush()
-
-    csv_file.close()
-
-    print(f"\n{'=' * 55}")
-    print("=== Run Complete ===")
-    print(f"Processed {len(binder_sequences)} binder(s).")
-    print(f"Results → {csv_path}")
-    print(f"PDB     → {output_dir}/af2_*_{run_id}.pdb")
-    print(f"PAE     → {output_dir}/af2_*_{run_id}_pae.npy  (for ipSAE computation)")
-    print(f"Run ID: {run_id} (for tracking this session)")
diff --git a/Evaluator/scripts/refold_af2.py b/Evaluator/scripts/refold_af2.py
deleted file mode 100644
index e7a3271..0000000
--- a/Evaluator/scripts/refold_af2.py
+++ /dev/null
@@ -1,210 +0,0 @@
-"""AF2 binder cross-evaluator using ColabDesign.
-
-Run with:  conda run -n bindcraft_pr python run_v6_protenix.py
-           conda run -n bindcraft_pr python run_v6_mosaic.py
-
-Binder protocol: target first (indices 0:L_t), binder second (indices L_t:L_t+L_b).
-"""
-
-import csv
-import os
-import uuid
-
-import numpy as np
-from colabdesign import mk_af_model
-
-OUTPUT_DIR = "af2_structures"
-AF2_DATA_DIR = os.environ.get("AF2_DATA_DIR", "/opt/bindmaster/af2_params")
-CSV_PATH = "af2_eval.csv"
-
-CSV_COLUMNS = [
-    "run_id",
-    "idx",
-    "sequence",
-    "target_pdb",
-    "binder_length",
-    "af2_iptm",
-    "af2_plddt_binder_mean",
-    "af2_plddt_binder_min",
-    "af2_plddt_binder_max",
-    "af2_plddt_target_mean",
-    "af2_pae_bt_mean",
-    "af2_pae_tb_mean",
-    "af2_ipae",
-    "af2_pae_bb_mean",
-    "af2_pae_tt_mean",
-    "af2_pae_overall_mean",
-    "af2_pae_max",
-    "pdb",
-    "af2_pae_file",  # path to saved PAE matrix (.npy); enables AF2 ipSAE computation
-]
-
-
-def refold_batch_af2(
-    binder_sequences: list,
-    target_pdb_path: str,
-    *,
-    models: list | None = None,
-    num_recycles: int = 3,
-    output_dir: str = OUTPUT_DIR,
-    csv_path: str = CSV_PATH,
-    skip_indices: set | None = None,
-):
-    """Evaluate a batch of binder sequences with AF2 multimer (ColabDesign binder protocol).
-
-    For each binder:
-      - Runs AF2 multimer prediction (target first, binder second ordering)
-      - Extracts iptm, plddt, and pae metrics sliced into binder/target regions
-      - Saves PDB and writes a CSV row immediately (incremental)
-
-    Args:
-        skip_indices: Set of 1-based binder indices to skip (already completed).
-                      When resuming, pass indices read from existing CSV.
-    """
-    if skip_indices is None:
-        skip_indices = set()
-    if models is None:
-        models = [1]
-
-    run_id = str(uuid.uuid4())[:8]
-    os.makedirs(output_dir, exist_ok=True)
-
-    print("\n=== AF2 Binder Cross-Evaluator (Version 6) ===")
-    print(f"Run ID: {run_id}")
-    print(f"Target PDB: {target_pdb_path}")
-    print(f"Binders to evaluate: {len(binder_sequences)}")
-    print(f"AF2 models: {models}  num_recycles: {num_recycles}")
-    print(f"Output directory: {output_dir}\n")
-
-    # Load AF2 model weights once, reuse across all binders
-    print("Loading AF2 model weights...")
-    af = mk_af_model(protocol="binder", use_multimer=True, data_dir=AF2_DATA_DIR)
-    print("AF2 model ready.\n")
-
-    write_header = (not os.path.exists(csv_path)) or os.path.getsize(csv_path) == 0
-    csv_file = open(csv_path, "a", newline="")
-    try:
-        csv_writer = csv.DictWriter(csv_file, fieldnames=CSV_COLUMNS)
-        if write_header:
-            csv_writer.writeheader()
-            csv_file.flush()
-
-        for idx, seq in enumerate(binder_sequences, start=1):
-            if idx in skip_indices:
-                print(f"[SKIP] Binder #{idx} already completed")
-                continue
-
-            L_b = len(seq)
-            print(f"{'─' * 55}")
-            print(f"[{idx}/{len(binder_sequences)}] length={L_b} aa  seq={seq}")
-
-            af.prep_inputs(pdb_filename=target_pdb_path, chain="A", binder_len=L_b)
-            af.set_seq(seq)
-            af.predict(models=models, num_recycles=num_recycles)
-
-            # Diagnostic dump on first binder to catch API surprises early
-            if idx == 1:
-                print(f"  [debug] af.aux keys: {sorted(af.aux.keys())}")
-                log_val = af.aux.get("log", {})
-                log_keys = sorted(log_val.keys()) if isinstance(log_val, dict) else type(log_val)
-                print(f"  [debug] af.aux['log'] keys: {log_keys}")
-                print(f"  [debug] af._inputs keys: {sorted(af._inputs.keys())}")
-
-            # Determine target length — prefer explicit field, fall back to array arithmetic
-            L_t = af._inputs.get("target_length", None)
-            if L_t is None:
-                total_len = len(af.aux["plddt"])
-                L_t = total_len - L_b
-            L_t = int(L_t)
-
-            # Slice arrays: target [0:L_t], binder [L_t:L_t+L_b]
-            plddt = np.array(af.aux["plddt"])
-            pae = np.array(af.aux["pae"])
-
-            plddt_t = plddt[:L_t]
-            plddt_b = plddt[L_t:]
-            pae_tt = pae[:L_t, :L_t]
-            pae_bt = pae[L_t:, :L_t]  # binder rows → target cols
-            pae_tb = pae[:L_t, L_t:]  # target rows → binder cols
-            pae_bb = pae[L_t:, L_t:]
-
-            # Interface iptm — prefer top-level key, fall back to log dict
-            if "i_ptm" in af.aux:
-                af2_iptm = float(af.aux["i_ptm"])
-            else:
-                log = af.aux.get("log", {})
-                af2_iptm = (
-                    float(log.get("i_ptm", log.get("iptm", float("nan")))) if isinstance(log, dict) else float("nan")
-                )
-
-            # pLDDT statistics
-            af2_plddt_binder_mean = float(plddt_b.mean()) if plddt_b.size > 0 else float("nan")
-            af2_plddt_binder_min = float(plddt_b.min()) if plddt_b.size > 0 else float("nan")
-            af2_plddt_binder_max = float(plddt_b.max()) if plddt_b.size > 0 else float("nan")
-            af2_plddt_target_mean = float(plddt_t.mean()) if plddt_t.size > 0 else float("nan")
-
-            # PAE statistics
-            af2_pae_bt_mean = float(pae_bt.mean()) if pae_bt.size > 0 else float("nan")
-            af2_pae_tb_mean = float(pae_tb.mean()) if pae_tb.size > 0 else float("nan")
-            af2_ipae = (af2_pae_bt_mean + af2_pae_tb_mean) / 2.0
-            af2_pae_bb_mean = float(pae_bb.mean()) if pae_bb.size > 0 else float("nan")
-            af2_pae_tt_mean = float(pae_tt.mean()) if pae_tt.size > 0 else float("nan")
-            af2_pae_overall_mean = float(pae.mean())
-            af2_pae_max = float(pae.max())
-
-            # Save structure
-            pdb_path = f"{output_dir}/af2_{idx}_{run_id}.pdb"
-            af.save_pdb(pdb_path)
-
-            # Save PAE matrix for downstream ipSAE computation.
-            # Array is in native AF2 ordering: [target | binder].
-            # binder_comparison/comparison/scoring.py reads this with ordering="target_binder".
-            pae_path = f"{output_dir}/af2_{idx}_{run_id}_pae.npy"
-            np.save(pae_path, pae)
-
-            # Console summary
-            print(
-                f"  Interface:   iptm={af2_iptm:.4f}  ipae={af2_ipae:.4f}  pae_bt={af2_pae_bt_mean:.4f}  pae_tb={af2_pae_tb_mean:.4f}"
-            )
-            print(
-                f"  Binder:      plddt_mean={af2_plddt_binder_mean:.4f}  plddt_min={af2_plddt_binder_min:.4f}  pae_bb={af2_pae_bb_mean:.4f}"
-            )
-            print(f"  Target:      plddt_mean={af2_plddt_target_mean:.4f}  pae_tt={af2_pae_tt_mean:.4f}")
-            print(f"  PAE overall: mean={af2_pae_overall_mean:.4f}  max={af2_pae_max:.4f}")
-            print(f"  PDB: {pdb_path}")
-            print(f"  PAE: {pae_path}")
-
-            row = {
-                "run_id": run_id,
-                "idx": idx,
-                "sequence": seq,
-                "target_pdb": target_pdb_path,
-                "binder_length": L_b,
-                "af2_iptm": af2_iptm,
-                "af2_plddt_binder_mean": af2_plddt_binder_mean,
-                "af2_plddt_binder_min": af2_plddt_binder_min,
-                "af2_plddt_binder_max": af2_plddt_binder_max,
-                "af2_plddt_target_mean": af2_plddt_target_mean,
-                "af2_pae_bt_mean": af2_pae_bt_mean,
-                "af2_pae_tb_mean": af2_pae_tb_mean,
-                "af2_ipae": af2_ipae,
-                "af2_pae_bb_mean": af2_pae_bb_mean,
-                "af2_pae_tt_mean": af2_pae_tt_mean,
-                "af2_pae_overall_mean": af2_pae_overall_mean,
-                "af2_pae_max": af2_pae_max,
-                "pdb": pdb_path,
-                "af2_pae_file": pae_path,
-            }
-            csv_writer.writerow(row)
-            csv_file.flush()
-
-    finally:
-        csv_file.close()
-
-    print(f"\n{'=' * 55}")
-    print("=== Run Complete ===")
-    print(f"Processed {len(binder_sequences)} binder(s).")
-    print(f"Results → {csv_path}")
-    print(f"PDB     → {output_dir}/af2_*_{run_id}.pdb")
-    print(f"PAE     → {output_dir}/af2_*_{run_id}_pae.npy  (for ipSAE computation)")
-    print(f"Run ID: {run_id} (for tracking this session)")
diff --git a/Evaluator/scripts/refold_protenix.py b/Evaluator/scripts/refold_protenix.py
new file mode 100644
index 0000000..a71e955
--- /dev/null
+++ b/Evaluator/scripts/refold_protenix.py
@@ -0,0 +1,426 @@
+"""Standalone batch refolder for Protenix v0.5.0.
+
+Run inside the `bindmaster_pxdesign` conda env (which ships Protenix v0.5.0
+pinned by the PXDesign installer).
+
+Emits a CSV with one row per (target, binder) pair using the top-ranked
+Protenix sample (highest ranking_score). Columns are engine-neutral
+(``iptm``, ``ptm``, ``plddt_binder_mean``, ``pae_bt_mean``, ...); the merger
+prefixes them with ``protenix_`` when aggregating across engines.
+
+pLDDT is rescaled from Protenix native 0–100 to 0–1 so downstream code can
+compare engines on the same scale.
+"""
+
+from __future__ import annotations
+
+import csv
+import json
+import os
+import tempfile
+from pathlib import Path
+
+import numpy as np
+
+
+def refold_batch(
+    binder_sequences: list[str],
+    target_sequence: str,
+    output_dir: str | os.PathLike,
+    output_csv: str | os.PathLike,
+    *,
+    num_samples: int = 5,
+    num_seeds: int = 1,
+    use_msa: bool = False,
+    n_cycle: int = 10,
+    n_step: int = 200,
+    skip_indices: set[int] | None = None,
+) -> None:
+    """Refold each binder against target using Protenix; write metrics CSV.
+
+    The top-ranked sample (by ``ranking_score``) is selected per binder; PAE is
+    saved as a .npy sidecar and the CIF path is recorded in the CSV.
+    """
+    skip_indices = skip_indices or set()
+    out_dir = Path(output_dir).resolve()
+    out_dir.mkdir(parents=True, exist_ok=True)
+    predictions_root = out_dir / "predictions"
+    predictions_root.mkdir(parents=True, exist_ok=True)
+
+    jobs = _build_job_jsons(binder_sequences, target_sequence, skip_indices=skip_indices)
+    if not jobs:
+        print("[protenix] Nothing to do (all indices skipped).")
+        return
+
+    # Protenix's inference_jsons reads a JSON file path; write to a temp file.
+    json_fd, json_path = tempfile.mkstemp(prefix="protenix_batch_", suffix=".json", dir=out_dir)
+    os.close(json_fd)
+    Path(json_path).write_text(json.dumps([j for _, j in jobs], indent=2))
+
+    seeds = tuple(range(101, 101 + num_seeds))
+
+    print(
+        f"[protenix] Running {len(jobs)} binder(s) with {num_samples} sample(s) "
+        f"per seed × {len(seeds)} seed(s)  [use_msa={use_msa}, n_cycle={n_cycle}, "
+        f"n_step={n_step}]"
+    )
+
+    # Protenix imports trigger CUDA init; import lazily.
+    from configs.configs_inference import inference_configs  # type: ignore
+    from runner.batch_inference import inference_jsons  # type: ignore
+
+    # Force Protenix to dump the full_data JSON — it contains the token-pair
+    # PAE matrix we need for DunbrackLab ipSAE computation. Default is False.
+    inference_configs["need_atom_confidence"] = True
+
+    try:
+        inference_jsons(
+            json_file=json_path,
+            out_dir=str(predictions_root),
+            use_msa=use_msa,
+            seeds=seeds,
+            n_cycle=n_cycle,
+            n_step=n_step,
+            n_sample=num_samples,
+        )
+    finally:
+        # Clean up the temp JSON even when inference raises.
+        Path(json_path).unlink(missing_ok=True)
+
+    _write_csv(jobs, predictions_root, out_dir, Path(output_csv), seeds=seeds)
+
+
+def _build_job_jsons(
+    binder_sequences: list[str],
+    target_sequence: str,
+    *,
+    skip_indices: set[int],
+) -> list[tuple[int, dict]]:
+    """Build one Protenix entry per (target, binder) pair.
+
+    Returns a list of (idx, dict) where idx is a 1-based binder index and dict
+    is the per-job Protenix schema entry.
+    """
+    jobs: list[tuple[int, dict]] = []
+    for i, seq in enumerate(binder_sequences, start=1):
+        if i in skip_indices:
+            continue
+        name = f"design_{i:04d}"
+        jobs.append(
+            (
+                i,
+                {
+                    "name": name,
+                    "sequences": [
+                        {"proteinChain": {"sequence": target_sequence, "count": 1}},
+                        {"proteinChain": {"sequence": seq, "count": 1}},
+                    ],
+                },
+            )
+        )
+    return jobs
+
+
+def _write_csv(
+    jobs: list[tuple[int, dict]],
+    predictions_root: Path,
+    out_dir: Path,
+    output_csv: Path,
+    *,
+    seeds: tuple[int, ...],
+) -> None:
+    """Parse Protenix outputs and write a flat metrics CSV."""
+    output_csv.parent.mkdir(parents=True, exist_ok=True)
+    fieldnames = [
+        "run_id",
+        "idx",
+        "sequence",
+        "target_sequence",
+        "binder_length",
+        "iptm",
+        "ptm",
+        "ranking_score",
+        "plddt_binder_mean",
+        "plddt_binder_min",
+        "plddt_target_mean",
+        "pae_bt_mean",
+        "pae_tb_mean",
+        "pae_bb_mean",
+        "pae_overall_mean",
+        "pae_max",
+        "cif",
+        "pdb",
+        "pae_file",
+    ]
+
+    rows: list[dict[str, str]] = []
+    for idx, job in jobs:
+        name = job["name"]
+        target_seq = job["sequences"][0]["proteinChain"]["sequence"]
+        binder_seq = job["sequences"][1]["proteinChain"]["sequence"]
+        binder_len = len(binder_seq)
+        target_len = len(target_seq)
+
+        row = {
+            "run_id": name,
+            "idx": str(idx),
+            "sequence": binder_seq,
+            "target_sequence": target_seq,
+            "binder_length": str(binder_len),
+        }
+
+        best = _load_top_sample(
+            predictions_root=predictions_root,
+            dataset_name="",  # unused: _load_top_sample globs both layouts
+            sample_name=name,
+            seeds=seeds,
+        )
+        if best is None:
+            print(f"[protenix] No output found for {name}; metric columns left empty")
+            for col in fieldnames[5:]:
+                row.setdefault(col, "")
+            rows.append(row)
+            continue
+
+        summary = best["summary"]
+        pae = best["pae"]  # shape [N_tokens, N_tokens], [target | binder]
+        cif_path = best["cif"]
+
+        # chain_plddt is already in [0, 1]; convention: chain 0 = target, 1 = binder.
+        chain_plddt = summary.get("chain_plddt") or []
+        plddt_target_mean = float(chain_plddt[0]) if len(chain_plddt) > 0 else float("nan")
+        plddt_binder_mean = float(chain_plddt[1]) if len(chain_plddt) > 1 else float("nan")
+
+        plddt_binder_min = _binder_atom_plddt_min(best.get("full_data"), binder_chain_asym_id=1)
+
+        pae_split = _split_pae(pae, target_len, binder_len)
+        pae_file = out_dir / f"{name}_pae.npy"
+        np.save(pae_file, pae)
+
+        row.update(
+            {
+                "iptm": _fmt(summary.get("iptm")),
+                "ptm": _fmt(summary.get("ptm")),
+                "ranking_score": _fmt(summary.get("ranking_score")),
+                "plddt_binder_mean": _fmt(plddt_binder_mean),
+                "plddt_binder_min": _fmt(plddt_binder_min),
+                "plddt_target_mean": _fmt(plddt_target_mean),
+                "pae_bt_mean": _fmt(pae_split["bt_mean"]),
+                "pae_tb_mean": _fmt(pae_split["tb_mean"]),
+                "pae_bb_mean": _fmt(pae_split["bb_mean"]),
+                "pae_overall_mean": _fmt(pae_split["overall_mean"]),
+                "pae_max": _fmt(pae_split["max"]),
+                "cif": str(cif_path) if cif_path else "",
+                "pdb": "",
+                "pae_file": str(pae_file),
+            }
+        )
+        rows.append(row)
+
+    with output_csv.open("w", newline="") as fh:
+        writer = csv.DictWriter(fh, fieldnames=fieldnames)
+        writer.writeheader()
+        writer.writerows(rows)
+    print(f"[protenix] Wrote {len(rows)} row(s) → {output_csv}")
+
+
+def _load_top_sample(
+    *,
+    predictions_root: Path,
+    dataset_name: str,
+    sample_name: str,
+    seeds: tuple[int, ...],
+) -> dict | None:
+    """Locate and load the highest-ranked Protenix sample for *sample_name*.
+
+    Protenix writes sorted-by-ranking-score samples with rank 0 as the best.
+    We take sample_0 from the first seed that exists.
+
+    Protenix v0.5.0 writes to either layout depending on the caller:
+      (a) ``<predictions_root>/<sample_name>/seed_<seed>/predictions/``
+      (b) ``<predictions_root>/<dataset_name>/<sample_name>/seed_<seed>/predictions/``
+    We probe both.
+    """
+    for seed in seeds:
+        candidates = [
+            predictions_root / sample_name / f"seed_{seed}" / "predictions",
+            *predictions_root.glob(f"*/{sample_name}/seed_{seed}/predictions"),
+        ]
+        pred_dir = next((c for c in candidates if c.is_dir()), None)
+        if pred_dir is None:
+            continue
+        cif = next(pred_dir.glob(f"{sample_name}_seed_{seed}_sample_0.cif"), None)
+        summary_fp = next(
+            pred_dir.glob(f"{sample_name}_seed_{seed}_summary_confidence_sample_0.json"),
+            None,
+        )
+        if cif is None or summary_fp is None:
+            continue
+        try:
+            summary = json.loads(Path(summary_fp).read_text())
+        except (OSError, json.JSONDecodeError):
+            continue
+
+        # Protenix stores token-pair PAE in a separate *_full_data_*.json when
+        # inference_configs["need_atom_confidence"] was True at infer time.
+        full_data_fp = next(
+            pred_dir.glob(f"{sample_name}_full_data_sample_0.json"),
+            None,
+        ) or next(
+            pred_dir.glob(f"{sample_name}_seed_{seed}_full_data_sample_0.json"),
+            None,
+        )
+        full_data = None
+        if full_data_fp is not None:
+            try:
+                full_data = json.loads(Path(full_data_fp).read_text())
+            except (OSError, json.JSONDecodeError):
+                full_data = None
+        pae = _extract_pae_from_full(full_data) if full_data else _extract_pae(summary, full_data_fp)
+        return {"summary": summary, "pae": pae, "cif": cif, "full_data": full_data}
+
+    return None
+
+
+def _extract_pae_from_full(full_data: dict) -> np.ndarray:
+    arr = full_data.get("token_pair_pae") or full_data.get("pae")
+    if arr is None:
+        return np.zeros((0, 0))
+    return np.asarray(arr, dtype=np.float32)
+
+
+def _binder_atom_plddt_min(full_data: dict | None, *, binder_chain_asym_id: int) -> float:
+    """Minimum atom pLDDT within the binder chain, rescaled to [0, 1].
+
+    Protenix's atom_plddt is in 0–100 range; divide by 100 to match Boltz-2.
+    Returns NaN when full_data is unavailable.
+    """
+    if not full_data:
+        return float("nan")
+    atom_plddt = full_data.get("atom_plddt")
+    atom_to_token = full_data.get("atom_to_token_idx")
+    token_asym = full_data.get("token_asym_id")
+    if atom_plddt is None or atom_to_token is None or token_asym is None:
+        return float("nan")
+
+    atom_plddt = np.asarray(atom_plddt, dtype=np.float32)
+    atom_to_token = np.asarray(atom_to_token, dtype=np.int64)
+    token_asym = np.asarray(token_asym, dtype=np.int64)
+    if atom_plddt.size == 0 or atom_to_token.size == 0:
+        return float("nan")
+
+    binder_token_mask = token_asym == binder_chain_asym_id
+    if not np.any(binder_token_mask):
+        return float("nan")
+
+    is_binder_atom = binder_token_mask[atom_to_token]
+    binder_vals = atom_plddt[is_binder_atom]
+    if binder_vals.size == 0:
+        return float("nan")
+    return float(binder_vals.min()) / 100.0
+
+
+def _extract_pae(summary: dict, full_data_fp: Path | None = None) -> np.ndarray:
+    """Pull the PAE matrix out of a Protenix full_data_*.json if available.
+
+    Protenix v0.5.0 writes ``token_pair_pae`` into the full_data JSON, which
+    only exists when ``inference_configs["need_atom_confidence"] == True``.
+    Falls back to ``summary``, then to an empty (0, 0) array, when it is absent.
+    """
+    if full_data_fp is not None:
+        try:
+            full = json.loads(Path(full_data_fp).read_text())
+        except (OSError, json.JSONDecodeError):
+            return np.zeros((0, 0))
+        arr = full.get("token_pair_pae") or full.get("pae")
+        if arr is not None:
+            return np.asarray(arr, dtype=np.float32)
+
+    # Fall back to whatever is in summary (older versions), else empty
+    arr = summary.get("token_pair_pae") or summary.get("pae")
+    if arr is None:
+        return np.zeros((0, 0))
+    return np.asarray(arr, dtype=np.float32)
+
+
+def _split_pae(pae: np.ndarray, target_len: int, binder_len: int) -> dict[str, float]:
+    """Summarise a PAE matrix into bt/tb/bb/overall/max scalars.
+
+    Protenix PAE follows input chain order → [target | binder].
+    """
+    if pae.ndim != 2 or pae.size == 0:
+        return {
+            "bt_mean": float("nan"),
+            "tb_mean": float("nan"),
+            "bb_mean": float("nan"),
+            "overall_mean": float("nan"),
+            "max": float("nan"),
+        }
+    total = pae.shape[0]
+    if total != target_len + binder_len:
+        # Protenix may insert extra tokens (ions, ligands); trust shape.
+        target_len = max(total - binder_len, 0)
+
+    pae_tt = pae[:target_len, :target_len]
+    pae_tb = pae[:target_len, target_len:]
+    pae_bt = pae[target_len:, :target_len]
+    pae_bb = pae[target_len:, target_len:]
+
+    def _mean(a: np.ndarray) -> float:
+        return float(a.mean()) if a.size else float("nan")
+
+    return {
+        "bt_mean": _mean(pae_bt),
+        "tb_mean": _mean(pae_tb),
+        "bb_mean": _mean(pae_bb),
+        "overall_mean": float(pae.mean()),
+        "max": float(pae.max()),
+        "tt_mean": _mean(pae_tt),
+    }
+
+
+def _fmt(v) -> str:
+    """Stringify a numeric metric; empty string for None/NaN."""
+    if v is None:
+        return ""
+    try:
+        f = float(v)
+    except (TypeError, ValueError):
+        return str(v)
+    if np.isnan(f):
+        return ""
+    return f"{f:.6g}"
+
+
+if __name__ == "__main__":
+    import argparse
+
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--sequences", required=True, help="FASTA (or plain text, one seq per line)")
+    parser.add_argument("--target-seq", required=True, help="Target amino acid sequence")
+    parser.add_argument("--output", required=True, help="Output CSV path")
+    parser.add_argument("--output-dir", default="./refold_protenix", help="Output dir for structures/PAE")
+    parser.add_argument("--num-samples", type=int, default=5)
+    parser.add_argument("--num-seeds", type=int, default=1)
+    parser.add_argument("--use-msa", action="store_true")
+    parser.add_argument("--n-cycle", type=int, default=10)
+    parser.add_argument("--n-step", type=int, default=200)
+    args = parser.parse_args()
+
+    seqs: list[str] = []
+    for line in Path(args.sequences).read_text().splitlines():
+        s = line.strip()
+        if not s or s.startswith(">"):
+            continue
+        seqs.append(s)
+    refold_batch(
+        binder_sequences=seqs,
+        target_sequence=args.target_seq,
+        output_dir=args.output_dir,
+        output_csv=args.output,
+        num_samples=args.num_samples,
+        num_seeds=args.num_seeds,
+        use_msa=args.use_msa,
+        n_cycle=args.n_cycle,
+        n_step=args.n_step,
+    )
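Review note on the `__main__` block above: it treats every non-header line as its own sequence. That matches the `--sequences` help text ("one seq per line"), but it would split a FASTA record that is wrapped across several lines. A minimal sketch of a reader that handles both cases follows. `read_sequences` is a hypothetical helper name, not part of this patch, and the rule "any `>` line means FASTA" is an assumption.

```python
from __future__ import annotations


def read_sequences(text: str) -> list[str]:
    """Parse FASTA (wrapped lines joined per record) or plain one-per-line text."""
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]

    # Assumption: without any '>' header, the input is plain text, one sequence per line.
    if not any(line.startswith(">") for line in lines):
        return lines

    sequences: list[str] = []
    current: list[str] = []
    for line in lines:
        if line.startswith(">"):
            # A new header closes the previous record, if it had residues.
            if current:
                sequences.append("".join(current))
            current = []
        else:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


wrapped_fasta = ">design_a\nMKT\nAYI\n>design_b\nGSH\n"
plain_text = "MKT\n\nGSH\n"
print(read_sequences(wrapped_fasta))
print(read_sequences(plain_text))
```

Joining wrapped lines keeps one binder per FASTA record, so the 1-based `idx` written to the CSV would line up with record order rather than with line order.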
diff --git a/PLAN_refactor_af3_rfd3.md b/PLAN_refactor_af3_rfd3.md
new file mode 100644
index 0000000..9e92c4a
--- /dev/null
+++ b/PLAN_refactor_af3_rfd3.md
@@ -0,0 +1,261 @@
+# PLAN — AF3 + Protein Hunter + RFD3 refactor (v2, platform-split)
+
+Major architecture shift: evaluation is split by platform. Design happens on x86 (RTX 3090, 24 GB); evaluation happens on DGX Spark (GB10 Grace Blackwell, 128 GB unified). This lets us use AF3 only where it has enough VRAM, and lets the x86 box focus on the design-heavy workload.
+
+## Platform split
+
+| Platform | Design tools | Evaluation engines |
+|---|---|---|
+| **x86_64** (RTX 3090, 24 GB) | BindCraft, BoltzGen, Mosaic, **RFD3**, PXDesign, Proteina-Complexa, **Protein Hunter** | Boltz-2 + **Protenix** |
+| **aarch64 / DGX Spark** (GB10, 128 GB) | Same as above where supported | Boltz-2 + Protenix + **AF3 v3.0.2** |
+
+**Hybrid workflow:** run `bindmaster configure` + `run_all.sh` on x86, `rsync runs/<name>/ spark:~/BindMaster/runs/`, run `bindmaster evaluate runs/<name>` on Spark. No code change needed — just a documented recipe in the README.
+
+---
+
+## Parts (revised)
+
+| Part | Change | Platforms |
+|---|---|---|
+| **I** | Remove AF2 refolding from Evaluator | both |
+| **J** | **Protenix** refolder (replaces AF2's role as 2nd engine on x86) | both |
+| **K** | **AF3 v3.0.2** refolder (3rd engine) | aarch64 only |
+| **L** | Protein Hunter with **all 6 modalities** | x86 primary, aarch64 best-effort |
+| **M** | RFD3 replaces RFAA — **RFAA deleted**, docs keep recipe for manual re-install | both |
+| **N** | Distributed workflow docs (design on x86, evaluate on Spark) | both |
+
+---
+
+## Decisions confirmed (from user)
+
+1. **AF3 is aarch64-only.** x86 gets Boltz-2 + Protenix. No AF3 code path for x86.
+2. **RFAA deleted** (not deprecated). Installer removes the env + clone. Keep a `docs/rfaa_manual_reinstall.md` stub with the commit SHAs + patch list so the user can hand-reproduce if ever needed.
+3. **AF3 native install** on Spark (no Docker).
+4. **Protein Hunter — all 6 modalities** (protein, cyclic peptide, ligand via CCD, ligand via SMILES, DNA, RNA).
+5. **AF3 weights** — user submits Google Form today, implementation on Spark starts once weights arrive.
+
+---
+
+## Part I — Remove AF2 refolding from Evaluator
+
+**Unchanged from v1 plan.** Strip `binder-eval-af2` env, `refold_af2.py`, `af2_*` schema, AF2 report sections. Keep all three AF2 uses *outside* Evaluator:
+- BindCraft design uses ColabDesign/AF2 internally
+- PXDesign uses AF2 as an internal eval step
+- Proteina-Complexa uses AF2 cross-val optionally
+
+~25 files touched. Full checklist in my earlier audit (see Evaluator removal list in research findings).
+
+**After Part I, `agreement_count` temporarily becomes single-engine (Boltz-2 only).** Parts J and K restore multi-engine counts with a platform-aware denominator (2 on x86, up to 3 on aarch64).
+
+---
+
+## Part J — Protenix as the universal 2nd engine
+
+Why Protenix: ByteDance's open-source AlphaFold3 re-implementation, already installed as part of `bindmaster_pxdesign` conda env at v0.5.0. It runs comfortably on 24 GB (PXDesign uses it for its own eval on the 3090). Same AF3-architecture metrics (pTM, ipTM, PAE, pLDDT), permissive license, commercial use fine.
+
+**Pre-implementation research (half-day):** audit how PXDesign currently invokes Protenix — find the Python entry point, output format, required config shape. The `bindmaster_pxdesign` env already has everything we need; we just need to write a thin refolder wrapper.
+
+**New files:**
+- `Evaluator/envs/binder-eval-protenix.yml` — or more likely, reuse `bindmaster_pxdesign` directly (that env already has Protenix + CUDA + deps)
+- `Evaluator/scripts/refold_protenix.py` — takes sequences + target, writes CSV with `protenix_*` columns
+- `Evaluator/binder_comparison/refolding/protenix_runner.py`
+- `Evaluator/binder_comparison/cli/refold_protenix.py` — `binder-compare refold-protenix` subcommand
+
+**Schema additions (mirrors AF3):**
+- `protenix_iptm`, `protenix_ptm`, `protenix_ranking_score`
+- `protenix_plddt_binder_mean`, `protenix_plddt_binder_min`
+- `protenix_pae_bt_mean`, `protenix_pae_tb_mean`, `protenix_pae_bb_mean`
+- `protenix_bt_ipsae`, `protenix_tb_ipsae`, `protenix_ipsae_min`
+
+**Orchestration:** `Evaluator/evaluate.sh` Step-2 becomes "Protenix refolding" (replaces old AF2 step). Skippable with `--skip-protenix`.
+
+**Risks:**
+- Protenix v0.5.0 is pinned inside PXDesign with 5 post-install patches. As long as we ride PXDesign's patched env, no issues.
+- 24 GB VRAM ceiling: for binder-target ≤450 tokens, fine; may need chunking for longer.
+- PAE ordering: Protenix follows the AF3 architecture, so token order should equal input order (target first, binder second); a wrong assumption here would swap the `bt`/`tb` metrics.
+
+**Acceptance:** `binder-compare refold-protenix --sequences seqs.fasta --target-seq SEQ -o protenix.csv` produces a CSV with all `protenix_*` columns. `bindmaster evaluate` picks it up and includes Protenix + Boltz-2 in the agreement count.
+
+---
+
+## Part K — AF3 v3.0.2 refolder (aarch64 / DGX Spark only)
+
+**Conditional install path.** `install/install_aarch.sh` gets a new `install_af3()` function. `install/install.sh` (x86) does not offer AF3 at all — no conda env, no CLI subcommand.
+
+**Env:** `binder-eval-af3` on aarch64 (Python 3.12, JAX 0.9.1 with aarch64 CUDA wheels, `dm-haiku==0.0.16`, `tokamax==0.0.11`).
+
+**Weights flow:**
+1. User submits Google Form today (https://forms.gle/svvpY4u2jsHEwWYS6)
+2. Google replies in 2-3 business days with download link
+3. User runs `bindmaster install --tool af3 --af3-weights <path>` on Spark — installer validates + stores path in `~/.bindmaster/af3_weights_path`
+4. `bindmaster evaluate` picks up the stored path automatically; if none is stored, the AF3 step is skipped
+
+**New files (aarch64 path):**
+- `Evaluator/envs/binder-eval-af3.yml`
+- `Evaluator/scripts/refold_af3.py` — batch writer (one JSON per pair) → `run_alphafold.py --input_dir ... --run_data_pipeline=false --force_output_dir`
+- `Evaluator/binder_comparison/refolding/af3_runner.py`
+- `Evaluator/binder_comparison/cli/refold_af3.py`
+
+**Schema additions (`af3_*`):** symmetric with `protenix_*` above — `af3_iptm`, `af3_ptm`, `af3_ranking_score`, `af3_plddt_binder_mean` (rescaled 0-100 → 0-1), `af3_pae_*`, `af3_ipsae_min`, etc.
+
+**Restored:** `agreement_count` becomes `sum(engines_with_ipsae_min > 0.61)` where engines = `{boltz, protenix, af3}` and the denominator is platform-aware (2 on x86, 3 on Spark with weights present, 2 otherwise).
+
+**Tokamax aarch64 risk:** the only new unknown. JAX 0.9.1 + CUDA 12 has confirmed aarch64 wheels, and Blackwell is officially supported in v3.0.2. If tokamax breaks on the Spark GPU, fall back to the documented `XLA_FLAGS` workaround from the AF3 README.
+
+**Acceptance (on Spark):**
+- `bindmaster install --tool af3 --af3-weights /data/af3` completes + smoke tests `run_alphafold.py --help`
+- `binder-compare refold-af3 --sequences seqs.fasta --target-seq SEQ --weights-dir /data/af3 -o af3.csv` on a 3-sequence test — all `af3_*` columns populated
+- Full evaluate on PDL1 smoke target produces a report with a 3-engine agreement count
+
+---
+
+## Part L — Protein Hunter (all 6 modalities)
+
+Clone `yehlincho/Protein-Hunter@d4bd951` into `BindMaster/Protein-Hunter/`. Conda env `bindmaster_protein_hunter` (Py 3.10), upstream `setup.sh` does conda + pip.
+
+**Weight reuse:**
+- Symlink `BindMaster/LigandMPNN/model_params/` → `Protein-Hunter/LigandMPNN/model_params/`
+- Boltz-2 cache path = `~/.boltz/` (same dir Mosaic populates — no duplicate download)
+- Chai-1 weights download fresh on first use (~5 GB)
+
+**Shortcut:** wrapper script `BindMaster/bin/protein-hunter` activates the env and shells to `python $REPO/boltz_ph/design.py "$@"` by default; a `--chai` flag switches to `chai_ph/design.py`.
+
+**Configurator pages (new):**
+- Tool toggle: Protein Hunter yes/no
+- Backbone: Boltz-2 (default) / Chai-1
+- **Modality (full 6-way choice):**
+  - Protein binder
+  - Cyclic peptide (`--cyclic`)
+  - Small-molecule via CCD (`--ligand_ccd`)
+  - Small-molecule via SMILES (`--ligand_smiles`)
+  - DNA binder (`--nucleic_seq --nucleic_type dna`)
+  - RNA binder (`--nucleic_seq --nucleic_type rna`)
+- Target sequence (extracted from PDB/CIF by configurator; multi-chain via `:`)
+- Hotspots → `--contact_residues`
+- Binder length range (`--min_protein_length`/`--max_protein_length`)
+- Design count + cycles, ipTM threshold, %X
+
+**Templates:** one `run_protein_hunter.sh.template` per modality (6 templates, or one with modality switch).
+
+**Evaluator extractor:** `Evaluator/binder_comparison/extractors/protein_hunter.py`
+- Default: parse `summary_high_iptm.csv` (high-quality filter)
+- `--all-protein-hunter-designs` flag → `summary_all_runs.csv`
+- Emit tool-colored entries in report (new color assignment in `_TOOL_COLOURS_NGL`)
+
+**aarch64 best-effort:** pyrosetta has no aarch64 wheel; chai-lab fork untested. Initial aarch64 approach: clone + try install with `install_aarch.sh`, warn if pyrosetta step fails, document as known limitation. If critical, patch to replace pyrosetta with a no-op for analysis steps.
+
+**License gotcha:** upstream root `LICENSE` file is missing despite README claiming MIT. Open an upstream issue/PR to add it before releasing BindMaster with this integration.
+
+**Acceptance:**
+- `bindmaster install --tool protein-hunter` succeeds on x86
+- Configurator generates 6 different run scripts (one per modality) correctly
+- Smoke test: `--num_designs 3 --num_cycles 2` on PDL1 target finishes, writes `summary_high_iptm.csv`
+- Evaluator picks up PH outputs and includes them in the report
+
+---
+
+## Part M — RFD3 replaces RFAA (hard delete)
+
+**Delete cleanly:**
+- Remove `install_rfaa()` + uninstall branch from both installers
+- Remove `rf_diffusion_all_atom/` + `LigandMPNN/` clones on next install (`bindmaster install --uninstall --tool rfaa` still works for existing users)
+- Remove configurator's RFAA page
+- Remove `Evaluator/binder_comparison/extractors/rfaa.py`
+- Remove all RFAA references from `run_all.sh` template generation
+
+**Add `docs/rfaa_manual_reinstall.md`** — pin commit SHAs, patch list (`idealize_backbone.py`, `residue_constants.py np.int → np.int64`), clone + conda env commands. For users who have old `runs/` and want to reproduce without BindMaster orchestration.
+
+**Install RFD3:**
+- PyPI install: `pip install "rc-foundry[rfd3]"` (pinned `v0.1.9`)
+- Conda env `bindmaster_rfd3` (Py 3.12, PyTorch ≥2.2 + CUDA)
+- `foundry install rfd3 --checkpoint-dir BindMaster/weights/foundry` downloads weights
+- **aarch64: works!** No DGL dependency, so the Grace-Hopper blocker is gone. Enable in `install_aarch.sh` for the first time.
+
+**Shortcut:** `BindMaster/bin/rfd3` → activates env + shells to `rfd3 "$@"` (Hydra entry point).
+
+**Configurator page (new, replaces old RFAA page):**
+- Input PDB/CIF
+- Contig string (Hydra `InputSpecification`)
+- `select_hotspots` (dict form)
+- `ligand` (CCD or SMILES) when ligand binder
+- `partial_t` for motif scaffolding
+- `n_batches`, `diffusion_batch_size`, `inference_sampler.num_timesteps`
+
+**No post-install patches needed** (AtomWorks replaces openfold structure normalization).
+
+**Evaluator extractor:** new `extractors/rfd3.py`, parse Hydra output structure. LigandMPNN sequence design is optional (RFD3 includes a built-in but the docs recommend MPNN refinement — inherit via `foundry`'s `models/mpnn`).
+
+**Acceptance:**
+- `bindmaster install --tool rfd3` succeeds on x86 and aarch64 (**first time** on aarch64)
+- `rfd3 design out_dir=/tmp/smoke inputs=examples/smoke_ppi.yaml n_batches=1 diffusion_batch_size=2 inference_sampler.num_timesteps=20` completes in <60 s
+- `bindmaster install --uninstall --tool rfaa` removes the legacy env
+- Configurator + run script work for both protein binder and ligand binder modalities
+
+---
+
+## Part N — Distributed workflow docs
+
+New section in `README.md` + `docs/distributed_workflow.md`:
+
+```
+# Typical workflow (x86 design + Spark evaluate)
+
+# On x86 dev box:
+bindmaster configure  # target.pdb, enabled tools, no AF3
+bash runs/<name>/run_all.sh
+rsync -av runs/<name>/ spark:~/BindMaster/runs/<name>/
+
+# On DGX Spark:
+bindmaster evaluate runs/<name>  # Boltz-2 + Protenix + AF3 (3-engine agreement)
+rsync -av spark:~/BindMaster/runs/<name>/report/ ./runs/<name>/report/
+```
+
+Optional convenience scripts:
+- `scripts/push_to_spark.sh <run_name>` — one-line rsync + ssh command
+- `scripts/pull_report.sh <run_name>` — reverse
+
+No code change to evaluate itself — it already works wherever the envs are installed.
+
+---
+
+## Implementation order — max parallelism with AF3 weight wait
+
+```
+Day 1-2   Part I (AF2 removal)              on master, single PR
+Day 2-3   Part J (Protenix refolder)        depends on I
+Day 3-5   Part L (Protein Hunter, x86)      independent PR
+Day 5-7   Part M (RFD3, delete RFAA)        independent PR — touches both installers
+Day 7-8   Part N (workflow docs)            after I, J, L, M land
+          ↓ PARALLEL: user submits AF3 Google Form on day 1
+          ↓ Google replies days 3-4
+Day 8-10  Part K (AF3 on Spark)             depends on weights arrival + J
+Day 10-11 End-to-end integration test       on both platforms
+Day 11    CHANGELOG → v0.8.0, tag, merge
+```
+
+## Branch strategy
+
+- Parent feature branch: `refactor/af3-rfd3-ph`
+- Sub-branches for PRs: `part/I-remove-af2`, `part/J-protenix`, `part/K-af3`, `part/L-protein-hunter`, `part/M-rfd3`
+- Each sub-PR needs green CI (ruff + shellcheck + docker smoke) before merging up
+
+## Quick smoke tests per part
+
+| Part | Smoke test |
+|---|---|
+| I | `bindmaster evaluate runs/smoke_pdl1 --metric ipsae_min --top 5` → Boltz-2-only report, no AF2 cols |
+| J | `binder-compare refold-protenix --sequences 3_seqs.fasta --target-seq SEQ -o protenix.csv` — all `protenix_*` cols |
+| K | (on Spark, after weights) `binder-compare refold-af3 --sequences 3_seqs.fasta --target-seq SEQ --weights-dir $BINDMASTER_AF3_WEIGHTS -o af3.csv` |
+| L | Per modality: `conda run -n bindmaster_protein_hunter python Protein-Hunter/boltz_ph/design.py --num_designs 2 --num_cycles 2 --protein_seqs <PDL1> ...` |
+| M | `rfd3 design out_dir=/tmp/rfd3_smoke inputs=examples/smoke_ppi.yaml n_batches=1 diffusion_batch_size=2 inference_sampler.num_timesteps=20` |
+
+## End-to-end integration test (Part N1)
+
+1. On x86: clean install, then `bindmaster configure` with PDL1, enabling BindCraft, BoltzGen, Mosaic, PXDesign, Proteina-Complexa, Protein Hunter (all 6 modalities across 6 runs), RFD3. Skip BindCraft AF2 eval for speed.
+2. Run `bash runs/PDL1_all/run_all.sh` — shortened to 3 designs per tool
+3. Extract + refold with Boltz-2 + Protenix on x86 → check `agreement_count ∈ {0, 1, 2}`, no `af2_*` columns
+4. rsync `runs/PDL1_all/` to Spark
+5. On Spark: re-run `bindmaster evaluate runs/PDL1_all` — `agreement_count ∈ {0, 1, 2, 3}`, AF3 columns present
+6. Open report HTML, verify all tool colors + AF3 ranking + 3D viewer.
+
+Done when both x86 and Spark reports render and tool list matches enabled set.
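Review note on Part K of the plan above: the restored `agreement_count` rule (engines whose `ipsae_min` exceeds 0.61, with a platform-aware denominator) can be sketched as below. The function name, the engine tuples, and the choice to count a missing or NaN score as "not passing" are assumptions for illustration; the plan does not specify them.

```python
from __future__ import annotations

IPSAE_MIN_CUTOFF = 0.61  # threshold quoted in Part K of the plan

# Assumed engine sets per platform: 2 on x86, 3 on Spark once AF3 weights are present.
X86_ENGINES = ("boltz", "protenix")
SPARK_ENGINES = ("boltz", "protenix", "af3")


def agreement_count(
    ipsae_min_by_engine: dict[str, float | None],
    engines: tuple[str, ...],
) -> tuple[int, int]:
    """Return (engines above the cutoff, number of engines on this platform)."""
    passing = 0
    for engine in engines:
        value = ipsae_min_by_engine.get(engine)
        # None means the engine produced no score; NaN compares False against the cutoff.
        if value is not None and value > IPSAE_MIN_CUTOFF:
            passing += 1
    return passing, len(engines)


x86_row = {"boltz": 0.72, "protenix": 0.55}
spark_row = {"boltz": 0.72, "protenix": 0.66, "af3": 0.80}
print(agreement_count(x86_row, X86_ENGINES))
print(agreement_count(spark_row, SPARK_ENGINES))
```

Returning the denominator next to the count lets a report show "1/2" on x86 and "3/3" on Spark without the reader having to know which engines were installed.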
diff --git a/configurator/configurator.py b/configurator/configurator.py
index 43ec3c0..a25e475 100644
--- a/configurator/configurator.py
+++ b/configurator/configurator.py
@@ -200,10 +200,7 @@ def _env_exists(name: str) -> bool:
         "bindcraft": _env_exists("BindCraft"),
         "boltzgen": _env_exists("BoltzGen"),
         "mosaic": (MOSAIC_VENV / "bin" / "python").exists(),
-        "evaluator": (
-            (EVALUATOR_DIR / "evaluate.sh").exists()
-            and (_env_exists("binder-eval-boltz2") or _env_exists("binder-eval-af2"))
-        ),
+        "evaluator": ((EVALUATOR_DIR / "evaluate.sh").exists() and _env_exists("binder-eval")),
         "rfaa": _env_exists("bindmaster_rfaa"),
         "pxdesign_local": _env_exists("bindmaster_pxdesign"),
         "proteina_complexa": (PROTEINA_COMPLEXA_VENV / "bin" / "python").exists(),
@@ -1988,7 +1985,7 @@ def run_pipeline(cfg: dict, tools_enabled: dict):
             failed.append("Proteina-Complexa")
 
     if tools_enabled.get("evaluator"):
-        print_step("Running Evaluator  (Boltz2 + AF2 refolding — this may take a while)")
+        print_step("Running Evaluator  (Boltz-2 refolding + ranked report — this may take a while)")
         rc = subprocess.run(["bash", str(run_dir / "run_evaluate.sh")]).returncode
         if rc == 0:
             print_ok("Evaluator completed")
@@ -2203,7 +2200,7 @@ def _tag(key):
     print(f"  {BOLD}Proteina-Complexa{RESET} [{_tag('proteina_complexa')}]")
     use_proteina_complexa = ask_yn("  Enable Proteina-Complexa (NVIDIA flow matching)?", default=False)
     print(f"  {BOLD}Evaluator{RESET} [{_tag('evaluator')}]")
-    use_evaluator = ask_yn("  Enable cross-evaluation (Boltz2 + AF2 refolding)?", default=False)
+    use_evaluator = ask_yn("  Enable cross-evaluation (Boltz-2 refolding + ranked report)?", default=False)
 
     tools_enabled = {
         "mosaic": use_mosaic,
@@ -2559,7 +2556,7 @@ def _tag(key):
             f"max_designs={cfg.get('complexa_n_designs')}"
         )
     if use_evaluator:
-        print(f"  {CYAN}Evaluator{RESET}:     Boltz2 + AF2 refolding → evaluate report")
+        print(f"  {CYAN}Evaluator{RESET}:     Boltz-2 refolding → ranked report")
 
     print_tree(run_dir, tools_enabled, cfg)
 
@@ -2579,7 +2576,7 @@ def _tag(key):
     if use_boltzgen:
         print_warn("BoltzGen downloads ~6 GB of weights on first run.")
     if use_evaluator:
-        print_warn("Evaluator runs Boltz2 + AF2 refolding (GPU recommended, ~30 min per design).")
+        print_warn("Evaluator runs Boltz-2 refolding (GPU recommended, ~30 min per design).")
 
     if ask_yn("Run the pipeline now?", default=True):
         run_pipeline(cfg, tools_enabled)
diff --git a/docs/rfaa_manual_reinstall.md b/docs/rfaa_manual_reinstall.md
new file mode 100644
index 0000000..b2ddaa5
--- /dev/null
+++ b/docs/rfaa_manual_reinstall.md
@@ -0,0 +1,115 @@
+# RFAA — legacy maintenance notes
+
+**Status (2026-04):** RFAA (`baker-laboratory/rf_diffusion_all_atom`) is
+**deprecated in BindMaster**. The Baker lab stopped commit activity in March
+2024 and superseded the project with **RFdiffusion3 / foundry**
+(`RosettaCommons/foundry`) in December 2025. BindMaster's default all-atom
+tool is now **RFD3** — install it with `bindmaster install --tool rfd3`.
+
+RFAA remains installable for backwards-compatibility with existing `runs/`
+directories and for users who need the original RFAA weights. The interactive
+menu no longer offers it; opt in explicitly:
+
+```bash
+bindmaster install --tool rfaa            # x86_64 only
+bindmaster install --uninstall --tool rfaa   # when you're ready to remove it
+```
+
+---
+
+## Why RFAA is deprecated
+
+- **Upstream dormant.** HEAD of `baker-laboratory/rf_diffusion_all_atom`
+  is still the 2024-03-13 "LICENSE" merge. 2 years of unresolved issues
+  (#21 TRP side-chain bug, #26 run-outside-install-dir, #32 GPU parsing, …).
+- **aarch64 blocker.** DGL has no CUDA-enabled aarch64 wheels, so the
+  SE3-Transformer path doesn't work on DGX Spark / Grace-Hopper.
+- **RFD3 supersedes it.** Atom-level diffusion replaces the graph-level
+  stack, handles ligand + nucleic-acid binders natively, BSD-3-Clause
+  license, no DGL.
+
+## Manually reproducing an existing RFAA run
+
+1. Identify the commits your run was installed from:
+   - `rf_diffusion_all_atom` HEAD: `f913a19e16f30858ce7a724fe028475b1871319c`
+     (2024-03-13 — this IS upstream HEAD)
+   - `LigandMPNN` HEAD: `26ec57ac976ade5379920dbd43c7f97a91cf82de`
+     (2025-02-06)
+
+2. Clone + pin:
+
+   ```bash
+   git clone https://github.com/baker-laboratory/rf_diffusion_all_atom.git
+   (cd rf_diffusion_all_atom && git checkout f913a19 && git submodule update --init --recursive)
+   git clone https://github.com/dauparas/LigandMPNN.git
+   (cd LigandMPNN && git checkout 26ec57a)
+   ```
+
+3. Create the conda env (x86_64 only; aarch64 won't work — DGL blocker):
+
+   ```bash
+   conda create -n bindmaster_rfaa -y python=3.11 \
+       "pytorch>=2.2" "pytorch-cuda=12.4" gcc_linux-64 gxx_linux-64 \
+       -c pytorch -c nvidia -c conda-forge
+   conda run -n bindmaster_rfaa pip install -q hydra-core omegaconf icecream \
+       scipy "numpy<2" pandas tqdm fire assertpy deepdiff opt-einsum e3nn \
+       ml_collections dm-tree "dgl==1.1.3+cu121" \
+       -f https://data.dgl.ai/wheels/cu121/repo.html \
+       "torchdata==0.7.1" prody openbabel-wheel
+   ```
+
+4. Post-install patches (required by the 2024-03 codebase on modern numpy /
+   PyRosetta stacks — see `install/install.sh:install_rfaa()` for the
+   current implementation):
+
+   - `rf_diffusion_all_atom/idealize_backbone.py` — the upstream asserts
+     exactly one ligand; patch to accept zero (protein-only) designs.
+   - `LigandMPNN/openfold/np/residue_constants.py` — replace `np.int`
+     (removed in NumPy 1.24) with `np.int64`.
+
+5. Download weights:
+
+   ```bash
+   mkdir -p rf_diffusion_all_atom/weights && wget -O rf_diffusion_all_atom/weights/RFDiffusionAA_paper_weights.pt \
+       http://files.ipd.uw.edu/pub/RF-All-Atom/weights/RFDiffusionAA_paper_weights.pt
+   (cd LigandMPNN && bash get_model_params.sh ./model_params)
+   ```
+
+6. Use via PYTHONPATH (RFAA is not pip-installable):
+
+   ```bash
+   export PYTHONPATH="$(pwd)/rf_diffusion_all_atom:$(pwd)/LigandMPNN${PYTHONPATH:+:$PYTHONPATH}"
+   conda activate bindmaster_rfaa
+   # Then follow the RFAA README for `rf_diffusion_all_atom/run_inference.py` flags
+   ```
+
+## Useful unmerged upstream PRs
+
+These PRs are small and fix real bugs, but none of them is in the pinned
+commit. Consider cherry-picking them if you hit the symptoms:
+
+- **#21** (2024-07, unmerged) — TRP side-chain N/C atom ordering fix in
+  `chemical.py`. Affects motif-scaffolded ligand binders with tryptophan.
+- **#26** (2024-09, unmerged) — allow running outside the install dir
+  (checkpoint + output path search).
+- **#37** (2025-05, merged post-our-pin) — `ContigMap.inpaint_seq` fix for
+  motif-scaffolded designs.
+
+## Migration to RFD3
+
+Configs are **not compatible** — RFAA uses Hydra YAMLs with
+`contigmap.contigs` / `potentials`; RFD3 uses an
+`InputSpecification` schema with top-level `contig`, `ligand`,
+`select_hotspots`, `select_fixed_atoms`, `partial_t`. BindMaster's
+configurator regenerates the config when you switch tools.
+
+The contig strings themselves are broadly compatible: the old `A40-60/0/70`
+idioms port with minor edits, and RFD3 adds an `unindex` block for floating motifs.
+
+AtomWorks (RFD3's framework) handles structure normalisation internally, so
+the RFAA `idealize_backbone.py` / `residue_constants.py np.int → np.int64`
+post-install patches are **no longer needed** for RFD3.
+
+See `install/install.sh:install_rfd3()` and the
+[foundry docs](https://rosettacommons.github.io/foundry/) for the
+current RFD3 integration.
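
Step 4 of `docs/rfaa_manual_reinstall.md` describes the `np.int` patch only in prose. As a rough illustration, the rewrite could be sketched as below. This is a hypothetical helper (`patch_np_int` is an invented name), it assumes GNU sed, and it is not the code that `install_rfaa()` actually runs; consult `install/install.sh` for the real patch.

```bash
#!/usr/bin/env bash
# Sketch only: rewrite the removed NumPy alias `np.int` to `np.int64`.
# Assumptions: GNU sed (for \b and in-place -i); `patch_np_int` is a
# hypothetical helper name, not something defined in install.sh.
set -euo pipefail

patch_np_int() {
    local target="$1"
    [[ -f "${target}" ]] || { echo "missing: ${target}" >&2; return 1; }
    # The word boundaries leave np.int32, np.int64 and np.int_ untouched;
    # only a bare np.int is rewritten.
    sed -i -E 's/\bnp\.int\b/np.int64/g' "${target}"
}

# Demo on a throwaway file rather than the real residue_constants.py.
demo="$(mktemp)"
printf 'a = np.zeros(3, dtype=np.int)\nb = np.int32(1)\nc = np.int64(2)\n' > "${demo}"
patch_np_int "${demo}"
cat "${demo}"
```

Pointing the helper at `LigandMPNN/openfold/np/residue_constants.py` would apply the same rewrite to the real file. Running it twice is harmless, because `np.int64` no longer matches the pattern.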
diff --git a/install/install.sh b/install/install.sh
index 35b4bda..5958555 100755
--- a/install/install.sh
+++ b/install/install.sh
@@ -1,9 +1,10 @@
 #!/bin/bash
 # BindMaster Installer
-# Installs BindCraft, BoltzGen, and/or Mosaic protein design tools.
+# Installs BindCraft, BoltzGen, Mosaic, RFD3, PXDesign, Proteina-Complexa,
+# Protein-Hunter, the Evaluator, and/or RFAA (legacy).
 #
 # Usage:
-#   bash install/install.sh [--tool bindcraft|boltzgen|mosaic|evaluator|rfaa|pxdesign|proteina-complexa|all] [--cuda VERSION] [--skip-examples] [--yes]
+#   bash install/install.sh [--tool bindcraft|boltzgen|mosaic|evaluator|rfaa|rfd3|pxdesign|proteina-complexa|protein-hunter|all] [--cuda VERSION] [--skip-examples] [--yes]
 #   bindmaster install [same options]
 #
 # With no --tool flag, an interactive menu lets you choose which tools to install.
@@ -35,6 +36,17 @@ PXDESIGN_DIR="${BINDMASTER_DIR}/PXDesign"
 PROTEINA_COMPLEXA_REPO="https://github.com/NVIDIA-Digital-Bio/proteina-complexa.git"
 PROTEINA_COMPLEXA_COMMIT="HEAD"
 PROTEINA_COMPLEXA_DIR="${BINDMASTER_DIR}/Proteina-Complexa"
+PROTEIN_HUNTER_REPO="https://github.com/yehlincho/Protein-Hunter.git"
+PROTEIN_HUNTER_COMMIT="d4bd9515882c2aa81e97f3d3bf7f42247a9fe80c"
+PROTEIN_HUNTER_DIR="${BINDMASTER_DIR}/Protein-Hunter"
+# RFD3 / Foundry (Baker lab's RFdiffusion3; replaces RFAA).
+# Installed from PyPI as rc-foundry — no clone needed. Variables kept for
+# documentation + uninstall (FOUNDRY_DIR is cleaned on uninstall if present).
+# shellcheck disable=SC2034
+FOUNDRY_REPO="https://github.com/RosettaCommons/foundry.git"
+FOUNDRY_COMMIT="v0.1.9"
+FOUNDRY_DIR="${BINDMASTER_DIR}/Foundry"
+FOUNDRY_WEIGHTS_DIR="${BINDMASTER_DIR}/weights/foundry"
 
 CONDA_CMD=""          # set by detect_conda: full path to mamba (preferred) or conda
 ARCH="$(uname -m)"   # x86_64 or aarch64 (e.g. DGX Spark / Grace-Hopper)
@@ -60,9 +72,11 @@ DO_BINDCRAFT=false
 DO_BOLTZGEN=false
 DO_MOSAIC=false
 DO_EVALUATOR=false
-DO_RFAA=false
+DO_RFAA=false           # legacy — opt-in via --tool rfaa; RFD3 is the default all-atom tool
 DO_PXDESIGN=false
 DO_PROTEINA_COMPLEXA=false
+DO_PROTEIN_HUNTER=false
+DO_RFD3=false
 
 # ─── Argument Parsing ─────────────────────────────────────────────────────────
 while [[ $# -gt 0 ]]; do
@@ -71,7 +85,9 @@ while [[ $# -gt 0 ]]; do
             TOOL_SPECIFIED=true
             case "${2,,}" in
                 all)
-                    DO_BINDCRAFT=true; DO_BOLTZGEN=true; DO_MOSAIC=true; DO_EVALUATOR=true; DO_RFAA=true; DO_PXDESIGN=true; DO_PROTEINA_COMPLEXA=true ;;
+                    # "all" installs current-generation tools. RFAA is legacy
+                    # (replaced by RFD3); opt in explicitly with --tool rfaa.
+                    DO_BINDCRAFT=true; DO_BOLTZGEN=true; DO_MOSAIC=true; DO_EVALUATOR=true; DO_PXDESIGN=true; DO_PROTEINA_COMPLEXA=true; DO_PROTEIN_HUNTER=true; DO_RFD3=true ;;
                 bindcraft)
                     DO_BINDCRAFT=true ;;
                 boltzgen)
@@ -82,12 +98,16 @@ while [[ $# -gt 0 ]]; do
                     DO_EVALUATOR=true ;;
                 rfaa)
                     DO_RFAA=true ;;
+                rfd3|foundry)
+                    DO_RFD3=true ;;
                 pxdesign)
                     DO_PXDESIGN=true ;;
                 proteina-complexa|proteina_complexa|complexa)
                     DO_PROTEINA_COMPLEXA=true ;;
+                protein-hunter|protein_hunter|phunter)
+                    DO_PROTEIN_HUNTER=true ;;
                 *)
-                    echo -e "${RED}Invalid --tool value: $2. Must be one of: all, bindcraft, boltzgen, mosaic, evaluator, rfaa, pxdesign, proteina-complexa${RESET}"
+                    echo -e "${RED}Invalid --tool value: $2. Must be one of: all, bindcraft, boltzgen, mosaic, evaluator, rfaa, rfd3, pxdesign, proteina-complexa, protein-hunter${RESET}"
                     exit 1
                     ;;
             esac
@@ -473,6 +493,14 @@ is_pxdesign_installed() {
     [[ -d "${PXDESIGN_DIR}" ]] && env_exists bindmaster_pxdesign
 }
 
+is_protein_hunter_installed() {
+    [[ -d "${PROTEIN_HUNTER_DIR}" ]] && env_exists bindmaster_protein_hunter
+}
+
+is_rfd3_installed() {
+    env_exists bindmaster_rfd3
+}
+
 is_proteina_complexa_installed() {
     [[ -d "${PROTEINA_COMPLEXA_DIR}" ]] && [[ -d "${PROTEINA_COMPLEXA_DIR}/.venv" ]]
 }
@@ -501,45 +529,50 @@ print_tool_status() {
 # DO_BINDCRAFT / DO_BOLTZGEN / DO_MOSAIC based on user choices.
 
 select_tools_interactive() {
-    # Default: all selected
+    # Default: current-generation tools selected. RFAA is legacy and not shown
+    # here; opt in with `--tool rfaa` on the CLI. RFD3 replaces it in the menu.
     local sel_bc=true
     local sel_bg=true
     local sel_mo=true
     local sel_ev=true
-    local sel_rfaa=false
+    local sel_rfd3=true
     local sel_pxd=false
     local sel_pc=false
+    local sel_ph=false
 
-    local tools=("BindCraft" "BoltzGen" "Mosaic" "Evaluator" "RFAA" "PXDesign" "Proteina-Complexa")
+    local tools=("BindCraft" "BoltzGen" "Mosaic" "Evaluator" "RFD3" "PXDesign" "Proteina-Complexa" "Protein-Hunter")
     local descs=(
         "Binder design via AlphaFold2 (conda, Python 3.10)"
         "Structure generation with Boltz-1 (conda, Python 3.12, ~6 GB download)"
         "JAX-based protein design with Marimo notebooks (uv venv)"
-        "Evaluate binders: refold with Boltz-2 + AF2, ranked report (requires Mosaic)"
-        "All-atom diffusion + LigandMPNN for ligand binder design (conda)"
+        "Evaluate binders: refold with Boltz-2 (+ Protenix on x86, AF3 on aarch64), ranked report (requires Mosaic)"
+        "RFD3 / foundry — all-atom diffusion for protein + ligand + NA binders (conda, replaces RFAA)"
         "Protenix-based de novo binder design (conda)"
         "NVIDIA flow matching + test-time compute binder design (uv venv)"
+        "Protein-Hunter — Boltz/Chai hallucination: protein/cyclic/ligand/DNA/RNA binders (conda)"
     )
 
     # Check current install state once (avoid repeated conda calls in the loop)
-    local inst_bc inst_bg inst_mo inst_ev inst_rfaa inst_pxd inst_pc
+    local inst_bc inst_bg inst_mo inst_ev inst_rfd3 inst_pxd inst_pc inst_ph
     is_bindcraft_installed && inst_bc="${GREEN}installed${RESET}" || inst_bc="${YELLOW}not installed${RESET}"
     is_boltzgen_installed  && inst_bg="${GREEN}installed${RESET}" || inst_bg="${YELLOW}not installed${RESET}"
     is_mosaic_installed    && inst_mo="${GREEN}installed${RESET}" || inst_mo="${YELLOW}not installed${RESET}"
     is_evaluator_installed && inst_ev="${GREEN}installed${RESET}" || inst_ev="${YELLOW}not installed${RESET}"
-    is_rfaa_installed      && inst_rfaa="${GREEN}installed${RESET}" || inst_rfaa="${YELLOW}not installed${RESET}"
+    is_rfd3_installed      && inst_rfd3="${GREEN}installed${RESET}" || inst_rfd3="${YELLOW}not installed${RESET}"
     is_pxdesign_installed  && inst_pxd="${GREEN}installed${RESET}" || inst_pxd="${YELLOW}not installed${RESET}"
     is_proteina_complexa_installed && inst_pc="${GREEN}installed${RESET}" || inst_pc="${YELLOW}not installed${RESET}"
-    local inst_states=("$inst_bc" "$inst_bg" "$inst_mo" "$inst_ev" "$inst_rfaa" "$inst_pxd" "$inst_pc")
+    is_protein_hunter_installed    && inst_ph="${GREEN}installed${RESET}" || inst_ph="${YELLOW}not installed${RESET}"
+    local inst_states=("$inst_bc" "$inst_bg" "$inst_mo" "$inst_ev" "$inst_rfd3" "$inst_pxd" "$inst_pc" "$inst_ph")
 
     # Helper: print current state
     _print_menu() {
         echo ""
         echo -e "${BOLD}${CYAN}  Select tools to install${RESET}"
         echo -e "  Type a number to toggle selection, then press Enter when done."
+        echo -e "  ${YELLOW}(note: RFAA is legacy — use ${CYAN}--tool rfaa${RESET}${YELLOW} on the CLI to opt in)${RESET}"
         echo ""
-        local states=("$sel_bc" "$sel_bg" "$sel_mo" "$sel_ev" "$sel_rfaa" "$sel_pxd" "$sel_pc")
-        for i in 0 1 2 3 4 5 6; do
+        local states=("$sel_bc" "$sel_bg" "$sel_mo" "$sel_ev" "$sel_rfd3" "$sel_pxd" "$sel_pc" "$sel_ph")
+        for i in 0 1 2 3 4 5 6 7; do
             local box
             if [[ "${states[$i]}" == true ]]; then
                 box="${GREEN}[x]${RESET}"
@@ -563,20 +596,21 @@ select_tools_interactive() {
             2) [[ "$sel_bg" == true ]] && sel_bg=false || sel_bg=true ;;
             3) [[ "$sel_mo" == true ]] && sel_mo=false || sel_mo=true ;;
             4) [[ "$sel_ev" == true ]] && sel_ev=false || sel_ev=true ;;
-            5) [[ "$sel_rfaa" == true ]] && sel_rfaa=false || sel_rfaa=true ;;
+            5) [[ "$sel_rfd3" == true ]] && sel_rfd3=false || sel_rfd3=true ;;
             6) [[ "$sel_pxd" == true ]] && sel_pxd=false || sel_pxd=true ;;
             7) [[ "$sel_pc" == true ]] && sel_pc=false || sel_pc=true ;;
-            a) sel_bc=true;  sel_bg=true;  sel_mo=true;  sel_ev=true;  sel_rfaa=true;  sel_pxd=true;  sel_pc=true  ;;
-            n) sel_bc=false; sel_bg=false; sel_mo=false; sel_ev=false; sel_rfaa=false; sel_pxd=false; sel_pc=false ;;
+            8) [[ "$sel_ph" == true ]] && sel_ph=false || sel_ph=true ;;
+            a) sel_bc=true;  sel_bg=true;  sel_mo=true;  sel_ev=true;  sel_rfd3=true;  sel_pxd=true;  sel_pc=true;  sel_ph=true  ;;
+            n) sel_bc=false; sel_bg=false; sel_mo=false; sel_ev=false; sel_rfd3=false; sel_pxd=false; sel_pc=false; sel_ph=false ;;
             "")
                 # Confirm: at least one must be selected
-                if [[ "$sel_bc" == false && "$sel_bg" == false && "$sel_mo" == false && "$sel_ev" == false && "$sel_rfaa" == false && "$sel_pxd" == false && "$sel_pc" == false ]]; then
+                if [[ "$sel_bc" == false && "$sel_bg" == false && "$sel_mo" == false && "$sel_ev" == false && "$sel_rfd3" == false && "$sel_pxd" == false && "$sel_pc" == false && "$sel_ph" == false ]]; then
                     echo -e "  ${RED}No tools selected. Select at least one.${RESET}"
                     continue
                 fi
                 break
                 ;;
-            *) echo -e "  ${RED}Invalid input. Enter 1–7, a, n, or press Enter.${RESET}" ;;
+            *) echo -e "  ${RED}Invalid input. Enter 1–8, a, n, or press Enter.${RESET}" ;;
         esac
     done
 
@@ -584,9 +618,10 @@ select_tools_interactive() {
     DO_BOLTZGEN="$sel_bg"
     DO_MOSAIC="$sel_mo"
     DO_EVALUATOR="$sel_ev"
-    DO_RFAA="$sel_rfaa"
+    DO_RFD3="$sel_rfd3"
     DO_PXDESIGN="$sel_pxd"
     DO_PROTEINA_COMPLEXA="$sel_pc"
+    DO_PROTEIN_HUNTER="$sel_ph"
 
     echo ""
     echo -e "  ${BOLD}Installing:${RESET}"
@@ -594,9 +629,11 @@ select_tools_interactive() {
     [[ "$DO_BOLTZGEN"  == true ]] && echo -e "    ${GREEN}✓${RESET} BoltzGen"
     [[ "$DO_MOSAIC"    == true ]] && echo -e "    ${GREEN}✓${RESET} Mosaic"
     [[ "$DO_EVALUATOR" == true ]] && echo -e "    ${GREEN}✓${RESET} Evaluator"
-    [[ "$DO_RFAA"      == true ]] && echo -e "    ${GREEN}✓${RESET} RFAA"
+    [[ "$DO_RFAA"      == true ]] && echo -e "    ${YELLOW}✓ RFAA (legacy)${RESET}"
+    [[ "$DO_RFD3"      == true ]] && echo -e "    ${GREEN}✓${RESET} RFD3"
     [[ "$DO_PXDESIGN"  == true ]] && echo -e "    ${GREEN}✓${RESET} PXDesign"
     [[ "$DO_PROTEINA_COMPLEXA" == true ]] && echo -e "    ${GREEN}✓${RESET} Proteina-Complexa"
+    [[ "$DO_PROTEIN_HUNTER" == true ]] && echo -e "    ${GREEN}✓${RESET} Protein-Hunter"
     echo ""
 
     confirm "Proceed with installation?" || { echo "Aborted."; exit 0; }
@@ -1083,23 +1120,10 @@ install_evaluator() {
         "${CONDA_CMD}" run -n binder-eval pip install -q -e "${EVALUATOR_DIR}[report]" \
         || { print_fail "Failed to install binder-compare into binder-eval"; return 1; }
 
-    # binder-eval-af2 conda env (AF2 refolding via ColabDesign)
-    print_step "Creating binder-eval-af2 conda environment (Python 3.10)"
-    if env_exists binder-eval-af2; then
-        print_warn "Conda environment 'binder-eval-af2' already exists — skipping creation."
-    else
-        run_logged "Creating binder-eval-af2 conda env" \
-            "${CONDA_CMD}" env create -f "${EVALUATOR_DIR}/envs/binder-eval-af2.yml" -y \
-            || { print_fail "Failed to create binder-eval-af2 conda env"; return 1; }
-    fi
-    run_logged "Installing ColabDesign + binder-compare into binder-eval-af2" \
-        "${CONDA_CMD}" run -n binder-eval-af2 pip install -q colabdesign==1.1.1 -e "${EVALUATOR_DIR}[af2]" \
-        || { print_fail "Failed to install packages into binder-eval-af2"; return 1; }
-
-    # ColabDesign pulls CPU-only JAX by default; install CUDA plugin
-    run_logged "Installing JAX CUDA plugin into binder-eval-af2" \
-        "${CONDA_CMD}" run -n binder-eval-af2 pip install -q "jax[cuda12]" \
-        || { print_fail "Failed to install JAX CUDA plugin"; return 1; }
+    # (AF2 refolding was removed in the AF3/Protenix refactor; the
+    #  binder-eval-af2 env is no longer created. Protenix refolding will
+    #  reuse the existing bindmaster_pxdesign env; AF3 refolding lands on
+    #  aarch64 only via install_aarch.sh.)
 
     # Smoke test
     smoke_test "binder-compare --help" \
@@ -1112,7 +1136,6 @@ install_evaluator() {
     print_ok "Shortcut installed at ${SHORTCUTS_DIR}/evaluate"
 
     print_ok "Evaluator installation complete"
-    print_ok "  AF2 weights (~4 GB) must be at \$AF2_DATA_DIR — see Evaluator/docs/pipeline_reference.md"
 }
 
 _write_evaluator_shortcut() {
@@ -1133,7 +1156,12 @@ EOF
 # ─── RFAA + LigandMPNN ──────────────────────────────────────────────────────
 
 install_rfaa() {
-    print_step "Installing RFDiffusionAA + LigandMPNN"
+    print_step "Installing RFDiffusionAA + LigandMPNN (legacy)"
+    print_warn "RFAA is LEGACY in this BindMaster release."
+    print_warn "  Upstream has been dormant since 2024-03; Baker lab moved to"
+    print_warn "  RFdiffusion3 (now available via --tool rfd3)."
+    print_warn "  RFAA is kept for reproducibility of existing runs; see"
+    print_warn "  docs/rfaa_manual_reinstall.md for long-term maintenance notes."
 
     # Clone RFAA
     if [[ -d "${RFAA_DIR}" ]]; then
@@ -1466,6 +1494,14 @@ LNEOF
         "${CONDA_CMD}" run -n bindmaster_pxdesign python -c "import torch; print('PXDesign env OK')" \
         || return 1
 
+    # Install binder-compare into the PXDesign env so Protenix refolding
+    # (Part J) can run via `conda run -n bindmaster_pxdesign binder-compare refold-protenix`.
+    if [[ -d "${EVALUATOR_DIR}" ]]; then
+        run_logged "Installing binder-compare into bindmaster_pxdesign (for Protenix refold)" \
+            "${CONDA_CMD}" run -n bindmaster_pxdesign pip install -q -e "${EVALUATOR_DIR}[report]" \
+            || print_warn "binder-compare install into bindmaster_pxdesign failed — Protenix refolding will be unavailable"
+    fi
+
     # Shortcut
     mkdir -p "${SHORTCUTS_DIR}"
     cat > "${SHORTCUTS_DIR}/pxdesign" << PXDEOF
@@ -1744,6 +1780,213 @@ EOF
     chmod +x "${SHORTCUTS_DIR}/complexa"
 }
 
+# ─── RFD3 / Foundry (RosettaCommons) ─────────────────────────────────────────
+# Butcher et al. 2025. BSD-3-Clause. PyPI: `rc-foundry`. Replaces RFAA entirely
+# — no DGL, no SE3-Transformer, works on aarch64 / DGX Spark. Weights live
+# under BindMaster/weights/foundry.
+
+install_rfd3() {
+    print_step "Installing RFD3 (foundry)"
+
+    # Conda env: Py 3.12 + PyTorch 2.2+ (CUDA 12.x)
+    if env_exists bindmaster_rfd3; then
+        print_warn "Conda environment 'bindmaster_rfd3' already exists — skipping creation."
+    else
+        run_logged "Creating bindmaster_rfd3 env" \
+            "${CONDA_CMD}" create -n bindmaster_rfd3 -y python=3.12 pip \
+            -c conda-forge \
+            || { print_fail "Failed to create bindmaster_rfd3 env"; return 1; }
+    fi
+
+    # PyTorch (CUDA 12.1 wheels — works for 12.1–12.8 host drivers)
+    run_logged "Installing PyTorch (CUDA 12.1)" \
+        "${CONDA_CMD}" run -n bindmaster_rfd3 \
+        pip install -q "torch>=2.2" "torchvision" "torchaudio" --index-url https://download.pytorch.org/whl/cu121 \
+        || { print_fail "Failed to install PyTorch"; return 1; }
+
+    # foundry + rfd3 extra (PyPI package name is `rc-foundry`)
+    run_logged "Installing rc-foundry[rfd3] ${FOUNDRY_COMMIT}" \
+        "${CONDA_CMD}" run -n bindmaster_rfd3 \
+        pip install -q "rc-foundry[rfd3]==0.1.9" \
+        || { print_fail "Failed to install rc-foundry"; return 1; }
+
+    # Also install MPNN extra for post-diffusion sequence design (ProteinMPNN + LigandMPNN)
+    run_logged "Installing rc-foundry[mpnn]" \
+        "${CONDA_CMD}" run -n bindmaster_rfd3 \
+        pip install -q "rc-foundry[mpnn]==0.1.9" \
+        || print_warn "rc-foundry[mpnn] install failed — MPNN redesign step may not work"
+
+    # Download RFD3 weights to a shared location inside BindMaster
+    mkdir -p "${FOUNDRY_WEIGHTS_DIR}"
+    if [[ -n "$(ls -A "${FOUNDRY_WEIGHTS_DIR}" 2>/dev/null)" ]]; then
+        print_ok "Foundry weights dir already populated at ${FOUNDRY_WEIGHTS_DIR}"
+    else
+        run_logged "Downloading RFD3 weights (several GB)" \
+            "${CONDA_CMD}" run -n bindmaster_rfd3 \
+            foundry install rfd3 --checkpoint-dir "${FOUNDRY_WEIGHTS_DIR}" \
+            || print_warn "RFD3 weight download failed — retry: conda run -n bindmaster_rfd3 foundry install rfd3 --checkpoint-dir ${FOUNDRY_WEIGHTS_DIR}"
+    fi
+
+    # Smoke test: rfd3 CLI help
+    smoke_test "RFD3 CLI check" \
+        "${CONDA_CMD}" run -n bindmaster_rfd3 rfd3 --help \
+        || print_warn "rfd3 CLI smoke test failed — env may need foundry weights first"
+
+    # Shortcut
+    _write_rfd3_shortcut
+
+    print_ok "RFD3 installation complete"
+    print_ok "  Usage: rfd3 design out_dir=./run inputs=config.yaml"
+}
+
+_write_rfd3_shortcut() {
+    mkdir -p "${SHORTCUTS_DIR}"
+    {
+        echo "#!/bin/bash"
+        echo "# RFD3 shortcut — runs 'rfd3 design ...' in the bindmaster_rfd3 env."
+        echo "# With no args: opens an interactive env shell."
+        echo ""
+        echo "CONDA_CMD=\"${CONDA_CMD}\""
+        echo "FOUNDRY_WEIGHTS_DIR=\"${FOUNDRY_WEIGHTS_DIR}\""
+    } > "${SHORTCUTS_DIR}/rfd3"
+    cat >> "${SHORTCUTS_DIR}/rfd3" << 'EOF'
+
+# Surface the weights dir as an env var for Hydra configs that need it
+export FOUNDRY_CHECKPOINT_DIR="${FOUNDRY_WEIGHTS_DIR}"
+
+if [[ $# -eq 0 ]]; then
+    echo "RFD3 environment (bindmaster_rfd3). Weights: ${FOUNDRY_WEIGHTS_DIR}"
+    echo "Examples:"
+    echo "  rfd3 design out_dir=./run inputs=examples/ppi.yaml"
+    echo "  foundry list-installed"
+    exec "${CONDA_CMD}" run --live-stream -n bindmaster_rfd3 bash
+fi
+
+exec "${CONDA_CMD}" run --live-stream -n bindmaster_rfd3 rfd3 "$@"
+EOF
+    chmod +x "${SHORTCUTS_DIR}/rfd3"
+}
+
+# ─── Protein-Hunter ──────────────────────────────────────────────────────────
+# Cho et al. (2025) bioRxiv 10.1101/2025.10.10.681530 — Boltz-2/Chai-1 structure
+# hallucination for protein / cyclic-peptide / small-molecule / DNA / RNA binders.
+# Upstream: github.com/yehlincho/Protein-Hunter
+
+install_protein_hunter() {
+    print_step "Installing Protein-Hunter"
+
+    # Clone at pinned commit
+    if [[ -d "${PROTEIN_HUNTER_DIR}" ]]; then
+        print_ok "Protein-Hunter already cloned at ${PROTEIN_HUNTER_DIR}"
+    else
+        run_logged "Cloning Protein-Hunter" \
+            git clone --depth 50 "${PROTEIN_HUNTER_REPO}" "${PROTEIN_HUNTER_DIR}" \
+            || { print_fail "Failed to clone Protein-Hunter"; return 1; }
+        git -C "${PROTEIN_HUNTER_DIR}" checkout "${PROTEIN_HUNTER_COMMIT}" --quiet \
+            || print_warn "Could not pin Protein-Hunter to ${PROTEIN_HUNTER_COMMIT} — using latest"
+    fi
+
+    # Conda env (Python 3.10 — matches upstream setup.sh)
+    if env_exists bindmaster_protein_hunter; then
+        print_warn "Conda environment 'bindmaster_protein_hunter' already exists — skipping creation."
+    else
+        print_step "Creating bindmaster_protein_hunter conda environment (Python 3.10)"
+        run_logged "Creating bindmaster_protein_hunter env" \
+            "${CONDA_CMD}" create -n bindmaster_protein_hunter -y python=3.10 pip \
+            -c conda-forge \
+            || { print_fail "Failed to create bindmaster_protein_hunter env"; return 1; }
+    fi
+
+    # Install PyTorch (matches upstream setup.sh expectations: torch>=2.2 with CUDA)
+    run_logged "Installing PyTorch (CUDA 12.1)" \
+        "${CONDA_CMD}" run -n bindmaster_protein_hunter \
+        pip install -q "torch>=2.2" "torchvision" "torchaudio" --index-url https://download.pytorch.org/whl/cu121 \
+        || { print_fail "Failed to install PyTorch"; return 1; }
+
+    # Install vendored Boltz_PH + upstream deps
+    run_logged "Installing Protein-Hunter Python deps" \
+        "${CONDA_CMD}" run -n bindmaster_protein_hunter bash -c \
+        "cd '${PROTEIN_HUNTER_DIR}' && pip install -q -e './boltz_ph' && pip install -q matplotlib seaborn prody py3Dmol pyyaml ml_collections biopython modelcif jaxtyping pandera logmd==0.1.45 pyrosetta-installer" \
+        || print_warn "Some Protein-Hunter deps failed — may need manual follow-up"
+
+    # PyRosetta (required by boltz_ph.design at import time)
+    run_logged "Installing PyRosetta" \
+        "${CONDA_CMD}" run -n bindmaster_protein_hunter python -c \
+        "from pyrosetta_installer import download_pyrosetta; download_pyrosetta(serialization=True, skip_if_installed=True)" \
+        || print_warn "PyRosetta install failed — Protein-Hunter design will not work until this is fixed"
+
+    # Install chai-lab (from sokrypton fork pinned by Protein-Hunter upstream)
+    run_logged "Installing Chai-1 (sokrypton fork)" \
+        "${CONDA_CMD}" run -n bindmaster_protein_hunter \
+        pip install -q "git+https://github.com/sokrypton/chai-lab.git" \
+        || print_warn "chai-lab install failed — only the Boltz-2 edition of Protein-Hunter will work"
+
+    # Weight sharing: reuse LigandMPNN weights from RFAA install if present.
+    # Protein-Hunter vendors LigandMPNN source in-repo but expects model_params/ locally.
+    local ph_mpnn_dir="${PROTEIN_HUNTER_DIR}/LigandMPNN/model_params"
+    if [[ -d "${LIGANDMPNN_DIR}/model_params" && ! -d "${ph_mpnn_dir}" ]]; then
+        mkdir -p "$(dirname "${ph_mpnn_dir}")"
+        ln -sfn "${LIGANDMPNN_DIR}/model_params" "${ph_mpnn_dir}"
+        print_ok "LigandMPNN weights → ${ph_mpnn_dir} (symlink to RFAA install)"
+    elif [[ ! -d "${ph_mpnn_dir}" ]]; then
+        if [[ -f "${PROTEIN_HUNTER_DIR}/LigandMPNN/get_model_params.sh" ]]; then
+            run_logged "Downloading LigandMPNN weights (Protein-Hunter)" \
+                bash -c "cd '${PROTEIN_HUNTER_DIR}/LigandMPNN' && bash get_model_params.sh ./model_params" \
+                || print_warn "LigandMPNN weights download failed — download manually"
+        fi
+    fi
+
+    # Boltz-2 weight cache (~/.boltz) — shared with Mosaic if Mosaic populates it first.
+    # Protein-Hunter pulls Boltz-2 weights on first run; we don't pre-download here.
+
+    # Smoke test: import the vendored boltz package (installed from ./boltz_ph)
+    smoke_test "Protein-Hunter import check" \
+        "${CONDA_CMD}" run -n bindmaster_protein_hunter bash -c \
+        "cd '${PROTEIN_HUNTER_DIR}' && python -c 'import boltz; print(\"boltz_ph import OK\")'" \
+        || print_warn "Protein-Hunter import failed — env may still work after first-use weight download"
+
+    # Shortcut
+    _write_protein_hunter_shortcut
+
+    print_ok "Protein-Hunter installation complete"
+    print_ok "  Usage: protein-hunter  (opens env shell)"
+    print_ok "         python boltz_ph/design.py --protein_seqs TARGET --num_designs N --name JOBNAME  (direct)"
+}
+
+_write_protein_hunter_shortcut() {
+    mkdir -p "${SHORTCUTS_DIR}"
+    {
+        echo "#!/bin/bash"
+        echo "# Protein-Hunter shortcut — activates bindmaster_protein_hunter conda env"
+        echo "# and opens an interactive shell in the Protein-Hunter directory."
+        echo ""
+        echo "PROTEIN_HUNTER_DIR=\"${PROTEIN_HUNTER_DIR}\""
+        echo "CONDA_CMD=\"${CONDA_CMD}\""
+    } > "${SHORTCUTS_DIR}/protein-hunter"
+    cat >> "${SHORTCUTS_DIR}/protein-hunter" << 'EOF'
+
+cd "${PROTEIN_HUNTER_DIR}"
+
+echo "Protein-Hunter environment (bindmaster_protein_hunter) activated."
+echo "Working directory: ${PROTEIN_HUNTER_DIR}"
+echo "Minimal protein binder run:"
+echo "  python boltz_ph/design.py --num_designs 50 --num_cycles 7 \\"
+echo "      --protein_seqs <TARGET_AA> --msa_mode mmseqs --gpu_id 0 \\"
+echo "      --name JOBNAME --min_protein_length 90 --max_protein_length 150 \\"
+echo "      --high_iptm_threshold 0.7 --percent_X 80"
+echo ""
+echo "Modalities (flags on design.py):"
+echo "  --cyclic                  cyclic peptide binder"
+echo "  --ligand_ccd CCD          small-molecule binder (CCD code)"
+echo "  --ligand_smiles 'SMILES'  small-molecule binder (SMILES)"
+echo "  --nucleic_seq SEQ --nucleic_type dna|rna    DNA / RNA binder"
+echo ""
+
+exec "${CONDA_CMD}" run --live-stream -n bindmaster_protein_hunter bash
+EOF
+    chmod +x "${SHORTCUTS_DIR}/protein-hunter"
+}
+
 # ─── Uninstall ─────────────────────────────────────────────────────────────────
 
 uninstall_tool() {
@@ -1791,7 +2034,8 @@ uninstall_tool() {
             print_step "Uninstalling Evaluator"
             env_exists binder-eval && run_logged "Removing binder-eval conda env" \
                 "${CONDA_CMD}" env remove -n binder-eval -y
-            env_exists binder-eval-af2 && run_logged "Removing binder-eval-af2 conda env" \
+            # Legacy binder-eval-af2 env (from pre-refactor installs): remove if present
+            env_exists binder-eval-af2 && run_logged "Removing legacy binder-eval-af2 conda env" \
                 "${CONDA_CMD}" env remove -n binder-eval-af2 -y
             rm -f "${EVALUATOR_DIR}/envs/mosaic_venv_path"
             rm -f "${SHORTCUTS_DIR}/evaluate"
@@ -1824,6 +2068,23 @@ uninstall_tool() {
             [[ -d "${PROTEINA_COMPLEXA_DIR}" ]] && { rm -rf "${PROTEINA_COMPLEXA_DIR}"; print_ok "Removed ${PROTEINA_COMPLEXA_DIR}"; }
             print_ok "Proteina-Complexa uninstalled"
             ;;
+        protein-hunter|protein_hunter|phunter)
+            print_step "Uninstalling Protein-Hunter"
+            env_exists bindmaster_protein_hunter && run_logged "Removing bindmaster_protein_hunter env" \
+                "${CONDA_CMD}" env remove -n bindmaster_protein_hunter -y
+            rm -f "${SHORTCUTS_DIR}/protein-hunter"
+            [[ -d "${PROTEIN_HUNTER_DIR}" ]] && { rm -rf "${PROTEIN_HUNTER_DIR}"; print_ok "Removed ${PROTEIN_HUNTER_DIR}"; }
+            print_ok "Protein-Hunter uninstalled"
+            ;;
+        rfd3|foundry)
+            print_step "Uninstalling RFD3"
+            env_exists bindmaster_rfd3 && run_logged "Removing bindmaster_rfd3 env" \
+                "${CONDA_CMD}" env remove -n bindmaster_rfd3 -y
+            rm -f "${SHORTCUTS_DIR}/rfd3"
+            [[ -d "${FOUNDRY_WEIGHTS_DIR}" ]] && { rm -rf "${FOUNDRY_WEIGHTS_DIR}"; print_ok "Removed ${FOUNDRY_WEIGHTS_DIR}"; }
+            [[ -d "${FOUNDRY_DIR}" ]] && { rm -rf "${FOUNDRY_DIR}"; print_ok "Removed ${FOUNDRY_DIR}"; }
+            print_ok "RFD3 uninstalled"
+            ;;
         *)
             print_fail "Unknown tool: ${tool}"
             return 1
@@ -1882,6 +2143,8 @@ main() {
         [[ "${DO_RFAA}"      == true ]] && { uninstall_tool rfaa      || failed_uninstalls+=("RFAA"); }
         [[ "${DO_PXDESIGN}"  == true ]] && { uninstall_tool pxdesign  || failed_uninstalls+=("PXDesign"); }
         [[ "${DO_PROTEINA_COMPLEXA}" == true ]] && { uninstall_tool proteina-complexa || failed_uninstalls+=("Proteina-Complexa"); }
+        [[ "${DO_PROTEIN_HUNTER}" == true ]] && { uninstall_tool protein-hunter || failed_uninstalls+=("Protein-Hunter"); }
+        [[ "${DO_RFD3}"      == true ]] && { uninstall_tool rfd3      || failed_uninstalls+=("RFD3"); }
 
         # Offer to remove local Miniforge when all tools are uninstalled
         if [[ "${DO_BINDCRAFT}" == true && "${DO_BOLTZGEN}" == true && \
@@ -1916,6 +2179,8 @@ main() {
     [[ "${DO_RFAA}"      == true ]] && (( total++ ))
     [[ "${DO_PXDESIGN}"  == true ]] && (( total++ ))
     [[ "${DO_PROTEINA_COMPLEXA}" == true ]] && (( total++ ))
+    [[ "${DO_PROTEIN_HUNTER}" == true ]] && (( total++ ))
+    [[ "${DO_RFD3}"      == true ]] && (( total++ ))
 
     local failed_tools=()
     FAILED_EXAMPLES=()   # populated by install functions on example failure
@@ -1924,9 +2189,11 @@ main() {
     [[ "${DO_BOLTZGEN}"  == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] BoltzGen${RESET}";  install_boltzgen  || failed_tools+=("BoltzGen");  }
     [[ "${DO_MOSAIC}"    == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] Mosaic${RESET}";    install_mosaic    || failed_tools+=("Mosaic");    }
     [[ "${DO_EVALUATOR}" == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] Evaluator${RESET}"; install_evaluator || failed_tools+=("Evaluator"); }
-    [[ "${DO_RFAA}"      == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] RFAA${RESET}";      install_rfaa      || failed_tools+=("RFAA"); }
+    [[ "${DO_RFAA}"      == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] RFAA (legacy)${RESET}"; install_rfaa || failed_tools+=("RFAA"); }
+    [[ "${DO_RFD3}"      == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] RFD3${RESET}";      install_rfd3      || failed_tools+=("RFD3"); }
     [[ "${DO_PXDESIGN}"  == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] PXDesign${RESET}";  install_pxdesign  || failed_tools+=("PXDesign"); }
     [[ "${DO_PROTEINA_COMPLEXA}" == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] Proteina-Complexa${RESET}"; install_proteina_complexa || failed_tools+=("Proteina-Complexa"); }
+    [[ "${DO_PROTEIN_HUNTER}" == true ]] && { (( step++ )); echo -e "\n${BOLD}[${step}/${total}] Protein-Hunter${RESET}"; install_protein_hunter || failed_tools+=("Protein-Hunter"); }
 
     echo ""
     echo -e "${BOLD}=== Installation Summary ===${RESET}"
@@ -1951,9 +2218,11 @@ main() {
     [[ "${DO_BOLTZGEN}"  == true ]] && echo -e "  ${GREEN}boltzgen${RESET}   — open BoltzGen shell"
     [[ "${DO_MOSAIC}"    == true ]] && echo -e "  ${GREEN}mosaic${RESET}     — open Mosaic shell"
     [[ "${DO_EVALUATOR}" == true ]] && echo -e "  ${GREEN}evaluate${RESET}   — launch evaluation wizard"
-    [[ "${DO_RFAA}"      == true ]] && echo -e "  ${GREEN}rfaa${RESET}       — open RFAA shell"
+    [[ "${DO_RFAA}"      == true ]] && echo -e "  ${YELLOW}rfaa${RESET}       — open RFAA shell ${YELLOW}(legacy)${RESET}"
+    [[ "${DO_RFD3}"      == true ]] && echo -e "  ${GREEN}rfd3${RESET}       — run RFD3 design / open env shell"
     [[ "${DO_PXDESIGN}"  == true ]] && echo -e "  ${GREEN}pxdesign${RESET}   — open PXDesign shell"
     [[ "${DO_PROTEINA_COMPLEXA}" == true ]] && echo -e "  ${GREEN}complexa${RESET}   — open Proteina-Complexa shell"
+    [[ "${DO_PROTEIN_HUNTER}" == true ]] && echo -e "  ${GREEN}protein-hunter${RESET} — open Protein-Hunter shell"
     # Add shortcuts dir to PATH in .bashrc (idempotent)
     local path_line="export PATH=\"${SHORTCUTS_DIR}:\$PATH\""
     if ! grep -qF "${SHORTCUTS_DIR}" "${HOME}/.bashrc" 2>/dev/null; then
diff --git a/install/install_aarch.sh b/install/install_aarch.sh
index 51c066d..a68c718 100755
--- a/install/install_aarch.sh
+++ b/install/install_aarch.sh
@@ -546,7 +546,7 @@ select_tools_interactive() {
         "Binder design via AlphaFold2 (conda, Python 3.10)"
         "Structure generation with Boltz-1 (conda, Python 3.12)"
         "JAX-based protein design with Marimo notebooks (uv venv)"
-        "Evaluate binders: refold with Boltz-2 + AF2, ranked report (requires Mosaic)"
+        "Evaluate binders: refold with Boltz-2 (+ Protenix, AF3 on DGX Spark), ranked report (requires Mosaic)"
         "All-atom diffusion + LigandMPNN (${RED}NOT SUPPORTED on aarch64${RESET} — DGL lacks CUDA)"
         "Protenix-based de novo binder design (conda)"
     )
@@ -1326,23 +1326,9 @@ install_evaluator() {
         "${CONDA_CMD}" run -n binder-eval pip install -q -e "${EVALUATOR_DIR}[report]" \
         || { print_fail "Failed to install binder-compare into binder-eval"; return 1; }
 
-    # binder-eval-af2 conda env (AF2 refolding via ColabDesign)
-    print_step "Creating binder-eval-af2 conda environment (Python 3.10)"
-    if env_exists binder-eval-af2; then
-        print_warn "Conda environment 'binder-eval-af2' already exists — skipping creation."
-    else
-        run_logged "Creating binder-eval-af2 conda env" \
-            "${CONDA_CMD}" env create -f "${EVALUATOR_DIR}/envs/binder-eval-af2.yml" -y \
-            || { print_fail "Failed to create binder-eval-af2 conda env"; return 1; }
-    fi
-    run_logged "Installing ColabDesign + binder-compare into binder-eval-af2" \
-        "${CONDA_CMD}" run -n binder-eval-af2 pip install -q "colabdesign @ git+https://github.com/sokrypton/ColabDesign.git" -e "${EVALUATOR_DIR}[af2]" \
-        || { print_fail "Failed to install packages into binder-eval-af2"; return 1; }
-
-    # JAX CUDA plugin — ColabDesign/AF2 uses JAX; on aarch64 the default jaxlib is CPU-only.
-    run_logged "Installing JAX CUDA plugin into binder-eval-af2" \
-        "${CONDA_CMD}" run -n binder-eval-af2 pip install -q "jax[cuda]" \
-        || { print_fail "Failed to install JAX CUDA plugin"; return 1; }
+    # (AF2 refolding was removed in the AF3/Protenix refactor; the
+    #  binder-eval-af2 env is no longer created. AF3 refolding on DGX
+    #  Spark / aarch64 is installed separately via `install_af3`.)
 
     # Smoke test
     smoke_test "binder-compare --help" \
@@ -1355,7 +1341,6 @@ install_evaluator() {
     print_ok "Shortcut installed at ${SHORTCUTS_DIR}/evaluate"
 
     print_ok "Evaluator installation complete"
-    print_ok "  AF2 weights (~4 GB) must be at \$AF2_DATA_DIR — see Evaluator/docs/pipeline_reference.md"
 }
 
 _write_evaluator_shortcut() {
@@ -1641,6 +1626,14 @@ PATCHEOF
         "${CONDA_CMD}" run -n bindmaster_pxdesign python -c "import torch; print('PXDesign env OK')" \
         || return 1
 
+    # Install binder-compare into the PXDesign env so Protenix refolding
+    # (Part J) can run via `conda run -n bindmaster_pxdesign binder-compare refold-protenix`.
+    if [[ -d "${EVALUATOR_DIR}" ]]; then
+        run_logged "Installing binder-compare into bindmaster_pxdesign (for Protenix refold)" \
+            "${CONDA_CMD}" run -n bindmaster_pxdesign pip install -q -e "${EVALUATOR_DIR}[report]" \
+            || print_warn "binder-compare install into bindmaster_pxdesign failed — Protenix refolding will be unavailable"
+    fi
+
     # Shortcut
     mkdir -p "${SHORTCUTS_DIR}"
     cat > "${SHORTCUTS_DIR}/pxdesign" << PXDEOF
@@ -1697,7 +1690,8 @@ uninstall_tool() {
             print_step "Uninstalling Evaluator"
             env_exists binder-eval && run_logged "Removing binder-eval conda env" \
                 "${CONDA_CMD}" env remove -n binder-eval -y
-            env_exists binder-eval-af2 && run_logged "Removing binder-eval-af2 conda env" \
+            # Legacy binder-eval-af2 env (from pre-refactor installs): remove if present
+            env_exists binder-eval-af2 && run_logged "Removing legacy binder-eval-af2 conda env" \
                 "${CONDA_CMD}" env remove -n binder-eval-af2 -y
             rm -f "${EVALUATOR_DIR}/envs/mosaic_venv_path"
             rm -f "${SHORTCUTS_DIR}/evaluate"