elkins-lab/diff-biophys

🧬 Diff-Biophys: Differentiable Biophysics for the AI Era

Tests | PyPI version | Python 3.10+ | License: MIT | codecov | JAX | Ruff | Checked with mypy | OpenSSF Best Practices

Diff-Biophys is a high-performance Python library for differentiable biophysical modeling. Built on JAX, it re-implements core structural biology and spectroscopy observables (SAXS, NMR, CD) as hardware-accelerated, auto-differentiable kernels.

Documentation Website | Use Cases | Tutorials


🎯 Vision

Diff-Biophys aims to bridge the gap between static structural models and experimental solution-state data by making the predicted observables themselves differentiable. This "differentiable bridge" allows researchers to:

  1. Optimize protein structures directly against experimental spectra via gradient descent.
  2. Train machine learning models using physics-informed loss functions.
  3. Accelerate large-scale biophysical simulations on GPUs and TPUs.

🌉 The Interdisciplinary Bridge

diff-biophys sits at the intersection of Machine Learning and Structural Biology. If you find the terminology confusing, please read our Concepts & Context Guide! It acts as a "Rosetta Stone" to explain:

  • For ML Engineers: What SAXS and NMR are, and why traditional physics code can't be used in PyTorch/JAX loss functions.
  • For Biologists: What automatic differentiation is, why gradient-based optimization in JAX is used instead of traditional Monte Carlo/Simulated Annealing sampling, and how it enables optimization on GPUs.

📚 Interactive Tutorials

Experience Diff-Biophys directly in your browser with our Colab tutorials:

| Tutorial | Audience | Description | Action |
|---|---|---|---|
| 🎓 Hello, Gradient Descent! | Undergrad (any) | No biology needed. Learn what a gradient is, how gradient descent works, and how JAX computes gradients automatically, then fit a real Karplus curve. | Open In Colab |
| 🔬 NMR Fundamentals | Undergrad (bio/chem) | Chemical shifts, the Karplus equation, RDCs, and the magic angle, all computed differentiably and connected back to protein backbone torsion angles. | Open In Colab |
| 🧬 Protein Folding | Undergrad (bio/chem) | Use the NeRF algorithm to build 3D structures from angles and "fold" a random coil into an α-helix using RDC and Chemical Shift gradients. | Open In Colab |
| 🌉 Hybrid Refinement | Graduate / researcher | Use an energy-minimized structure from synth-pdb as a starting point and refine it against experimental gradients to rescue a decoy. | Open In Colab |
| 💡 CD Spectroscopy | Undergrad (bio/chem) | Build an α-helix from scratch, simulate its CD spectrum via the DeVoe model, watch it change as the helix unwinds, and compute the gradient of [θ]₂₂₂ w.r.t. atomic positions. | Open In Colab |
| 🧪 Diff-Biophys Showcase | Graduate / researcher | A complete overview of the JAX-accelerated SAXS and NMR kernels. | Open In Colab |
| ⚗️ Structure Refinement Lab | Graduate / researcher | Use gradient descent to optimize protein structures against experimental SAXS profiles. | Open In Colab |

⛰️ Real-World Considerations: The Rugged Landscape

Protein conformational space is notoriously rugged, filled with countless local minima ("traps") that can catch a simple gradient descent optimizer. It is important to be realistic about where differentiable physics excels and where it has limitations.

1. The Local Minimum Problem

Because gradient descent (the core of this library) follows the locally steepest downhill direction, it tends to slide into the nearest "valley" rather than the deepest one. If your starting structure is very far from the correct fold (e.g., a random string of atoms), the optimizer may get stuck in a physically impossible or non-native local minimum.
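
To make the local minimum problem concrete, here is a minimal JAX sketch on a one-dimensional toy energy. The tilted double-well function, the learning rate, and the names `energy` and `descend` are illustrative assumptions, not part of the diff-biophys API.

```python
import jax
import jax.numpy as jnp


def energy(x):
    # Tilted double well: a deep minimum near x = -1.04, a shallow one near x = 0.96.
    return (x**2 - 1.0) ** 2 + 0.3 * x


grad_energy = jax.jit(jax.grad(energy))


def descend(x0, lr=0.02, steps=500):
    """Plain gradient descent from x0 with a fixed learning rate."""
    x = jnp.asarray(x0, dtype=jnp.float32)
    for _ in range(steps):
        x = x - lr * grad_energy(x)
    return x


x_from_left = descend(-0.5)   # ends in the deep (global) minimum
x_from_right = descend(0.5)   # ends in the shallow (local) minimum

print("start -0.5 ->", float(x_from_left), "energy", float(energy(x_from_left)))
print("start +0.5 ->", float(x_from_right), "energy", float(energy(x_from_right)))
```

Both runs use the same optimizer and the same energy; only the starting point differs, and that alone decides which valley the result lands in.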

2. The "Experimental Funnel"

Differentiable physics is most powerful when combined with experimental data (RDCs, SAXS, FSC). These observables act like a "global gravitational pull": because a SAXS curve or a set of RDCs depends on the overall shape and orientation of the molecule, it tends to create a wider and smoother "basin of attraction" than short-range physical terms (like hydrogen bonds), helping the optimizer cross small "bumps" in the landscape.

3. The Recommended Hybrid Workflow

For complex proteins, we do not recommend "folding from scratch." Instead, use a Hybrid Refinement strategy:

  1. Initial State: Use AlphaFold or synth-pdb to generate a "physically plausible" starting structure that is in the correct global neighborhood.
  2. Gradient Descent: Use diff-biophys to refine that structure over the final stretch, adjusting it until its predicted observables match the experimental solution data.

4. What is "Adam"?

In our tutorials, we use an optimizer called Adam (Adaptive Moment Estimation). Think of it as "gradient descent with memory and friction." Compared with a simple ball rolling down a hill, Adam adds two things:

  • Momentum: It keeps a running average of past gradients, which helps it roll over small local dips.
  • Adaptive Steps: It scales each parameter's step by a running estimate of that parameter's gradient magnitude, so it takes smaller steps along steep directions and larger steps along flat ones. This often makes it more robust to the choice of learning rate than plain gradient descent.
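
The two ingredients above can be sketched in a few lines. This is the standard Adam recurrence written by hand to stay dependency-free; the toy loss, the hyperparameters, and the name `adam_step` are assumptions for illustration, and a real workflow would more likely use a ready-made optimizer such as the one in Optax.

```python
import jax
import jax.numpy as jnp


def adam_step(params, grads, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count used for bias correction."""
    m = b1 * m + (1.0 - b1) * grads         # momentum: running mean of gradients
    v = b2 * v + (1.0 - b2) * grads**2      # running mean of squared gradients
    m_hat = m / (1.0 - b1**t)
    v_hat = v / (1.0 - b2**t)
    params = params - lr * m_hat / (jnp.sqrt(v_hat) + eps)   # per-parameter step size
    return params, m, v


def loss(p):
    # Narrow valley: 100x steeper along p[0] than along p[1].
    return 50.0 * p[0] ** 2 + 0.5 * p[1] ** 2


grad_fn = jax.jit(jax.grad(loss))
lr = 0.05

# Adam with lr = 0.05
p_adam = jnp.array([1.0, 1.0])
m = jnp.zeros_like(p_adam)
v = jnp.zeros_like(p_adam)
for t in range(1, 501):
    p_adam, m, v = adam_step(p_adam, grad_fn(p_adam), m, v, t, lr=lr)

# Plain gradient descent with the same lr overshoots along the steep direction
p_gd = jnp.array([1.0, 1.0])
for _ in range(5):
    p_gd = p_gd - lr * grad_fn(p_gd)

print("Adam loss:", float(loss(p_adam)))
print("Plain GD after 5 steps:", p_gd)
```

On this toy problem the same learning rate that makes plain gradient descent blow up along the steep axis is handled by Adam, because the step along each axis is normalized by that axis's own gradient history.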

1. diff_biophys.geometry (Differentiable Structural Engine)

  • NeRF (Natural Extension Reference Frame): Differentiable conversion from internal coordinates ($\phi, \psi, \omega$, bond lengths/angles) to Cartesian XYZ.
  • Kabsch Alignment: Differentiable optimal superposition using SVD.
  • Torsion Analysis: Vectorized calculation of all backbone and side-chain dihedrals.
  • Macroscopic Properties: Differentiable Radius of Gyration ($R_g$) for driving compaction/expansion during structural optimization.
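
As a rough illustration of what "differentiable geometry" means here, the sketch below computes a radius of gyration and a signed dihedral angle in plain JAX and differentiates both with respect to the coordinates. The function names are hypothetical stand-ins and are not taken from the diff_biophys.geometry API.

```python
import jax
import jax.numpy as jnp


def radius_of_gyration(coords):
    """Root-mean-square distance of (N, 3) coordinates from their centroid."""
    centered = coords - coords.mean(axis=0)
    return jnp.sqrt(jnp.mean(jnp.sum(centered**2, axis=-1)))


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (radians) about the p1 -> p2 bond."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1 = jnp.cross(b1, b2)
    n2 = jnp.cross(b2, b3)
    y = jnp.linalg.norm(b2) * jnp.dot(b1, n2)
    x = jnp.dot(n1, n2)
    return jnp.arctan2(y, x)


def dihedral_of(coords):
    return dihedral(coords[0], coords[1], coords[2], coords[3])


coords = jnp.array([[1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0]])

rg = radius_of_gyration(coords)
rg_grad = jax.grad(radius_of_gyration)(coords)   # d(Rg)/d(coords), shape (4, 3)
angle = dihedral_of(coords)                       # +90 degrees for these four points
angle_grad = jax.grad(dihedral_of)(coords)        # d(angle)/d(coords), shape (4, 3)

print(float(rg), float(jnp.degrees(angle)))
```

The gradient of $R_g$ has the closed form $(x_i - \bar{x}) / (N R_g)$, which makes it a convenient sanity check for any autodiff implementation.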

2. diff_biophys.saxs (Differentiable Scattering)

  • Debye Formula: $O(N^2)$ inter-atomic interference summation.
  • Hydration Shell Correction: Excluded-volume solvent subtraction (Fraser et al. 1978).
  • Hardware Acceleration: GPU-optimized pairwise distance kernels via JAX vmap.
  • Use Case: Fitting structure compactness and radius of gyration to solution-state X-ray scattering curves.
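
The Debye summation and a gradient-based fit against a scattering profile can be sketched as follows. This is a simplified stand-in, assuming constant unit form factors, no hydration or excluded-volume correction, and a synthetic "experimental" target; `debye_intensity` is a hypothetical name, not the library's function. Summing over unique pairs (i < j) keeps every distance strictly positive, which avoids the undefined gradient of a norm at zero.

```python
import jax
import jax.numpy as jnp
import numpy as np


def debye_intensity(coords, form_factors, q):
    """I(q) = sum_i f_i^2 + 2 * sum_{i<j} f_i f_j sin(q r_ij) / (q r_ij)."""
    i, j = np.triu_indices(coords.shape[0], k=1)           # unique pairs only
    r = jnp.linalg.norm(coords[i] - coords[j], axis=-1)    # (n_pairs,)
    # jnp.sinc(x) is sin(pi x) / (pi x), hence the division by pi
    kernel = jnp.sinc(q[:, None] * r[None, :] / jnp.pi)    # (n_q, n_pairs)
    pair_weights = form_factors[i] * form_factors[j]
    return jnp.sum(form_factors**2) + 2.0 * (kernel @ pair_weights)


q = jnp.linspace(0.01, 0.5, 50)   # scattering vector magnitudes, 1/Angstrom

# Analytic check: two unit scatterers 3.8 Angstrom apart
pair = jnp.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]])
i_pair = debye_intensity(pair, jnp.ones(2), q)

# Toy refinement: recover a 4-atom arrangement from its own (synthetic) profile
true_coords = jnp.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0],
                         [3.8, 3.8, 0.0], [0.0, 3.8, 3.8]])
f = jnp.ones(4)
target = debye_intensity(true_coords, f, q)


def loss(coords):
    return jnp.mean((debye_intensity(coords, f, q) - target) ** 2)


loss_and_grad = jax.jit(jax.value_and_grad(loss))

coords = 1.1 * true_coords            # start from a 10% expanded structure
initial_loss = float(loss(coords))
for _ in range(50):
    _, g = loss_and_grad(coords)
    # Fixed-length 0.01 Angstrom steps along the negative gradient (toy choice)
    coords = coords - 0.01 * g / jnp.linalg.norm(g)
final_loss = float(loss(coords))

print(initial_loss, final_loss)
```

The pairwise sum is the $O(N^2)$ cost mentioned above; a production kernel would also need q-dependent atomic form factors and the solvent terms.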

3. diff_biophys.nmr (Differentiable Spectroscopy)

  • Residual Dipolar Couplings (RDCs): Differentiable Saupe tensor alignment and coupling calculation. Includes SVD-based tensor fitting.
  • Chemical Shifts: Differentiable ring-current (Johnson-Bovey) shielding and softmax-weighted secondary structure Cα shift predictor.
  • Karplus J-coupling: Parameterizable $^3J$ coupling equation (Vuister & Bax 1993 defaults).
  • Use Case: Refining side-chain packing and domain orientations against high-resolution NMR data.
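
A parameterizable Karplus relation is short enough to show in full. The sketch below uses the form $J(\phi) = A\cos^2\theta + B\cos\theta + C$ with $\theta = \phi - 60°$, and the coefficients commonly quoted from Vuister & Bax (1993) for $^3J(H^N, H^\alpha)$. Whether these match the library's defaults exactly is an assumption, and `karplus_j` is a hypothetical name.

```python
import jax
import jax.numpy as jnp

# Coefficients in Hz, as commonly quoted from Vuister & Bax (1993).
# Assumed for illustration; not read from diff_biophys.
A, B, C = 6.51, -1.76, 1.60


def karplus_j(phi, a=A, b=B, c=C):
    """3J coupling (Hz) for a backbone torsion phi given in radians."""
    theta = phi - jnp.deg2rad(60.0)
    return a * jnp.cos(theta) ** 2 + b * jnp.cos(theta) + c


phi = jnp.deg2rad(jnp.array([-60.0, -120.0, 60.0]))

j_values = jax.vmap(karplus_j)(phi)             # roughly [4.11, 9.87, 6.35] Hz
dj_dphi = jax.vmap(jax.grad(karplus_j))(phi)    # sensitivity to phi, Hz per radian

# Sensitivity to the Karplus parameters themselves at phi = -60 degrees
dj_dparams = jax.grad(karplus_j, argnums=(1, 2, 3))(phi[0], A, B, C)

print(j_values)
print(dj_dphi)
print(dj_dparams)
```

Because the coefficients are ordinary function arguments, the same gradient machinery can refine a torsion angle against measured couplings or fit the coefficients to data.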

4. diff_biophys.cd (Differentiable Dichroism)

  • Matrix-Method Simulation: Differentiable simulation of peptide bond transition dipole coupling via DeVoe theory.
  • Status: ✅ Implemented. Supports frequency-dependent coupled-oscillator response.

⚡ Technical Architecture

  • Backend: JAX (XLA-compiled) — supports CPU, GPU, and TPU.
  • Parallelism: Native support for vmap (vectorization across ensembles/trajectories) and pmap (multi-device execution).
  • Differentiability: Forward and reverse-mode autodiff through all kernels.
  • Interoperability: JAX arrays are compatible with NumPy and can be exchanged with PyTorch via dlpack (user-managed conversion).
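
The following sketch shows how these JAX transformations compose on a stand-in observable. The `end_to_end` function is a hypothetical placeholder for any per-structure kernel, and the random "ensemble" is synthetic; `pmap` and dlpack exchange are omitted because they depend on the available devices and on PyTorch being installed.

```python
import jax
import jax.numpy as jnp


def end_to_end(coords):
    """Distance between the first and last atom of one (N, 3) conformer."""
    return jnp.linalg.norm(coords[-1] - coords[0])


key = jax.random.PRNGKey(0)
ensemble = jax.random.normal(key, (8, 20, 3))   # 8 conformers, 20 atoms each

# vmap maps the single-structure kernel over the ensemble axis; jit compiles it via XLA
batched = jax.jit(jax.vmap(end_to_end))
distances = batched(ensemble)                    # shape (8,)


def ensemble_average(conformers):
    return jnp.mean(jax.vmap(end_to_end)(conformers))


# Reverse-mode gradient of the ensemble average w.r.t. every coordinate
avg_grad = jax.grad(ensemble_average)(ensemble)  # shape (8, 20, 3)

# Forward- and reverse-mode Jacobians of the same kernel agree
jac_fwd = jax.jacfwd(end_to_end)(ensemble[0])
jac_rev = jax.jacrev(end_to_end)(ensemble[0])

print(distances.shape, avg_grad.shape)
```

Reverse mode is usually the better fit for a scalar loss over many coordinates, while forward mode tends to pay off when there are few inputs and many outputs.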

🧪 Scientific Validation

Diff-Biophys is validated against foundational biophysical principles and analytical solutions to ensure physical realism:

  • SAXS Guinier Approximation: Recovers correct $R_g$ from low-q scattering slopes (test_saxs_guinier.py).
  • SAXS Analytic Sphere: Reproduces the theoretical scattering profile of a uniform sphere (test_science_saxs_sphere.py).
  • SAXS Kratky Topology: Correctly distinguishes between globular and unfolded topologies via Kratky plot signatures (test_science_saxs_kratky.py).
  • SAXS $P(r)$ Distribution: Matches analytical pair-distance distribution for spheres (Guinier 1939) with $>0.98$ correlation (test_science_saxs_pr.py).
  • NMR RDC Physics: Verifies $1/r^3$ distance scaling and $(3\cos^2\theta - 1)$ angular dependence, including zero coupling at the Magic Angle (test_science_rdc_angular.py).
  • NMR Ring Currents: Reproduces shielding/deshielding cones of the Johnson-Bovey model (test_science_ring_currents.py).
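
To show the flavor of such a check, here is a sketch of the RDC angular and distance test on a simplified model. It assumes an axially symmetric alignment along the z axis and a placeholder prefactor of 1 (a real calculation uses a full Saupe tensor and physical constants); `dipolar_coupling` and `bond_vector` are hypothetical names, and this is not the code in test_science_rdc_angular.py.

```python
import jax
import jax.numpy as jnp


def dipolar_coupling(bond, prefactor=1.0):
    """Coupling proportional to (3 cos^2(theta) - 1) / r^3, theta measured from z."""
    r = jnp.linalg.norm(bond)
    cos_theta = bond[2] / r
    return prefactor * (3.0 * cos_theta**2 - 1.0) / r**3


def bond_vector(theta, r=1.0):
    """Vector of length r in the x-z plane at polar angle theta (radians)."""
    return r * jnp.array([jnp.sin(theta), 0.0, jnp.cos(theta)])


magic_angle = jnp.arccos(1.0 / jnp.sqrt(3.0))    # about 54.7 degrees

d_parallel = dipolar_coupling(bond_vector(0.0))          # maximal coupling
d_magic = dipolar_coupling(bond_vector(magic_angle))     # vanishes
d_far = dipolar_coupling(bond_vector(0.0, r=2.0))        # 1/8 of d_parallel

# The coupling is zero at the magic angle, but its gradient is not,
# so an optimizer still receives a signal there.
grad_at_magic = jax.grad(dipolar_coupling)(bond_vector(magic_angle))

print(float(d_parallel), float(d_magic), float(d_far))
```

Doubling the distance divides the coupling by eight, and the nonzero gradient at the magic angle illustrates why a vanishing observable does not imply a vanishing restraint.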

🚀 Roadmap

Phase 1: Foundations (Alpha)

  • Differentiable NeRF and Kabsch alignment.
  • GPU-accelerated Debye formula for SAXS with hydration shell correction.
  • Unit tests verifying parity with synth-pdb NumPy implementations.

Phase 2: NMR & Spectroscopy (Beta)

  • Differentiable RDC and Karplus kernels.
  • Differentiable Johnson-Bovey ring current model.
  • Integration with synth-nmr parameter libraries (optional dependency).

Phase 3: Integration & Optimization (v1.0)

  • Full CD matrix-method implementation (DeVoe theory).
  • Example notebooks for structure refinement via gradient descent.
  • Plugin for torch-based AI models to use biophysical loss functions (torch_interop).
  • Full support for BinaryCIF streaming.

Phase 4: Advanced Ensembles & Dynamics (Future)

  • Differentiable Side-chain Packing: Move beyond backbone-only refinement.
  • Ensemble Reweighting: JAX-optimized Maximum Entropy fitting for large IDP trajectories.
  • Multiplexed Loss Functions: Jointly optimize structures against SAXS, NMR, and CD simultaneously.
  • Experimental Noise Modeling: Differentiable Bayesian treatment of experimental uncertainty.

📂 Repository Structure

diff-biophys/
├── diff_biophys/          # Core package
│   ├── geometry/          # NeRF, Kabsch, Torsions
│   ├── saxs/              # Debye kernels, form factors
│   ├── nmr/               # RDCs, Karplus, Ring Currents, Chemical Shifts
│   ├── cd/                # CD simulation (DeVoe Matrix Method)
│   └── ensemble.py        # Ensemble averaging API
├── tests/                 # Parity, gradient, and scientific validation checks
├── examples/interactive_tutorials/  # Jupyter notebooks (Refinement Lab)
├── docs/                  # API and Theory
├── pyproject.toml         # Modern build config
└── README.md

🚀 Installation

pip install diff-biophys

For GPU support (CUDA):

pip install "jax[cuda12]" diff-biophys
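
After installing, a quick way to see which backend JAX picked up is the short check below. It uses only JAX itself and makes no assumption about the diff-biophys API.

```python
import jax
import jax.numpy as jnp

backend = jax.default_backend()   # for example "cpu" or "gpu"
devices = jax.devices()           # list of devices visible to JAX

# Run a tiny jitted computation to confirm that compilation works
total = jax.jit(lambda x: jnp.sum(x**2))(jnp.arange(4.0))

print("backend:", backend)
print("devices:", devices)
print("sum of squares:", float(total))
```

If the backend reports "cpu" on a machine with a GPU, the CUDA-enabled JAX wheel is usually what is missing.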

🤝 Contributing

Contributions are welcome from both ML and structural biology communities! Please open an issue or pull request on GitHub. Run pre-commit run --all-files before submitting.

🔗 Related Projects

diff-biophys is the differentiable engine powering the higher-level tools in this ecosystem:

  • synth-pdb — Synthetic structure generation (uses NumPy implementations)
  • synth-nmr — NMR observables (optional dependency)
  • synth-saxs — SAXS profile simulator
  • diff-fret — Differentiable FRET (new)
  • diff-hdx — Differentiable HDX-MS (new)
  • diff-epr — Differentiable EPR/DEER (new)
  • diff-ensemble — IDP ensemble VAE (depends on diff-biophys)
  • torsion-tuner — GNN refinement (depends on diff-biophys)
  • resonance-flow — NMR-guided folding (depends on diff-biophys)

🎓 Learning JAX

Since Diff-Biophys is built entirely on JAX, we highly recommend these resources to get the most out of the library:

  • JAX 101: The official "must-read" introduction to the JAX functional mindset.
  • 🔪 JAX - The Sharp Bits: A mandatory guide on common pitfalls (like immutable arrays and pure functions).
  • JAX, M.D.: The landmark paper on differentiable physics that inspired much of this work.
  • Optax Documentation: Learn how to use advanced optimizers (like Adam) that we use in our tutorials.
  • Equinox: A great library for those who prefer a more PyTorch-like, object-oriented style in JAX.

⚖️ License

MIT License — see LICENSE for details.

📖 Citation

@software{diff_biophys,
  author  = {Elkins, George},
  title   = {diff-biophys: Differentiable biophysics kernels for JAX},
  year    = {2026},
  url     = {https://github.com/elkins-lab/diff-biophys},
  version = {0.1.6}
}