Port to nanobind and restructure for performance#71

Merged
DomFijan merged 253 commits into main from nanobind4
Feb 12, 2026

@janbridley janbridley commented Nov 22, 2025

Description

This is a combination rewrite/refactor that aims to (1) replace pybind11 code with nanobind and (2) restructure data types and the code layout to interface better with cymmetry (upcoming library & publication) and new optimizer developments.

The biggest change is the separation of the Python interface from the C++ source code: bindings are consolidated into separate files, and much of the library is now available as headers for use in other tools. The interface surface area is also smaller, with nanobind's ndarray replacing the Python exports of the Vec3 and Quaternion types from previous versions. Finally, the data classes (PGOPStore and BOOSOPStore) have been removed. These adapters were added in previous versions for performance reasons, but nanobind's copy-free array translation makes them unnecessary. These changes, combined with a few other optimizations, make the benchmarks below roughly 3.5-5x faster in the first configuration and roughly 6-11x faster with mode="boo".

Changed

  • Project now uses C++20 (primarily for std::span). This is great for ergonomics and makes a future Eigen3 port much easier (Eigen::Map is very similar).
  • Nanobind exports replace pybind11
  • Optimizers are now header-only
  • Locality code is now header-only
  • Vec3 and Quaternion are now header-only
  • BondOrder is now header-only
  • Metrics and Utils (excluding QlmEval) are now header-only
  • pgop.py::BOOSOP code is now in a separate file, boosop.py.
  • Many std::vector<std::vector<...>> are now vectors of pointers, allowing copy- and move-free access to Python data. Matrix elements are accessed with std::span and cast to statically allocated types for performance.
  • py::array instances are now replaced with std::vector or type* pointers
  • The implied rotation matrix type (std::vector<double>) is now the explicit type RotationMatrix, backed by std::array<double, 9>
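
As an illustration of the last few items, the sketch below mimics the layout in Python rather than C++. A flat buffer of doubles stands in for data owned by Python, a memoryview slice plays the role of std::span (a non-copying window), and each matrix is treated as exactly nine values, like std::array<double, 9>. The buffer contents, the helper names matrix_view and rotate, and the row-major ordering are assumptions made for the example, not taken from the library.

```python
from array import array

# Hypothetical flat buffer holding two row-major 3x3 rotation matrices
# back to back (18 doubles), standing in for data owned by Python.
flat = array("d", [
    1.0, 0.0, 0.0,  0.0, 1.0, 0.0,  0.0, 0.0, 1.0,   # identity
    0.0, -1.0, 0.0,  1.0, 0.0, 0.0,  0.0, 0.0, 1.0,  # 90 degrees about z
])

MATRIX_SIZE = 9  # mirrors std::array<double, 9>


def matrix_view(buffer, index):
    """Return a non-copying view of the index-th 3x3 matrix in buffer."""
    start = index * MATRIX_SIZE
    return memoryview(buffer)[start:start + MATRIX_SIZE]


def rotate(matrix, vec):
    """Apply a row-major 3x3 matrix (9 values) to a 3-vector."""
    if len(matrix) != MATRIX_SIZE:
        raise ValueError("expected exactly 9 matrix elements")
    return tuple(
        sum(matrix[3 * row + col] * vec[col] for col in range(3))
        for row in range(3)
    )


view = matrix_view(flat, 1)
rotated = rotate(view, (1.0, 0.0, 0.0))

# A write to the underlying buffer is visible through the view,
# which shows that the view did not copy the data.
flat[9] = 2.0
```

The fixed size check in rotate is the Python stand-in for what a fixed-size array type gives at compile time: a rotation can never be handed a buffer of the wrong length.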

Removed

  • PGOPStore
  • BOOSOPStore
  • Unused python bindings (quaternion, vec3, QLMEval, metrics)

Added

  • m_group_sizes class member for PGOP, which stores the size of each group (currently, (group order - 1) * 9). Previous code used vector.size, which requires copies and allocations for both individual elements and entire groups.
  • RotationMatrix std::array wrapper for fast and strongly typed vector rotations
  • -DENABLE_PROFILING flag to allow for easy profiling
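
The sizing rule for m_group_sizes is simple arithmetic. A minimal sketch follows, assuming all groups live in one flat buffer and are located by cumulative offsets. That layout and the names group_size and group_offsets are hypothetical; the orders are those of the four point groups benchmarked in this PR.

```python
from itertools import accumulate

MATRIX_SIZE = 9  # doubles per 3x3 rotation matrix

# Orders of the point groups that appear in the benchmarks.
GROUP_ORDERS = {"C2": 2, "D5d": 20, "T": 12, "Ih": 120}


def group_size(order):
    """Stored doubles for one group, per the (group order - 1) * 9 rule."""
    return (order - 1) * MATRIX_SIZE


def group_offsets(sizes):
    """Start offset of each group inside one flat buffer (assumed layout)."""
    return [0, *accumulate(sizes)][:-1]


sizes = [group_size(order) for order in GROUP_ORDERS.values()]
offsets = group_offsets(sizes)

for name, size, offset in zip(GROUP_ORDERS, sizes, offsets):
    print(f"{name:>3}: {size:5d} doubles starting at offset {offset}")
```

Computing the sizes once up front means later lookups are plain integer reads rather than queries on nested containers.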

Benchmarking

uv pip install . --config-settings=cmake.args="-DENABLE_PROFILING=ON"  --config-settings=cmake.build-type="RelWithDebInfo"
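
The timings below are reported as a mean and standard deviation over 10 runs of 50 orientations each. One plausible way to produce numbers of that shape with the standard library is sketched here. It is not the project's benchmark script: placeholder_workload is a hypothetical stand-in for a single evaluation, and mapping "runs" and "orientations" onto timeit's repeat and number arguments is an assumption.

```python
import statistics
import timeit


def benchmark(func, runs=10, calls_per_run=50):
    """Time func and return (mean, std. dev.) in microseconds per call."""
    totals = timeit.repeat(func, repeat=runs, number=calls_per_run)
    per_call_us = [total / calls_per_run * 1e6 for total in totals]
    return statistics.mean(per_call_us), statistics.stdev(per_call_us)


def placeholder_workload():
    """Hypothetical stand-in for one order-parameter evaluation."""
    return sum(i * i for i in range(100))


mean_us, std_us = benchmark(placeholder_workload)
print(f"Time: {mean_us:.2f} μs ± {std_us:.2f} per call "
      f"(mean ± std. dev. of 10 runs, 50 calls each)")
```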

Before this PR

Compute PGOP for mesh of 600 points, computed for an icosahedron (N=12, N_query=1):

--- Benchmarking C2 symmetry ---
  PGOP: 0.9031 ± 0.0057 (mean ± std. dev.)
  Time: 0.59 μs ± 0.01 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking D5d symmetry ---
  PGOP: 0.8648 ± 0.0413 (mean ± std. dev.)
  Time: 8.40 μs ± 0.03 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking T symmetry ---
  PGOP: 0.8623 ± 0.0286 (mean ± std. dev.)
  Time: 4.91 μs ± 0.02 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking Ih symmetry ---
  PGOP: 0.8365 ± 0.0506 (mean ± std. dev.)
  Time: 51.63 μs ± 0.45 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

Same approach, just with mode = "boo" and sigma=177.7 (~kappa = 0.075):

 ~ spatula-analysis==0.1.1 (from file:///Users/jenna/github/spatula)
--- Benchmarking C2 symmetry ---
  PGOP: 0.8981 ± 0.0104 (mean ± std. dev.)
  Time: 0.97 μs ± 0.05 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking D5d symmetry ---
  PGOP: 0.8231 ± 0.0657 (mean ± std. dev.)
  Time: 16.03 μs ± 0.11 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking T symmetry ---
  PGOP: 0.8263 ± 0.0487 (mean ± std. dev.)
  Time: 9.30 μs ± 0.10 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking Ih symmetry ---
  PGOP: 0.7805 ± 0.0816 (mean ± std. dev.)
  Time: 99.19 μs ± 0.69 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

After this PR

Compute PGOP for mesh of 600 points, computed for an icosahedron (N=12, N_query=1):

--- Benchmarking C2 symmetry ---
  PGOP: 0.90312034 ± 0.00567270 (mean ± std. dev.)
  Time: 0.17 μs ± 0.02 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking D5d symmetry ---
  PGOP: 0.86482608 ± 0.04126538 (mean ± std. dev.)
  Time: 1.77 μs ± 0.02 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking T symmetry ---
  PGOP: 0.86229689 ± 0.02857537 (mean ± std. dev.)
  Time: 1.04 μs ± 0.02 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking Ih symmetry ---
  PGOP: 0.83650060 ± 0.05056483 (mean ± std. dev.)
  Time: 10.67 μs ± 0.10 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

With mode="boo" and the same sigma/kappa conversion:

--- Benchmarking C2 symmetry ---
  PGOP: 0.89811896 ± 0.01037498 (mean ± std. dev.)
  Time: 0.17 μs ± 0.01 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking D5d symmetry ---
  PGOP: 0.82308454 ± 0.06566185 (mean ± std. dev.)
  Time: 1.50 μs ± 0.03 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking T symmetry ---
  PGOP: 0.82627869 ± 0.04871954 (mean ± std. dev.)
  Time: 0.91 μs ± 0.02 per trial(mean ± std. dev. of 10 runs, 50 orientations each)

--- Benchmarking Ih symmetry ---
  PGOP: 0.78053632 ± 0.08155272 (mean ± std. dev.)
  Time: 8.90 μs ± 0.07 per trial(mean ± std. dev. of 10 runs, 50 orientations each)
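
To compare the two sets of results directly, the per-trial times can be turned into speedup ratios. The short script below does that with the mean times copied from the output above. The dictionary keys ("first" for the initial configuration, "boo" for the mode="boo" run) and the function name speedups are labels chosen for this example.

```python
# Per-trial mean times in microseconds, copied from the benchmark output.
# "first" is the initial configuration; "boo" is the mode="boo" run.
BEFORE = {
    "first": {"C2": 0.59, "D5d": 8.40, "T": 4.91, "Ih": 51.63},
    "boo": {"C2": 0.97, "D5d": 16.03, "T": 9.30, "Ih": 99.19},
}
AFTER = {
    "first": {"C2": 0.17, "D5d": 1.77, "T": 1.04, "Ih": 10.67},
    "boo": {"C2": 0.17, "D5d": 1.50, "T": 0.91, "Ih": 8.90},
}


def speedups(before, after):
    """Return before/after time ratios for every mode and point group."""
    return {
        mode: {
            group: before[mode][group] / after[mode][group]
            for group in before[mode]
        }
        for mode in before
    }


ratios = speedups(BEFORE, AFTER)
for mode, groups in ratios.items():
    for group, ratio in groups.items():
        print(f"{mode:>5} {group:>3}: {ratio:.1f}x faster")
```

With the numbers reported here, the ratios fall between about 3.5x and 4.8x for the first configuration and between about 5.7x and 11.1x for mode="boo".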

Motivation and Context

Resolves: #???

How Has This Been Tested?

Checklist:

Comment thread benchmarks/microbenchmark.py
janbridley commented Feb 10, 2026

Note that 46affaf was a big merge containing the 2026 header updates -- I need to make sure I didn't break anything

@janbridley

Should be solved in 5287386

Comment thread src/nanobind/export-boosop.cc Outdated
@janbridley janbridley mentioned this pull request Feb 11, 2026
4 tasks
Comment thread src/nanobind/export-boosop.cc
Comment thread src/nanobind/export-pgop.h Outdated
Comment thread src/locality.h Outdated
Comment thread src/locality.h Outdated
@janbridley

@DomFijan Somehow this works only on macOS -- do you get more debug output on your end?

@DomFijan DomFijan merged commit fa5b184 into main Feb 12, 2026
28 checks passed
@DomFijan DomFijan deleted the nanobind4 branch February 12, 2026 20:24