This repo synthesizes, simulates, verifies, and validates reversible quantum
circuits for elliptic-curve point addition -- the inner loop of Shor's attack on
ECDLP -- and scores them against the Google Quantum AI ECC whitepaper's two
operating points for one secp256k1 P += Q: <= 1175 logical qubits / <= 2.7M
Toffoli (Clow-qubit) and <= 1425 qubits / <= 2.1M Toffoli (Clow-gate)
(papers/cryp_paper.txt).
trailmix/-- the in-house toolchain: a reversible-circuit IR, a 64-shot parallel simulator, an abstract-interpretation phase-correctness gate, a spooky-pebble ghost API, a time-travel debugger, a per-section profiler, and the EC-add circuits themselves (secp256k1 plus Curve25519 / SM2 / Brainpool).zkp_ecc_zenodo/-- the native zkp_ecc Simulator (the same one compiled into the SP1 zkVM guest) that replays a.kmxcircuit and asserts its resource counts over thousands of Fiat-Shamir shots.papers/-- the cited prior art (Schrottenloher, Proos-Zalka, Roetteler, Cuccaro, Gidney, Khattar-Gidney, the Google whitepaper).kmx_circuit_summaries.md-- the per-circuit executive summaries;trailmix/notes/-- design notes and replication status.
This work's five circuits against the published prior art and Google's two cost targets, as logical resources for a full 256-bit ECDLP attack (each point-add cost extrapolated through windowed scalar multiplication):
The competitive region, on log and then linear axes:
This work and the two cost targets are secp256k1-specific (its special prime makes the modular arithmetic cheaper); most prior-work points are general 256-bit ECDLP estimates, so the cross-comparison is indicative.
Regenerate with python3 plot.py.
All five compute the same in-place affine P += Q on secp256k1 (P in quantum
registers, Q classical per shot). They differ only in the modular-inversion
engine and where they sit on the qubit/Toffoli exchange curve. Full,
citation-dense writeups (algorithm, cost, peak, prior-art basis, and the specific
optimizations layered on top) are in
kmx_circuit_summaries.md.
| config | emit bin | inversion engine | qubits | Toffoli | based on |
|---|---|---|---|---|---|
| low-qubit | emit_test_ec_add_schrottenloher |
Schrottenloher EEA dialog, dialog_m=5 base-3 pack, no venting |
1173 | ~2.48M | Schrottenloher arXiv:2606.02235 (Google Clow-qubit 1175q / 2.7M) |
| low-tof | emit_test_ec_add_schrottenloher_lowtof |
same dialog, dialog_m=3 Fig.1 pack plus coupled venting |
1416 | ~2.03M | Schrottenloher arXiv:2606.02235 (Google Clow-gate 1425q / 2.1M) |
| jump-lowqubit | emit_test_ec_add_schrottenloher_jump |
jump-GCD dialog (Stein-style, jump=2), base-5 pack, no venting |
1169 | ~2.09M | Schrottenloher arXiv:2606.02235 + Stein binary GCD |
| jump-lowtof | emit_test_ec_add_schrottenloher_jump_lowtof |
same jump-GCD dialog plus coupled venting | 1412 | ~1.90M | Schrottenloher arXiv:2606.02235 + Stein binary GCD |
| shrunken-PZ | emit_test_ec_add_shrunken_pz |
reversible Proos-Zalka divstep state machine | 1050 | ~32.3M | Proos-Zalka 2003 (arXiv:quant-ph/0301141); Roetteler lambda-cancel (arXiv:1706.06752) |
The first four share the Schrottenloher EEA-dialog code path (Andre Schrottenloher,
"Optimized Point Addition Circuits for Elliptic Curve Discrete Logarithms,"
arXiv:2606.02235), which obtains 1/dx by running the in-place modular-multiply
circuit backwards. They turn two independent knobs:
- Inversion: the divstep dialog (low-qubit, low-tof) reduces the binary-GCD pair
one bit per step; the jump-GCD dialog (jump-lowqubit, jump-lowtof) is a
Stein-style variant that removes up to
jump=2trailing zeros ofvper step. Fewer steps means the per-step adders (the GCD subtract/swap and theapply_bvmod-add) fire fewer times, for a Toffoli saving at roughly unchanged tape length -- each step records its shift count (three windows packed into a 7-bit base-5 code), and the step reduction more than pays for that extra bit. - Venting: the low-tof variants spend ~240 more qubits on
apply_bv-phase Gidney measure-and-fixup venting to drop Toffoli further.
So jump-lowqubit is the low-qubit corner with the cheaper inversion: at 1169q /
2.09M it sits under both Google operating points on both axes (below Clow-qubit's
1175q / 2.7M and Clow-gate's 1425q / 2.1M). jump-lowtof is the low-tof corner the
same way (1412q / 1.90M, under Clow-gate on both axes).
shrunken-PZ is a different family: a reversible divstep state machine (the
Proos-Zalka 2003 extended-Euclidean inversion made reversible) that materializes
the slope lambda and cancels it with an alt-witness second division (Roetteler
slope-invariance, two inversions per add against Roetteler's four). It reaches the
qubit-minimal corner at 1050 qubits for a Toffoli-maximal cost (~32.3M Toffoli,
~100M ops).
Shared prior art across all five: Cuccaro adders (arXiv:quant-ph/0410184), Gidney measure-and-fixup uncomputation (Quantum 2, 74, 2018), the Gidney constant adder (arXiv:2507.23079), Khattar-Gidney prefix-AND / log*-ancilla MCX (arXiv:2407.17966), and Vandaele comparators (arXiv:2603.12917, Thm 3), with the Google ECC whitepaper as the cost target.
The pipeline emits each EC-add circuit to a .kmx file and replays it through the
native zkp_ecc Simulator with explicit qubit / Toffoli / op caps:
./gen_and_validate_kmx.shgen_and_validate_kmx.sh:
- Emits the five EC-add
.kmxconfigs intotrailmix/kmx_out/(gitignored) via theemit_test_ec_add_*binaries. - Builds the zenodo
native_fuzztool (the SP1-guest Simulator). - Runs it on each
.kmxover 9000 Fiat-Shamir shots, asserting the circuit stays under its<qubits> <toffoli> <ops>caps and computesP += Qcorrectly on every shot.
NUM_TESTS=<n> changes the shot count; CIRC_OPS_CAP (set automatically for the
large shrunken-PZ stream) bounds the op buffer. The harness needs the
zkp_ecc_zenodo/ sibling checkout. Toffoli are counted as executed CCX+CCZ
averaged across shots, the metric Google scores. The op stream is input-independent
(the classical point Q enters as runtime controls), so each circuit is built once
and the fuzzer reloads its own 9000 random cases.
The toolchain is implemented in-house. Full inventory with file:line
references in trailmix/notes/OVERVIEW.md; the
highlights:
- 64-shot parallel simulator, built into the builder. Each qubit is one
u64(one bit per shot); X/CX/CCX/Z/HMR are single bitwise ops across all 64 shots. The HMR RNG is seeded fromthread_rngonce per run (never a fixed seed), so every run exercises a different measurement pattern, and Toffoli are counted per fired shot (popcount(fire_mask)), exactly Google's average-executed-Toffoli metric. - Phase-lattice correctness gate (
src/tracker/phase_lattice.rs): a per-gate abstract interpreter (AbsValper qubit: Zero/One/CopyOf/AndOf/XorOf/...).prove_zerois the hard gate before any qubit free;assert_phase_cleanverifies every HMR phase obligation structurally cancels. It catches precondition violations the sim alone does not. - Time-travel debugger (
src/tracker/debugger/): gate-by-gate replay over the sim with full-state checkpoints and a per-opDeltaring buffer for O(1) backward steps. The canonical recipe iswatch q<N>plusrb(run backward to the op that broke an invariant) plussrc <op>(maps any op to its emittingfile:line). Auto-attaches on failure underDEBUG_ON_FAIL=1. - Ghost / spooky-pebble API (
src/tracker/ghost.rs): a#[must_use]Ghostcaptures an HMR's 64-shot mask and anchor;resolve_ghostsim-verifies the discharge and emits the cancelling phase fix-up. This is what lets shrunken-PZ ghost a 256-bit coordinate so it andlambdaare never both live. - Move-only qubit ownership and strict dealloc.
QRegis a linear RAII handle (no Copy/Clone); drop queues a free. A lowest-id free-pool packs densely; every free asserts last-touch > last-alloc (no wasteful retention) andprove_zero(no dirty free) at the exact site. - A redundant-op elider, inline ghost-aware
contract_checkinvariants, anemit_reverse_sincegeneric uncompute, streaming metrics on multi-hundred-million-op circuits, and assertion caps:CIRC_ASSERT_MAX_QUBIT_PEAKfires the instant live qubits first cross the cap, andCIRC_ASSERT_MAX_OPSprints a per-section cost breakdown.
The arithmetic primitives are organized by purpose under trailmix/src/arith/:
cuccaro (ripple adders), mcx (multi-controlled-X), compare, const_add,
shift, khattar_gidney (prefix-AND / increment / comparison kernel),
ripple_add, mod_arith (modular field ops), and schrottenloher/ (the dialog
inversion). The EC drivers live in trailmix/src/ec/, the reversible-divstep
inversion in trailmix/src/inversion/.
cd trailmix
# all cargo invocations go through the 8 GB memory-scoped wrapper:
./scripts/run_8g_scope.sh cargo build --release --lib
./scripts/run_8g_scope.sh cargo test --release --lib <filter> -- --test-threads=1
# headline end-to-end test (one shrunken-PZ EC-add, 64 random pairs, clean teardown):
./scripts/run_8g_scope.sh cargo test --release --lib ec_add_inplace_shrunken_pz_random_64See trailmix/notes/OVERVIEW.md for the
time-travel debugger and profiling workflow.


