Best Practices for Large Circuits
This document summarizes the best practices for compiling and generating Groth16 proofs for large ZK circuits using the circom / snarkjs toolstack. These techniques are most applicable to circuits with at least 20M constraints.
For such large circuits, you need a machine with an Intel processor, lots of RAM, and a large hard drive with swap enabled. For example, the zkPairing project used an AWS r5.8xlarge instance (32 cores at 3.1GHz, 256G RAM) with a 1T hard drive and 400G of swap.
Our knowledge of the following best practices is almost entirely due to the generosity and guidance of Jordi Baylina from Polygon-Hermez.
Compilation and proving
- Compilation: for circuits with >20M constraints, one should not compile to WebAssembly because witness generation will exceed the memory cap of WebAssembly. For this reason, one must compile with the C++ flag (--c) and remove the wasm flag (--wasm).
- For witness debugging, run:
circom --O1 --c --sym (turns off .wasm and .r1cs). We are not concerned with generating a proving key, so the r1cs file is unnecessary. --O1 optimization only removes “equals” constraints but does not optimize out “linear” constraints.
- For production, run:
circom --O2 --c --sym --r1cs (turns off .wasm). In practice, one may still need to use --O1 because the further --O2 optimization takes significantly longer on large circuits (for reasons that aren’t totally clear).
- Witness generation: As mentioned above, witness generation must be done by compiling from C++ code.
- For C++, on Ubuntu one needs the following apt packages: build-essential libgmp-dev libsodium-dev nasm nlohmann-json3-dev
- Update Circom to avoid this bug
- To build the binary:
cd "$CIRCUIT_NAME"_cpp; make
- To generate the witness:
./"$CIRCUIT_NAME" [input.json] [witness.wtns]
- Extract .json from .wtns using snarkjs wej [witness.wtns] [witness.json] (uses .sym file)
- Note: the wasm witness generator will not work for circuits above a certain constraint size (~10-20M) due to the WebAssembly memory limit.
- Note: the C++ witness generator will not work on computers with Apple Silicon (e.g., M1) chips due to assembly incompatibility.
- Key generation: (see full commands below) Groth16 requires a separate trusted ceremony for each circuit – this is the phase 2 trusted setup. This step requires using the Powers of Tau from the phase 1 trusted setup and performing elliptic curve operations which scale with the size of the circuit. Unfortunately this means that the amount of memory used also scales with the size of the circuit (number of constraints).
- The speed of the phase 2 trusted setup is significantly slowed down due to automatic garbage collection performed by Node. To get around this, we use a patched version of Node introduced by Polygon-Hermez to disable garbage collection. This significantly speeds up the trusted setup; as a trade-off, Node uses an incredible amount of memory so the machine must have swap set up.
- Remove Node internal memory limit and also any system memory limits. (Commands below.)
- Add swap and patched version of Node without garbage collection.
- Proof generation: For faster proving time, use rapidsnark for proving.
rapidsnark/build/prover [.zkey] [.wtns] proof.json public.json
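As a rough sketch of the compilation and witness generation steps above, the two flag sets can be wrapped in a small helper. The function name circom_flags is hypothetical, and the commented usage assumes a circuit file named "$CIRCUIT_NAME".circom and an input.json in the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the circom flags for a given mode.
#   debug      -> witness debugging only (no .r1cs, no .wasm)
#   production -> also emits the .r1cs needed for key generation
circom_flags() {
  case "$1" in
    debug)      echo "--O1 --c --sym" ;;
    production) echo "--O2 --c --sym --r1cs" ;;
    *)          echo "usage: circom_flags debug|production" >&2; return 1 ;;
  esac
}

# Example (not run here; CIRCUIT_NAME and input.json are placeholders):
#   circom $(circom_flags production) "$CIRCUIT_NAME".circom
#   (cd "$CIRCUIT_NAME"_cpp && make && ./"$CIRCUIT_NAME" ../input.json ../witness.wtns)
```

If --O2 turns out to be too slow on a large circuit, the production line can be changed to --O1 as noted above.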
Witness generation debugging
In the circuit debugging stage, it is useful to note that you do not need to go through the full setup with key generation above to extract the outputs (if any) of the proof.
After generating the witness file witness.wtns and converting it to json using snarkjs wej [witness.wtns] [witness.json], indices 1 through m of witness.json contain the m public outputs of the proof (index 0 is always equal to 1).
In fact, one can in theory extract all witnesses from intermediate steps of the proof from witness.json using the .sym file. We have built an experimental Python parser to do this here (the parser currently may break due to compiler optimizations, run with --O0 for safety).
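The public outputs described above can be pulled out of witness.json with standard tools. This is a minimal sketch that assumes witness.json is a flat JSON array of decimal strings (for example ["1","33","7"]); the function name public_outputs is hypothetical, and the number of outputs m must be supplied by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the m public outputs stored in witness.json.
# Assumes the file is a flat JSON array of decimal strings.
public_outputs() {
  local witness_json="$1" m="$2"
  # Nothing to print when the circuit has no public outputs.
  [ "$m" -ge 1 ] || return 0
  # Strip brackets, quotes, and whitespace, then put one value per line.
  # Line 1 is index 0 (the constant 1), so the outputs are lines 2..m+1.
  tr -d '[]" \n\t' < "$witness_json" | tr ',' '\n' | sed -n "2,$((m + 1))p"
}

# Example (not run here): print the first two public outputs.
#   public_outputs witness.json 2
```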
Setup from scratch
Here are the steps to set up a blank slate machine/instance according to the configuration described above.
Install rust, circom, C++ dependencies, nvm, and yarn.
curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
source $HOME/.cargo/env
git clone https://github.com/iden3/circom.git
cd circom
cargo build --release
cargo install --path circom
sudo apt install build-essential libgmp-dev libsodium-dev nasm nlohmann-json3-dev
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion" # This loads nvm bash_completion
nvm install --lts
npm install --global yarn
Remove system memory limit
Run
sysctl -w vm.max_map_count=655300
and make the setting persist across reboots by adding this line
vm.max_map_count=655300
to the file /etc/sysctl.conf.
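Appending to a system file more than once leaves duplicate entries, so it can help to add the line only when it is missing. The following is a minimal sketch; the helper name ensure_line is hypothetical, and writing to /etc/sysctl.conf requires root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line is absent.
ensure_line() {
  local line="$1" file="$2"
  # -x matches whole lines, -F treats the line as a fixed string.
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Example (not run here; must be run as root):
#   ensure_line "vm.max_map_count=655300" /etc/sysctl.conf
```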
Setup swap
sudo fallocate -l 400G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
sudo sh -c 'echo "vm.max_map_count=10000000" >>/etc/sysctl.conf'
sudo sh -c 'echo 10000000 > /proc/sys/vm/max_map_count'
To make the swap file persist through reboots, add this line to /etc/fstab:
/swapfile swap swap defaults 0 0
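Before starting a long key generation run, it may be worth confirming that the swap is actually active. This sketch reads the SwapTotal field from /proc/meminfo (Linux only); the function name swap_total_kb is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the total swap size in kB.
# The optional argument points at a different meminfo-style file.
swap_total_kb() {
  awk '/^SwapTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Example (not run here):
#   echo "swap: $(swap_total_kb) kB"
#   echo "max_map_count: $(cat /proc/sys/vm/max_map_count)"
```

Running swapon --show gives a similar view, listing each active swap file and its size.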
Install patched node
We use $HOME_DIR as our home directory throughout.
cd $HOME_DIR
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.37.2/install.sh | bash
source ~/.bashrc
nvm install v14.8.0
node --version
git clone https://github.com/nodejs/node.git
cd node
git checkout 8beef5eeb82425b13d447b50beafb04ece7f91b1
patch -p1 <<EOL
index 0097683120..d35fd6e68d 100644
--- a/deps/v8/src/api/api.cc
+++ b/deps/v8/src/api/api.cc
@@ -7986,7 +7986,7 @@ void BigInt::ToWordsArray(int* sign_bit, int* word_count,
 void Isolate::ReportExternalAllocationLimitReached() {
   i::Heap* heap = reinterpret_cast<i::Isolate*>(this)->heap();
   if (heap->gc_state() != i::Heap::NOT_IN_GC) return;
-  heap->ReportExternalMemoryPressure();
+  // heap->ReportExternalMemoryPressure();
 }
 
 HeapProfiler* Isolate::GetHeapProfiler() {
diff --git a/deps/v8/src/objects/backing-store.cc b/deps/v8/src/objects/backing-store.cc
index bd9f39b7d3..c7d7e58ef3 100644
--- a/deps/v8/src/objects/backing-store.cc
+++ b/deps/v8/src/objects/backing-store.cc
@@ -34,7 +34,7 @@ constexpr bool kUseGuardRegions = false;
 // address space limits needs to be smaller.
 constexpr size_t kAddressSpaceLimit = 0x8000000000L;  // 512 GiB
 #elif V8_TARGET_ARCH_64_BIT
-constexpr size_t kAddressSpaceLimit = 0x10100000000L;  // 1 TiB + 4 GiB
+constexpr size_t kAddressSpaceLimit = 0x40100000000L;  // 4 TiB + 4 GiB
 #else
 constexpr size_t kAddressSpaceLimit = 0xC0000000;  // 3 GiB
 #endif
EOL
./configure
make -j16
The patched node executable is located at NODE_PATH = $HOME_DIR/node/out/Release/node.
Install snarkjs from source
cd $HOME_DIR
git clone https://github.com/iden3/snarkjs.git
cd snarkjs
git checkout v0.3.59
npm install
The snarkjs executable is located at SNARKJS_PATH = $HOME_DIR/snarkjs/cli.js.
Install rapidsnark from source
cd $HOME_DIR
git clone git@github.com:iden3/rapidsnark.git
cd rapidsnark
npm install
git submodule init
git submodule update
npx task createFieldSources
npx task buildProver
The rapidsnark executable is located at RAPIDSNARK_PATH = $HOME_DIR/rapidsnark/build/prover.
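The three paths above are used as shell variables in the build scripts. A minimal sketch for setting them and checking that the builds produced the expected files is shown here; the helper name check_tools is hypothetical, and HOME_DIR falls back to $HOME if it is not already set. The variables are deliberately not exported, because Node itself reads an environment variable named NODE_PATH for module lookup.

```shell
#!/usr/bin/env bash
# Assumption: HOME_DIR defaults to $HOME when it has not been set.
HOME_DIR="${HOME_DIR:-${HOME:-.}}"
NODE_PATH="$HOME_DIR/node/out/Release/node"
SNARKJS_PATH="$HOME_DIR/snarkjs/cli.js"
RAPIDSNARK_PATH="$HOME_DIR/rapidsnark/build/prover"

# Hypothetical helper: report every path that does not exist.
# Returns 0 if all paths exist and 1 otherwise.
check_tools() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here):
#   check_tools "$NODE_PATH" "$SNARKJS_PATH" "$RAPIDSNARK_PATH" || exit 1
```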
Build scripts
One can use the following bash script to implement all the proving steps described above. (For a full implementation, see here.)
The Powers of Tau file is located at $PHASE1. Let CIRCUIT_NAME be the name of the circuit. We assume the circuit has already been compiled, with all relevant files in the current directory.
Phase 2 trusted setup
Groth16 requires a separate trusted setup for each circuit. This generates a common reference string (CRS), which is stored in a .zkey file. The following commands should be run once per circuit.
To create the .zkey without phase 2 contributions:
echo "****GENERATING ZKEY 0****"
start=`date +%s`
$NODE_PATH --trace-gc --trace-gc-ignore-scavenger --max-old-space-size=2048000 --initial-old-space-size=2048000 --no-global-gc-scheduling --no-incremental-marking --max-semi-space-size=1024 --initial-heap-size=2048000 --expose-gc $SNARKJS_PATH zkey new "$CIRCUIT_NAME".r1cs "$PHASE1" "$CIRCUIT_NAME"_0.zkey -v
end=`date +%s`
echo "DONE ($((end-start))s)"
We should contribute to the phase 2 ceremony, which requires some random input. (For production, one should do multiple contributions with more rigor.)
echo "****CONTRIBUTE TO PHASE 2 CEREMONY****"
start=`date +%s`
$NODE_PATH $SNARKJS_PATH zkey contribute -verbose "$CIRCUIT_NAME"_0.zkey "$CIRCUIT_NAME".zkey -n="First phase2 contribution" -e="some random text for entropy"
end=`date +%s`
echo "DONE ($((end-start))s)"
Verify final zkey:
echo "****VERIFYING FINAL ZKEY****"
start=`date +%s`
$NODE_PATH --trace-gc --trace-gc-ignore-scavenger --max-old-space-size=2048000 --initial-old-space-size=2048000 --no-global-gc-scheduling --no-incremental-marking --max-semi-space-size=1024 --initial-heap-size=2048000 --expose-gc $SNARKJS_PATH zkey verify -verbose "$CIRCUIT_NAME".r1cs "$PHASE1" "$CIRCUIT_NAME".zkey
end=`date +%s`
echo "DONE ($((end-start))s)"
The verifier does not need the full zkey to verify a Groth16 proof. They only need a shorter verification key. To export the verification key:
echo "****EXPORTING VKEY****"
start=`date +%s`
$NODE_PATH $SNARKJS_PATH zkey export verificationkey "$CIRCUIT_NAME".zkey [vkey.json] -v
end=`date +%s`
echo "DONE ($((end-start))s)"
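The banner and timing lines repeat around every step above, and the long list of Node flags appears twice. One possible way to cut the repetition is sketched here; the names timed and NODE_FLAGS are hypothetical, and the flags are copied unchanged from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a banner, run a command, and print the elapsed seconds.
# Returns the exit status of the command that was run.
timed() {
  local label="$1" start end status
  shift
  echo "****${label}****"
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "DONE ($((end - start))s)"
  return "$status"
}

# Node flags for the memory-heavy snarkjs steps, kept in one place.
# The variable is expanded unquoted on purpose so it splits into separate flags.
NODE_FLAGS="--trace-gc --trace-gc-ignore-scavenger --max-old-space-size=2048000 --initial-old-space-size=2048000 --no-global-gc-scheduling --no-incremental-marking --max-semi-space-size=1024 --initial-heap-size=2048000 --expose-gc"

# Example (not run here):
#   timed "GENERATING ZKEY 0" "$NODE_PATH" $NODE_FLAGS "$SNARKJS_PATH" \
#     zkey new "$CIRCUIT_NAME".r1cs "$PHASE1" "$CIRCUIT_NAME"_0.zkey -v
```

Because timed passes the exit status through, a calling script can stop at the first failing step.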
Witness and proof generation
The following commands should be run once for each input to generate a witness and a proof for that input.
Done by prover:
Generate witness (C++):
cd "$CIRCUIT_NAME"_cpp
make
./"$CIRCUIT_NAME" [input.json] [witness.wtns]
Change witness to .json:
snarkjs wej [witness.wtns] [witness.json]
Generate proof:
echo "****GENERATING PROOF FOR SAMPLE INPUT****"
start=`date +%s`
$RAPIDSNARK_PATH "$CIRCUIT_NAME".zkey [witness.wtns] [proof.json] [public.json]
end=`date +%s`
echo "DONE ($((end-start))s)"
Done by verifier:
To verify the proof, run:
echo "****VERIFYING PROOF FOR SAMPLE INPUT****"
start=`date +%s`
$NODE_PATH $SNARKJS_PATH groth16 verify [vkey.json] [public.json] [proof.json] -v
end=`date +%s`
echo "DONE ($((end-start))s)"