NecoFuzz is a gray-box fuzzer specifically designed for testing nested virtualization functionality in hypervisors. This artifact enables reproduction of the experimental results presented in our paper.
- CPU: Intel processor with VT-x support OR AMD processor with AMD-V support
- Platform: Bare metal machine (nested virtualization in VMs is not supported)
- BIOS/UEFI: Virtualization features must be enabled in firmware settings
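Before proceeding, you can confirm that the CPU actually exposes these extensions by grepping the flags in /proc/cpuinfo. The sketch below wraps the check in a hypothetical `detect_virt` helper (not part of the artifact):

```shell
#!/bin/bash
# detect_virt: read CPU flags text on stdin and report the virtualization
# extension found ("Intel VT-x", "AMD-V", or "none").
detect_virt() {
    local flags
    flags=$(cat)                              # capture stdin once
    if   grep -qw vmx <<<"$flags"; then echo "Intel VT-x"
    elif grep -qw svm <<<"$flags"; then echo "AMD-V"
    else echo "none"
    fi
}

# On a real host:
detect_virt < /proc/cpuinfo
```

If KVM later complains that virtualization is disabled, re-check the BIOS/UEFI settings even when the flag is present.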
- Install build dependencies for the Linux kernel, AFL++, Xen, and QEMU
sudo apt update
sudo apt install -y gcc-11 clang unzip libstdc++-12-dev libelf-dev libssl-dev dwarves build-essential git debootstrap pkg-config automake bison flex python3 python3-pip qemu-system-x86 qemu-kvm
# Check installed versions
gcc-11 --version
g++-11 --version
gcov-11 --version
# If gcc is not using version 11, switch using update-alternatives
if ! gcc --version | grep -q "11."; then
    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 110
    sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-11 110
    sudo update-alternatives --config gcc
    sudo update-alternatives --config g++
fi
git clone https://github.com/shina-lab/artifact_NecoFuzz
cd artifact_NecoFuzz
git submodule update --init --depth 1 --jobs 4 --progress
cp config/kvm_default.yaml config.yaml
make prepare
./scripts/build_linux.sh
# Configure GRUB to boot the newly built Linux kernel
sudo grub-reboot <kernel-entry>
sudo reboot
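To find the `<kernel-entry>` string expected by grub-reboot, you can list the entry titles in grub.cfg. This is a sketch assuming Ubuntu's default grub.cfg path; nested entries are passed to grub-reboot as 'Submenu Title>Entry Title':

```shell
# List GRUB menu and submenu titles (Ubuntu default grub.cfg path).
grep -E "^[[:space:]]*(menuentry|submenu) " /boot/grub/grub.cfg | cut -d"'" -f2
```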
# Verify the kernel version after reboot
uname -r
cd external/AFLplusplus
make -j $(nproc)
cd ../..
If you encounter a compilation error, it is most likely caused by missing dependencies; see the official AFL++ installation guide: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/INSTALL.md
cd external/qemu
patch -p1 < ../../patches/necofuzz_qemu.patch
# Follow standard QEMU build process
mkdir build
cd build/
../configure --target-list=x86_64-softmmu --extra-ldflags=-lelf
make -j $(nproc)
cd ../../..
NecoFuzz achieves significantly higher nested virtualization code coverage than existing testing approaches (selftests, kvm-unit-tests, syzkaller). This claim is supported by experiments E1-E5 (Figure 3, Table 2).
Each component of NecoFuzz's VM generator contributes meaningfully to coverage improvement, with the VM state validator having the largest impact. This validates our system design and is supported by experiment E6 (Figure 4, Table 3).
NecoFuzz operates successfully on multiple hypervisors (KVM and Xen) and outperforms existing testing approaches on both platforms. This demonstrates broad applicability across virtualization platforms and is supported by experiment E7 (Xen, Table 4).
These experiments are designed to evaluate the effectiveness of NecoFuzz, a novel fuzzer for hypervisor nested virtualization. We measure its code coverage on KVM and Xen and compare it against state-of-the-art fuzzers and developer-written tests to demonstrate its superiority.
Objective: Measure the code coverage achieved by NecoFuzz when fuzzing KVM nested virtualization over an extended period.
Preparation:
- Prepare your environment:
cp config/kvm_default.yaml config.yaml
make prepare
make -C tools
./tools/scripts/kvm_baseline_coverage.sh
make kvm
./tools/scripts/rm_shm_coverage.sh
Execution: Run NecoFuzz with coverage monitoring for 48 hours.
# Terminal 1: Run the fuzzer
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
Note: The fuzzer runs indefinitely. Manually stop the process in Terminal 1 with Ctrl-C after 48 hours.
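Instead of watching the clock, you can bound the run with coreutils `timeout`. This is a sketch; it assumes afl-runner.sh terminates cleanly on SIGINT, just as it does on a manual Ctrl-C:

```shell
# Send SIGINT after 48 hours, as if Ctrl-C had been pressed in Terminal 1.
timeout --signal=INT 48h ./tools/scripts/afl-runner.sh
```

`timeout` exits with status 124 when the time limit is hit, which can be used in wrapper scripts to distinguish a timed-out run from a crash.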
Expected Results: Coverage data and a timeline showing NecoFuzz's progression. Coverage is expected to increase rapidly and then plateau around the 12-24 hour mark, while still achieving a higher final coverage than the baselines.
Objective: Establish a performance baseline by measuring the coverage achieved by Syzkaller, a state-of-the-art, general-purpose kernel fuzzer.
Preparation:
./scripts/build_syzkaller_linux.sh
./scripts/setup_syzkaller.sh
./scripts/test_syzkaller.sh
Execution: Run Syzkaller for 48 hours to ensure a fair comparison with NecoFuzz.
./scripts/run_syzkaller.sh out/syzkaller
# Coverage timeline will be generated in out/syzkaller/coverage_timeline.csv
Note: This script is configured for a long-duration run. Manually stop it with Ctrl-C after 48 hours.
Expected Results: A coverage timeline for Syzkaller, which is expected to show slower progression and a lower final coverage compared to NecoFuzz.
Objective: Measure the coverage of the official KVM developer test suite (selftests) to serve as a baseline.
Execution:
patch -p1 -d external/linux < patches/linux_selftests.patch
./scripts/run_kvm_selftests.sh out/kvm_selftests run
# Results will be in out/kvm_selftests/final_nested_coverage
Expected Results: Coverage data for Table 2. These developer-written tests are expected to show limited coverage of complex nested virtualization code paths.
Objective: Measure coverage achieved by kvm-unit-tests, another developer-centric test suite, for a comprehensive baseline comparison.
Execution:
./scripts/run_kvm-unit-tests.sh out/kvm_unit-tests
# Results will be in out/kvm_unit-tests/final_nested_coverage
Expected Results: Coverage data for Table 2, which is expected to demonstrate the limited scope of existing unit tests for nested virtualization.
Objective: Consolidate the results from E1-E4 to generate the final comparison figures and tables.
Execution:
# Install required Python packages
pip install pandas
pip install matplotlib
# Generate Syzkaller vs NecoFuzz coverage graph (Figure 3)
python3 ./scripts/generate_kvm_syzkaller_graph.py
# Results will be generated in artifact/fig3.png
# Run full coverage analysis (Table 2)
python3 ./scripts/generate_kvm_necofuzz_comparison.py
# Results will be generated in artifact/table2.csv
Expected Results:
- Figure 3: A coverage-over-time graph comparing NecoFuzz and Syzkaller, visually demonstrating that NecoFuzz achieves higher coverage faster.
- Table 2: A table with the final coverage numbers, quantitatively showing NecoFuzz's 1.4-2x improvement over all baseline methods.
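For a quick sanity check on the improvement factor reported above, divide NecoFuzz's final coverage by a baseline's final coverage. The counts below are hypothetical, not numbers from the paper:

```shell
# Hypothetical covered-line counts, NOT data from Table 2.
necofuzz=4200
baseline=2800
awk -v a="$necofuzz" -v b="$baseline" 'BEGIN { printf "%.1fx\n", a / b }'   # prints 1.5x
```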
Objective: Perform an ablation study to measure the individual contribution of each core NecoFuzz component by selectively disabling them.
Preparation:
For each run, copy the corresponding configuration file to config.yaml. The variants are: with all components (from E1), without the VM execution harness, without the VM state validator, without the vCPU configurator, and without any components (baseline fuzzer).
Execution: Run each of the four configurations below for 24 hours.
- Without ALL components (baseline):
# Terminal 1: Run the fuzzer
cp config/wo_all.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
- Without VM execution harness:
# Terminal 1: Run the fuzzer
cp config/wo_harness.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
- Without vCPU configurator:
# Terminal 1: Run the fuzzer
cp config/wo_vcpu_config.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
- Without VM state validator:
# Terminal 1: Run the fuzzer
cp config/wo_vmstate_validator.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
Note: Each afl-runner.sh command runs indefinitely. Monitor the time and manually stop each process with Ctrl-C after 24 hours. A coverage monitor can be run in a separate terminal for each experiment.
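The four variant runs above can be driven by a single loop. This is a sketch assuming coreutils `timeout` to bound each run at 24 hours; monitor_record.sh is still started separately for each run, and the config file names match those shipped under config/:

```shell
#!/bin/bash
# Run every ablation variant in sequence from the artifact_NecoFuzz root.
for cfg in wo_all wo_harness wo_vcpu_config wo_vmstate_validator; do
    cp "config/${cfg}.yaml" config.yaml     # select the variant
    ./tools/scripts/rm_shm_coverage.sh      # reset shared-memory coverage
    timeout --signal=INT 24h ./tools/scripts/afl-runner.sh
done
```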
- Generate Analysis:
python3 ./scripts/generate_kvm_necofuzz_componets_graph.py
# Results will be generated in artifact/fig4.png and artifact/table3.csv
Expected Results:
- Figure 4: A set of coverage-over-time graphs showing that the full version of NecoFuzz outperforms all variants, with the VM state validator having the largest positive impact on coverage.
- Table 3: A table of final coverage numbers demonstrating that all components contribute meaningfully.
Objective: Demonstrate the generalizability of the NecoFuzz approach by applying it to the Xen hypervisor.
Preparation - Xen Setup:
./scripts/build_linux_for_xen.sh
./scripts/build_xen
sudo update-grub
# Configure GRUB to boot the Xen entry (e.g., 'Ubuntu, with Xen hypervisor and Linux 6.5.0-xen')
sudo grub-reboot <xen-entry-name>
sudo reboot
# After reboot, enable Xen services
sudo update-rc.d xencommons defaults 19 18
sudo update-rc.d xendomains defaults 21 20
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/xen.conf
sudo ldconfig
# Verify the Xen version to confirm setup
sudo xl info
# Install jq for parsing JSON files
sudo apt install -y jq
Execution:
- NecoFuzz on Xen (24 hours):
# Modify config.yaml to target Xen
cp config/xen_default.yaml config.yaml
./tools/scripts/afl-runner.sh
Note: For this NecoFuzz experiment, monitor the runtime and stop the process with Ctrl-C after exactly 24 hours.
Automation available: To auto-resume fuzzing after host reboots under Xen, use scripts/run_xen_necofuzz.sh (setup below). Since the host may occasionally crash before the full 24-hour fuzzing period completes, we recommend enabling this automation.
Optional (Xen): Auto-resume after reboots
- Edit user crontab
crontab -e
Add the following entry (adjust XEN_GRUB_ENTRY and /full-path-to/ to your environment):
SHELL=/bin/bash
# Xen GRUB entry string (must match grub.cfg)
XEN_GRUB_ENTRY='Advanced options for Ubuntu GNU/Linux (with Xen hypervisor)>Xen hypervisor, version 4.18>Ubuntu GNU/Linux, with Xen 4.18 and Linux 6.5.0-xen+'
@reboot /full-path-to/artifact_NecoFuzz/scripts/run_xen_necofuzz.sh >> /full-path-to/artifact_NecoFuzz/necofuzz.log 2>&1
- Edit sudoers for passwordless sudo
sudo visudo
Add the following line (replace <user> with your username, and adjust /full-path-to/ and the xl path to match your system):
<user> ALL=(ALL) NOPASSWD: /usr/sbin/grub-reboot, /usr/sbin/reboot, /usr/local/sbin/xl, /full-path-to/artifact_NecoFuzz/tools/scripts/afl-runner.sh
- Operate
- After reboot, check necofuzz.log for:
[INFO] Xen detected, starting NecoFuzz...
[INFO] Using AFL_SEED=...
- To stop auto-resume for the next boots:
touch /var/tmp/xen_stop
- To re-enable:
rm -f /var/tmp/xen_stop
- Note
Outputs created while auto-resume is active may be owned by root. Adjust ownership if needed:
sudo chown -R <user>:<user> out/
- XTF Baseline (Xen Test Framework):
./scripts/run_xtf.sh out/xen_xtf
# This typically completes in under 1 hour
- Generate Analysis:
python3 ./scripts/generate_xen_necofuzz_comparison.py
# Results will be in artifact/table4.csv
Expected Results:
- Table 4: A final coverage comparison showing NecoFuzz is also highly effective on Xen's nested virtualization code, outperforming the native XTF baseline. This demonstrates the cross-hypervisor applicability of the NecoFuzz approach.