NecoFuzz is a gray-box fuzzer specifically designed for testing nested virtualization functionality in hypervisors. This artifact enables reproduction of the experimental results presented in our paper.
- CPU: Intel processor with VT-x support OR AMD processor with AMD-V support
- Platform: Bare metal machine (nested virtualization in VMs is not supported)
- BIOS/UEFI: Virtualization features must be enabled in firmware settings
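Before proceeding, you can confirm that the CPU actually exposes these extensions by grepping the flags in /proc/cpuinfo. The sketch below wraps the check in a hypothetical `detect_virt` helper (not part of the artifact):

```shell
#!/bin/bash
# detect_virt: read CPU flags text on stdin and report the virtualization
# extension found ("Intel VT-x", "AMD-V", or "none").
detect_virt() {
    local flags
    flags=$(cat)                              # capture stdin once
    if   grep -qw vmx <<<"$flags"; then echo "Intel VT-x"
    elif grep -qw svm <<<"$flags"; then echo "AMD-V"
    else echo "none"
    fi
}

# On a real host:
detect_virt < /proc/cpuinfo
```

If KVM later complains that virtualization is disabled, re-check the BIOS/UEFI settings even when the flag is present.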
- Install build dependencies for the Linux kernel, AFL++, Xen, and QEMU
sudo apt update
sudo apt install -y gcc-11 clang unzip libstdc++-12-dev libelf-dev libssl-dev dwarves build-essential git debootstrap pkg-config automake bison flex python3 python3-pip qemu-system-x86 qemu-kvm
# Check installed versions
gcc-11 --version
g++-11 --version
gcov-11 --version
# If gcc is not using version 11, switch using update-alternatives
if ! gcc --version | grep -q "11."; then
    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 110
    sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-11 110
    sudo update-alternatives --config gcc
    sudo update-alternatives --config g++
fi
git clone https://github.com/shina-lab/artifact_NecoFuzz
cd artifact_NecoFuzz
git submodule update --init --depth 1 --jobs 4 --progress
cp config/kvm_default.yaml config.yaml
make prepare
./scripts/build_linux.sh
# Configure GRUB to boot the newly built Linux kernel
sudo grub-reboot <kernel-entry>
sudo reboot
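To find the `<kernel-entry>` string expected by grub-reboot, you can list the entry titles in grub.cfg. This is a sketch assuming Ubuntu's default grub.cfg path; nested entries are passed to grub-reboot as 'Submenu Title>Entry Title':

```shell
# List GRUB menu and submenu titles (Ubuntu default grub.cfg path).
grep -E "^[[:space:]]*(menuentry|submenu) " /boot/grub/grub.cfg | cut -d"'" -f2
```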
# Verify the kernel version after reboot
uname -r
cd external/AFLplusplus
make -j $(nproc)
cd ../..
If you encounter a compilation error, it is most likely caused by missing dependencies; see the official AFL++ installation guide: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/INSTALL.md
cd external/qemu
patch -p1 < ../../patches/necofuzz_qemu.patch
# Follow standard QEMU build process
mkdir build
cd build/
../configure --target-list=x86_64-softmmu --extra-ldflags=-lelf
make -j $(nproc)
cd ../../..
NecoFuzz achieves significantly higher nested virtualization code coverage than existing testing approaches (selftests, kvm-unit-tests, syzkaller). This claim is supported by experiments E1-E5 (Figure 3, Table 2).
Each component of NecoFuzz's VM generator contributes meaningfully to coverage improvement, with the VM state validator having the largest impact. This validates our system design and is supported by experiment E6 (Figure 4, Table 3).
NecoFuzz operates successfully on multiple hypervisors (KVM and Xen) and outperforms existing testing approaches on both platforms. This demonstrates broad applicability across virtualization platforms and is supported by experiment E7 (Xen, Table 4).
These experiments are designed to evaluate the effectiveness of NecoFuzz, a novel fuzzer for hypervisor nested virtualization. We measure its code coverage on KVM and Xen and compare it against state-of-the-art fuzzers and developer-written tests to demonstrate its superiority.
Objective: Measure the code coverage achieved by NecoFuzz when fuzzing KVM nested virtualization over an extended period.
Preparation:
- Prepare your environment:
cp config/kvm_default.yaml config.yaml
make prepare
make -C tools
./tools/scripts/kvm_baseline_coverage.sh
make kvm
./tools/scripts/rm_shm_coverage.sh
Execution: Run NecoFuzz with coverage monitoring for 48 hours.
# Terminal 1: Run the fuzzer
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
Note: The fuzzer runs indefinitely. Manually stop the process in Terminal 1 with Ctrl-C after 48 hours.
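Instead of watching the clock, you can bound the run with coreutils `timeout`. This is a sketch; it assumes afl-runner.sh terminates cleanly on SIGINT, just as it does on a manual Ctrl-C:

```shell
# Send SIGINT after 48 hours, as if Ctrl-C had been pressed in Terminal 1.
timeout --signal=INT 48h ./tools/scripts/afl-runner.sh
```

`timeout` exits with status 124 when the time limit is hit, which can be used in wrapper scripts to distinguish a timed-out run from a crash.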
Expected Results: Coverage data and a timeline showing NecoFuzz's progression. Coverage is expected to increase rapidly and then plateau around the 12-24 hour mark, while still achieving a higher final coverage than the baselines.
Objective: Establish a performance baseline by measuring the coverage achieved by Syzkaller, a state-of-the-art, general-purpose kernel fuzzer.
Preparation:
./scripts/build_syzkaller_linux.sh
./scripts/setup_syzkaller.sh
./scripts/test_syzkaller.sh
Execution: Run Syzkaller for 48 hours to ensure a fair comparison with NecoFuzz.
./scripts/run_syzkaller.sh out/syzkaller
# Coverage timeline will be generated in out/syzkaller/coverage_timeline.csv
Note: This script is configured for a long-duration run. Manually stop it with Ctrl-C after 48 hours.
Expected Results: A coverage timeline for Syzkaller, which is expected to show slower progression and a lower final coverage compared to NecoFuzz.
Objective: Measure the coverage of the official KVM developer test suite (selftests) to serve as a baseline.
Execution:
patch -p1 -d external/linux < patches/linux_selftests.patch
./scripts/run_kvm_selftests.sh out/kvm_selftests run
# Results will be in out/kvm_selftests/final_nested_coverage
Expected Results: Coverage data for Table 2. These developer-written tests are expected to show limited coverage of complex nested virtualization code paths.
Objective: Measure coverage achieved by kvm-unit-tests, another developer-centric test suite, for a comprehensive baseline comparison.
Execution:
./scripts/run_kvm-unit-tests.sh out/kvm_unit-tests
# Results will be in out/kvm_unit-tests/final_nested_coverage
Expected Results: Coverage data for Table 2, which is expected to demonstrate the limited scope of existing unit tests for nested virtualization.
Objective: Consolidate the results from E1-E4 to generate the final comparison figures and tables.
Execution:
# Install required Python packages
pip install pandas
pip install matplotlib
# Generate Syzkaller vs NecoFuzz coverage graph (Figure 3)
python3 ./scripts/generate_kvm_syzkaller_graph.py
# Results will be generated in artifact/fig3.png
# Run full coverage analysis (Table 2)
python3 ./scripts/generate_kvm_necofuzz_comparison.py
# Results will be generated in artifact/table2.csv
Expected Results:
- Figure 3: A coverage-over-time graph comparing NecoFuzz and Syzkaller, visually demonstrating that NecoFuzz achieves higher coverage faster.
- Table 2: A table with the final coverage numbers, quantitatively showing NecoFuzz's 1.4-2x improvement over all baseline methods.
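For a quick sanity check on the improvement factor reported above, divide NecoFuzz's final coverage by a baseline's final coverage. The counts below are hypothetical, not numbers from the paper:

```shell
# Hypothetical covered-line counts, NOT data from Table 2.
necofuzz=4200
baseline=2800
awk -v a="$necofuzz" -v b="$baseline" 'BEGIN { printf "%.1fx\n", a / b }'   # prints 1.5x
```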
Objective: Perform an ablation study to measure the individual contribution of each core NecoFuzz component by selectively disabling them.
Preparation:
For each run, copy the corresponding configuration file to config.yaml. The variants are: with all components (from E1), without the VM execution harness, without the VM state validator, without the vCPU configurator, and without any components (baseline fuzzer).
Execution: Run each of the four configurations below for 24 hours.
- Without ALL components (baseline):
# Terminal 1: Run the fuzzer
cp config/wo_all.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
- Without VM execution harness:
# Terminal 1: Run the fuzzer
cp config/wo_harness.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
- Without vCPU configurator:
# Terminal 1: Run the fuzzer
cp config/wo_vcpu_config.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
- Without VM state validator:
# Terminal 1: Run the fuzzer
cp config/wo_vmstate_validator.yaml config.yaml
./tools/scripts/rm_shm_coverage.sh
./tools/scripts/afl-runner.sh
# Terminal 2: Monitor coverage
./tools/scripts/monitor_record.sh
Note: Each afl-runner.sh command runs indefinitely. Monitor the time and manually stop each process with Ctrl-C after 24 hours. A coverage monitor can be run in a separate terminal for each experiment.
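The four variant runs above can be driven by a single loop. This is a sketch assuming coreutils `timeout` to bound each run at 24 hours; monitor_record.sh is still started separately for each run, and the config file names match those shipped under config/:

```shell
#!/bin/bash
# Run every ablation variant in sequence from the artifact_NecoFuzz root.
for cfg in wo_all wo_harness wo_vcpu_config wo_vmstate_validator; do
    cp "config/${cfg}.yaml" config.yaml     # select the variant
    ./tools/scripts/rm_shm_coverage.sh      # reset shared-memory coverage
    timeout --signal=INT 24h ./tools/scripts/afl-runner.sh
done
```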
- Generate Analysis:
python3 ./scripts/generate_kvm_necofuzz_componets_graph.py
# Results will be generated in artifact/fig4.png and artifact/table3.csv
Expected Results:
- Figure 4: A set of coverage-over-time graphs showing that the full version of NecoFuzz outperforms all variants, with the VM state validator having the largest positive impact on coverage.
- Table 3: A table of final coverage numbers demonstrating that all components contribute meaningfully.
Objective: Demonstrate the generalizability of the NecoFuzz approach by applying it to the Xen hypervisor.
Preparation - Xen Setup:
./scripts/build_linux_for_xen.sh
./scripts/build_xen
sudo update-grub
# Configure GRUB to boot the Xen entry (e.g., 'Ubuntu, with Xen hypervisor and Linux 6.5.0-xen')
sudo grub-reboot <xen-entry-name>
sudo reboot
# After reboot, enable Xen services
sudo update-rc.d xencommons defaults 19 18
sudo update-rc.d xendomains defaults 21 20
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/xen.conf
sudo ldconfig
# Verify the Xen version to confirm setup
sudo xl info
# Install jq for parsing JSON files
sudo apt install -y jq
Execution:
- NecoFuzz on Xen (24 hours):
# Modify config.yaml to target Xen
cp config/xen_default.yaml config.yaml
./tools/scripts/afl-runner.sh
Note: For this NecoFuzz experiment, monitor the runtime and stop the process with Ctrl-C after exactly 24 hours.
Automation available: To auto-resume fuzzing after host reboots under Xen, use scripts/run_xen_necofuzz.sh (setup below). Since the host may occasionally crash before the full 24-hour fuzzing period completes, we recommend enabling this automation.
Optional (Xen): Auto-resume after reboots
- Edit user crontab
crontab -e
Add the following entry (adjust XEN_GRUB_ENTRY and /full-path-to/ to your environment):
SHELL=/bin/bash
# Xen GRUB entry string (must match grub.cfg)
XEN_GRUB_ENTRY='Advanced options for Ubuntu GNU/Linux (with Xen hypervisor)>Xen hypervisor, version 4.18>Ubuntu GNU/Linux, with Xen 4.18 and Linux 6.5.0-xen+'
@reboot /full-path-to/artifact_NecoFuzz/scripts/run_xen_necofuzz.sh >> /full-path-to/artifact_NecoFuzz/necofuzz.log 2>&1
- Edit sudoers for passwordless sudo
sudo visudo
Add the following line (replace <user> with your username, and adjust /full-path-to/ and the xl path to match your system):
<user> ALL=(ALL) NOPASSWD: /usr/sbin/grub-reboot, /usr/sbin/reboot, /usr/local/sbin/xl, /full-path-to/artifact_NecoFuzz/tools/scripts/afl-runner.sh
- Operate
- After reboot, check necofuzz.log for:
[INFO] Xen detected, starting NecoFuzz...
[INFO] Using AFL_SEED=...
- To stop auto-resume for the next boots:
touch /var/tmp/xen_stop
- To re-enable:
rm -f /var/tmp/xen_stop
- Note
Outputs created while auto-resume is active may be owned by root. Adjust ownership if needed:
sudo chown -R <user>:<user> out/
- XTF Baseline (Xen Test Framework):
./scripts/run_xtf.sh out/xen_xtf
# This typically completes in under 1 hour
- Generate Analysis:
python3 ./scripts/generate_xen_necofuzz_comparison.py
# Results will be in artifact/table4.csv
Expected Results:
- Table 4: A final coverage comparison showing NecoFuzz is also highly effective on Xen's nested virtualization code, outperforming the native XTF baseline. This demonstrates the cross-hypervisor applicability of the NecoFuzz approach.