Cottontail is a concolic execution engine powered by large language models (LLMs), designed for highly structured test input generation. Presented at the IEEE Symposium on Security and Privacy (S&P) 2026, Cottontail advances the state of the art in automated software testing by integrating expressive program path representation, LLM-driven constraint solving, and history-guided seed acquisition.
(Fun fact: the Cottontail logo was generated by Google Gemini.)
- Expressive Coverage Tree (ECT): A high-level representation of structural program paths that supports structure-aware path constraint selection for comprehensive program analysis.
- LLM-Driven Constraint Solver: Solves path constraints to generate test inputs that are both syntactically valid and satisfy the selected path constraints.
- History-Guided Seed Acquisition: Efficiently discovers new, highly structured test inputs to maximize coverage, based on the naming convention of the functions.
- Sound Analysis: All test cases generated by the LLM are validated to ensure that the resulting behaviors are sound, consistent with the underlying principles of concolic execution.
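The validation idea in the last bullet can be illustrated with a toy sketch. This is not Cottontail's implementation: the candidate strings stand in for LLM output, and `validate_candidates` and `takes_target_path` are hypothetical names introduced only for this example. The sketch keeps a candidate only if it parses as JSON (syntactic validity) and satisfies a stand-in predicate for the target path constraint.

```python
import json


def validate_candidates(candidates, path_predicate):
    """Split candidates into those that parse as JSON and satisfy the
    path predicate (accepted) and everything else (rejected)."""
    accepted, rejected = [], []
    for text in candidates:
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            # Syntactically invalid input: never reaches the target path.
            rejected.append(text)
            continue
        if path_predicate(value):
            accepted.append(text)
        else:
            rejected.append(text)
    return accepted, rejected


# Hypothetical path constraint: the input is an object whose "a" is an array.
def takes_target_path(value):
    return isinstance(value, dict) and isinstance(value.get("a"), list)


# Stand-ins for LLM-generated test inputs (assumed, not real model output).
candidates = ['{"a": [1, 2]}', '{"a": 1}', '{"a": [1, 2']
accepted, rejected = validate_candidates(candidates, takes_target_path)
print(accepted)
print(rejected)
```

In a real concolic engine the check would come from re-executing the instrumented target on each candidate rather than from a Python predicate; the predicate here only keeps the example self-contained.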
Read the Paper
.
├── LICENSE # Project license (GPLv3)
├── README.md # Project documentation and usage
├── Dockerfile # Docker build instructions for Cottontail
├── docker-compose.yml # Docker Compose configuration (if multi-container)
├── benchmark-test/ # Example benchmarks and test scripts
│   ├── build-json-c.sh # Script to build the JSON-C benchmark
│   └── input/ # Input seeds for benchmarks
├── config/
│   ├── config.ini # Main configuration file
│   └── config.docker.ini # Docker-specific configuration
├── cottontail-compiler/
│   ├── build-cottontail-compiler.sh # Build script for native environment
│   ├── build-cottontail-compiler-docker.sh # Build script for Docker
│   ├── CMakeLists.txt # CMake build configuration
│   ├── compiler/ # Compiler source code
│   ├── docs/ # Compiler documentation
│   ├── runtime/ # Runtime libraries for compiler
│   ├── test/ # Compiler test cases
│   └── util/ # Compiler utilities
├── docs/
│   ├── cottontail-logo-focus.png # Project logo (focus version)
│   ├── cottontail-logo.png # Project logo
│   └── overview.png # System design/architecture diagram
├── scripts/
│   ├── collect_coverage.sh # Script to collect code coverage
│   ├── prepare.sh # Install dependencies and prepare environment
│   └── run-cottontail.py # Main entry point for concolic testing
- You may choose between two approaches (Docker or Native Env) to run Cottontail.
To build the Docker image:
docker build -t cottontail:latest .
To run the Docker container and get an interactive bash shell (for debugging or manual commands):
docker run -it cottontail:latest bash
Replace YOUR_OPENAI_API_KEY_HERE with your key in the config.ini file, and you are ready to go:
python3 run-cottontail.py
After a few minutes, you will see output similar to the expected output shown below in 5. Launch Concolic Testing.
The Cottontail prototype runs in the following environment:
- Ubuntu 18.04
- LLVM-11.0.0
- Python 3.9.16
- GCC-7.5
- Gcovr 8.3
Run the provided script to install all required dependencies (feel free to comment out packages such as z3 and llvm-11 if they are already installed on your system):
$ ./scripts/prepare.sh
Compile the customized cottontail-cc compiler:
$ cd cottontail-compiler
$ ./build-cottontail-compiler.sh
Navigate to the benchmark directory and build the example:
$ cd benchmark-test
$ ./build-json-c.sh
Edit config.ini to specify your GPT/DeepSeek model, API key, and the testing folder:
gpt_model = gpt-4o-mini
api_key = YOUR_OPENAI_API_KEY_HERE
mainDir = xx/benchmark-test
Example configuration:
[common-settings]
format = JSON
llm_model = gpt-4o-mini
api_key = sk-xxx
[gcov-locations]
mainDir = xx/benchmark-test
gcovDir = %(maindir)s/json-c/build-gcov/apps/
sourceDir = %(maindir)s/json-c/
recordFile = %(maindir)s/json-c/build-gcov/all_records.txt
[running-locations]
mainDir = xx/benchmark-test
inputDir = %(maindir)s/input
outputDir = %(maindir)s/output
failedDir = %(maindir)s/failed-cases
[running-targets]
mainDir = xx/benchmark-test
cottontailTarget = %(maindir)s/json-c/build-cottontail/apps/json_parse
gcovTarget = %(maindir)s/json-c/build-gcov/apps/json_parse
[running-params]
timeout = 43200 // running timeout
cov_timeout = 60 // interval timeout for coverage collection
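To show how the `%(maindir)s` references and the trailing `//` comments in a file like this can be resolved, here is a minimal sketch using Python's standard `configparser`. How `run-cottontail.py` actually loads its configuration is not shown here, so treating `//` as an inline comment prefix is an assumption, as are the `/tmp/benchmark-test` path and the `load_config` helper name.

```python
import configparser

# Trimmed copy of the example configuration; the mainDir value is a
# placeholder path chosen for this sketch.
SAMPLE = """
[common-settings]
format = JSON
llm_model = gpt-4o-mini

[running-locations]
mainDir = /tmp/benchmark-test
inputDir = %(maindir)s/input
outputDir = %(maindir)s/output

[running-params]
timeout = 43200 // running timeout
cov_timeout = 60 // interval timeout for coverage collection
"""


def load_config(text):
    """Parse INI text, treating ' //' as the start of an inline comment
    (an assumption about how the real script handles these lines)."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("//",))
    parser.read_string(text)
    return parser


cfg = load_config(SAMPLE)
# Option names are case-insensitive, so %(maindir)s resolves to mainDir
# within the same section.
input_dir = cfg.get("running-locations", "inputDir")
timeout = cfg.getint("running-params", "timeout")
print(input_dir, timeout)
```

Because interpolation is resolved per section, each section that uses `%(maindir)s` needs its own `mainDir` entry, which matches the repeated `mainDir` lines in the example above.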
You are ready to go. Start the concolic testing process by running:
$ pwd
xxx/benchmark-test
$ python run-cottontail.py
The expected output would be:
Recreated folder: output
Removed *.gcda files in: benchmark-test/json-c
Constructed path: benchmark-test/json-c/build-gcov/apps/all_records.txt
Importing benchmark-test/input/seed from the input directory
++++++++++++Generation 0 ++++++++++++
Running on /tmp/tmpvfngz8zh/cur/seed
GPT solving running
Now concolic executing /tmp/tmpvfngz8zh/cur/seed
Content:
/tmp/tmpvfngz8zh/cur/seed
model : gpt-4o-mini
Keys in path-constraints-expr.json have been simplified and saved.
total_pc_cnt = 101
total_pc_cnt = 101
+++ LOG: {'executed file': '/tmp/tmp_nk9rl2i/cur/seed', 'duration': 0.018297195434570312, 'gpt_invocation_cnt': '1', 'gpt_solving_time': '1.192103624343872', 'gpt_seed_generation_cnt': '0', 'gpt_history_update_cnt': '0', 'total_pc_cnt': '101', 'gpt_generated_testcases_cnt': '1', 'gpt_missed_testcases_cnt': '0', 'valid_solving_cnt': '1', 'invalid_solving_cnt': '0', 'is_not_interesting_cnt': '0', 'prompt_token_cnt': '1316'}
+++ LOG: {'executed file': '/tmp/tmp_nk9rl2i/cur/seed', 'duration': 11.306709051132202, 'gpt_invocation_cnt': '2', 'gpt_solving_time': '0.7505309581756592', 'gpt_seed_generation_cnt': '0', 'gpt_history_update_cnt': '0', 'total_pc_cnt': '101', 'gpt_generated_testcases_cnt': '2', 'gpt_missed_testcases_cnt': '0', 'valid_solving_cnt': '1', 'invalid_solving_cnt': '1', 'is_not_interesting_cnt': '0', 'prompt_token_cnt': '2684'}
...
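The `+++ LOG:` lines above are printed as Python dictionary literals, so they can be post-processed with a few lines of Python. The sketch below is one possible way to do this, not part of Cottontail: `parse_log_lines` and `valid_solving_ratio` are hypothetical helpers, and the sample line is a shortened copy of the second log entry.

```python
import ast

LOG_PREFIX = "+++ LOG: "


def parse_log_lines(lines):
    """Return the dict payload of every line that starts with '+++ LOG: '."""
    records = []
    for line in lines:
        line = line.strip()
        if line.startswith(LOG_PREFIX):
            # literal_eval only accepts Python literals, so it is safe
            # to use on untrusted log text.
            records.append(ast.literal_eval(line[len(LOG_PREFIX):]))
    return records


def valid_solving_ratio(record):
    """Fraction of solving attempts counted as valid in one record."""
    valid = int(record["valid_solving_cnt"])
    invalid = int(record["invalid_solving_cnt"])
    total = valid + invalid
    return valid / total if total else 0.0


# Shortened sample based on the output above (most fields omitted).
sample = [
    "total_pc_cnt = 101",
    "+++ LOG: {'executed file': '/tmp/tmp_nk9rl2i/cur/seed', "
    "'gpt_invocation_cnt': '2', 'valid_solving_cnt': '1', "
    "'invalid_solving_cnt': '1'}",
]
records = parse_log_lines(sample)
ratio = valid_solving_ratio(records[-1])
print(ratio)
```

Note that the counters are logged as strings, which is why the helper converts them with `int` before doing arithmetic.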
After a few minutes, you will see the coverage report:
$ cat benchmark-test/json-c/build-gcov/apps/all_records.txt
lines: 17.6% (817 out of 4640) functions: 28.2% (85 out of 301) branches: 14.6% (403 out of 2759) Fri Nov 28 11:15:18 +08 2025
lines: 18.3% (847 out of 4640) functions: 28.2% (85 out of 301) branches: 15.5% (428 out of 2759) Fri Nov 28 11:16:21 +08 2025
lines: 19.6% (909 out of 4640) functions: 30.9% (93 out of 301) branches: 16.7% (460 out of 2759) Fri Nov 28 11:16:43 +08 2025
lines: 19.6% (909 out of 4640) functions: 30.9% (93 out of 301) branches: 16.7% (461 out of 2759) Fri Nov 28 11:17:24 +08 2025
lines: 19.6% (911 out of 4640) functions: 30.9% (93 out of 301) branches: 16.9% (465 out of 2759) Fri Nov 28 11:18:27 +08 2025
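Each record in `all_records.txt` follows the same `lines / functions / branches` pattern, so coverage growth over a run can be summarized with a small parser. The following is a rough sketch under that assumption; `parse_records` is a hypothetical helper, and the regular expression only targets the format shown above.

```python
import re

# Matches the three percentage fields of one coverage record.
RECORD_RE = re.compile(
    r"lines:\s*(?P<lines>[\d.]+)%.*?"
    r"functions:\s*(?P<functions>[\d.]+)%.*?"
    r"branches:\s*(?P<branches>[\d.]+)%"
)


def parse_records(text):
    """Return one dict of float percentages per matching record line."""
    rows = []
    for line in text.splitlines():
        match = RECORD_RE.search(line)
        if match:
            rows.append({k: float(v) for k, v in match.groupdict().items()})
    return rows


# First and last records copied from the report above.
sample = (
    "lines: 17.6% (817 out of 4640) functions: 28.2% (85 out of 301) "
    "branches: 14.6% (403 out of 2759) Fri Nov 28 11:15:18 +08 2025\n"
    "lines: 19.6% (911 out of 4640) functions: 30.9% (93 out of 301) "
    "branches: 16.9% (465 out of 2759) Fri Nov 28 11:18:27 +08 2025"
)

rows = parse_records(sample)
branch_gain = rows[-1]["branches"] - rows[0]["branches"]
print(round(branch_gain, 1))
```

To analyze a real run, the file contents can be read with `open(path).read()` and passed to `parse_records` in place of `sample`.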
Please check benchmarks for other subjects used in the paper.
We gratefully acknowledge the creators of SymCC, MuJS, and QuickJS for their foundational work and swift responses to our feedback. We also thank the anonymous reviewers for their valuable insights.
Cottontail is released under the GNU General Public License v3.0, following SymCC.
You are free to copy, modify, and distribute this software under the terms of the GPLv3.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the LICENSE file for the full text.
If you use Cottontail in your research, please cite:
@inproceedings{cottontail-sp26,
author={Tu, Haoxin and Lee, Seongmin and Li, Yuxian and Chen, Peng and Jiang, Lingxiao and Böhme, Marcel},
title={{Cottontail: Large Language Model-Driven Concolic Execution for Highly Structured Test Input Generation}},
booktitle={2026 IEEE Symposium on Security and Privacy (SP)},
ISSN = {2375-1207},
pages = {2064-2082},
year={2026},
doi = {10.1109/SP63933.2026.00110},
url = {https://doi.ieeecomputersociety.org/10.1109/SP63933.2026.00110},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
}
If you have any questions about the current project or want to collaborate in the future, please file issues or reach out to haoxintu@gmail.com.

