Protocol-Aware DSL-Mediated UVM Testbench Generation
HAVEN generates complete, compilable UVM testbenches from a hardware specification alone. Its key insight: a Protocol-Aware Sequence DSL serves as the principled boundary between LLM-generated intent and deterministic code generation. LLMs produce structured JSON; safety-filtered codegen produces protocol-correct SystemVerilog.
| Path | Tool | Role | Components |
|---|---|---|---|
| Structural | Jinja2 Templates | Deterministic UVM structure | 10/12 components (incl. protocol drivers, monitors, BFMs) |
| Structural | DSL Codegen | Safety-filtered sequence generation | 7 step types → SystemVerilog |
| Structural | VCS + URG | Compile-check, simulation, coverage | Phases 5–7 |
| Semantic | LLM (gpt-5.2-codex) | Spec extraction, seq_item, subscriber, DSL JSON | Phases 0–4B |
| Structural | Bayesian Optimization | Constraint parameter tuning | Phase 7 distribution gaps |
| Structural | VC Formal | Dead code proof + exclusion | Phase 7 proven_dead |
The DSL JSON → Codegen boundary (Phase 4B → 4C) is the architectural firewall: LLMs never emit raw protocol-level SystemVerilog.
Stage 1 — Generation (SPEC only, no DUT)      Stage 2 — Refinement (DUT required)
┌──────────────────────────────────────────┐  ┌────────────────────────────────────┐
│ Phase 0:  Config Auto-Extraction    (LLM)│  │ Phase 6: Simulation & Repair       │
│ Phase 1:  Spec Processing           (LLM)│  │          (VCS compile+sim+fix)     │
│ Phase 2:  Architecture Planning     (LLM)│  │ Phase 7: Coverage Improvement      │
│ Phase 2B: Protocol Flow Extraction  (LLM)│  │          (BO + VC Formal + LLM)    │
│ Phase 3:  Testbench Generation           │  └────────────────────────────────────┘
│           (Template 10 + LLM 2)          │
│ Phase 4B: DSL Sequence Gen     (LLM→JSON)│
│ Phase 4C: DSL Codegen          (JSON→SV) │
│ Phase 5:  Compile-Check & Fix      (VCS) │
└──────────────────────────────────────────┘
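The phase ordering in the diagram can be written down as a plain list, which makes the effect of `--until` and `--from` easier to see. This is a minimal sketch, not HAVEN's actual orchestration code: the helper names `select_phases` and `needs_dut`, and the idea of modeling a resume as a set of completed phases, are assumptions made for illustration.

```python
# Phase identifiers in pipeline order, taken from the diagram.
PHASES = ["0", "1", "2", "2B", "3", "4B", "4C", "5", "6", "7"]

# Phases 6 and 7 form Stage 2 and require the DUT.
STAGE2 = {"6", "7"}


def select_phases(until="5", completed=()):
    """Return the phases still to run, in order (hypothetical helper).

    until:     last phase to execute, mirroring the --until flag.
    completed: phases already finished by an earlier run; this is an
               assumed way to model a --from resume.
    """
    if until not in PHASES:
        raise ValueError(f"unknown phase: {until!r}")
    last = PHASES.index(until)
    done = set(completed)
    return [p for p in PHASES[: last + 1] if p not in done]


def needs_dut(phases):
    """True if any selected phase belongs to Stage 2."""
    return any(p in STAGE2 for p in phases)


print(select_phases())                                  # Stage 1 only
print(select_phases(until="0"))                         # config extraction only
print(select_phases(until="7", completed=PHASES[:8]))   # resume into Stage 2
```

The default of `until="5"` mirrors the Stage 1 command, which stops after the compile check when no `--until` is given.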
# Install
uv sync
# Set OpenAI API key
echo "OPENAI_API_KEY=sk-..." > .env
# Stage 1 — generate testbench from spec (Phase 0-5)
uv run haven hdl/alu
# Phase 0 only — auto-generate config
uv run haven hdl/alu --until 0
# Multiple designs
uv run haven hdl/aes hdl/spi hdl/uart
# Stage 2 — full pipeline with simulation (requires VCS)
uv run haven hdl/alu --until 7
# Resume from previous run
uv run haven --from output/20260227_001234_aes_core --until 7

Best-of-3 total coverage per design (full pipeline, Phase 0–7):
| Protocol | Design | Total | Line | Cond | Tgl | Br | FSM |
|---|---|---|---|---|---|---|---|
| Direct | ALU | 94.5% | 95.2 | 93.4 | 97.1 | 92.3 | — |
| Direct | AES | 96.5% | 100.0 | 100.0 | 99.6 | 99.6 | 83.3 |
| Direct | SHA3 | 97.5% | 100.0 | 95.8 | 96.8 | 97.6 | — |
| Wishbone | SPI | 94.0% | 93.5 | 93.2 | 94.5 | 88.6 | 100.0 |
| Wishbone | Simple SPI | 88.5% | 93.0 | 93.0 | 82.5 | 90.8 | 83.3 |
| Wishbone | UART | 90.0% | 97.3 | 86.9 | 95.5 | 91.9 | 78.6 |
| Wishbone | I2C | 75.2% | 86.8 | 58.3 | 83.2 | 81.2 | 66.7 |
| Wishbone | GPIO | 83.3% | 99.4 | 71.5 | 87.1 | 75.2 | — |
| Wishbone | CAN | 88.8% | 93.2 | 81.4 | 89.9 | 90.8 | — |
| Wishbone | ETHMAC | 68.2% | 76.7 | 72.7 | 50.0 | 73.5 | — |
| Wishbone | SDRAM | 70.0% | 83.2 | 67.6 | 70.8 | 75.5 | 52.7 |
| AXI4-Lite | AXIL RAM | 82.0% | 100.0 | 58.1 | 78.3 | 91.7 | — |
| AXI4-Lite | UE GPIO | 94.2% | 98.3 | 86.4 | 97.5 | 94.5 | — |
| AXI4-Lite | UE SPI | 90.9% | 98.0 | 89.7 | 79.8 | 95.9 | — |
| AXI4-Lite | UE Timer | 93.8% | 98.4 | 83.3 | 98.5 | 95.1 | — |
| AXI4-Lite | UE UART | 83.8% | 94.3 | 83.0 | 71.8 | 86.1 | — |
Average: 87.0% | 8/16 > 90% | 100% compile success (48/48 runs)
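The Total column appears to be the unweighted mean of whichever metrics are reported for each design. That is an observation from the numbers in the table, not a documented formula. The sketch below recomputes the totals and the summary line under that assumption; the names `ROWS` and `mean_of_reported` are illustrative.

```python
# Reported total, then [line, cond, toggle, branch, FSM] copied from the table.
# None marks a metric the table reports as "—".
ROWS = {
    "ALU":        (94.5, [95.2, 93.4, 97.1, 92.3, None]),
    "AES":        (96.5, [100.0, 100.0, 99.6, 99.6, 83.3]),
    "SHA3":       (97.5, [100.0, 95.8, 96.8, 97.6, None]),
    "SPI":        (94.0, [93.5, 93.2, 94.5, 88.6, 100.0]),
    "Simple SPI": (88.5, [93.0, 93.0, 82.5, 90.8, 83.3]),
    "UART":       (90.0, [97.3, 86.9, 95.5, 91.9, 78.6]),
    "I2C":        (75.2, [86.8, 58.3, 83.2, 81.2, 66.7]),
    "GPIO":       (83.3, [99.4, 71.5, 87.1, 75.2, None]),
    "CAN":        (88.8, [93.2, 81.4, 89.9, 90.8, None]),
    "ETHMAC":     (68.2, [76.7, 72.7, 50.0, 73.5, None]),
    "SDRAM":      (70.0, [83.2, 67.6, 70.8, 75.5, 52.7]),
    "AXIL RAM":   (82.0, [100.0, 58.1, 78.3, 91.7, None]),
    "UE GPIO":    (94.2, [98.3, 86.4, 97.5, 94.5, None]),
    "UE SPI":     (90.9, [98.0, 89.7, 79.8, 95.9, None]),
    "UE Timer":   (93.8, [98.4, 83.3, 98.5, 95.1, None]),
    "UE UART":    (83.8, [94.3, 83.0, 71.8, 86.1, None]),
}


def mean_of_reported(metrics):
    """Average only the metrics that are present for a design."""
    values = [m for m in metrics if m is not None]
    return sum(values) / len(values)


# Recompute each total from its per-metric values (unrounded).
totals = {name: mean_of_reported(metrics) for name, (_, metrics) in ROWS.items()}

# Largest gap between a recomputed total and the one printed in the table.
max_deviation = max(abs(totals[name] - reported) for name, (reported, _) in ROWS.items())

average = sum(totals.values()) / len(totals)
above_90 = sum(1 for t in totals.values() if t > 90.0)

print(f"max deviation from table: {max_deviation:.3f}")
print(f"average: {average:.1f}%  above 90%: {above_90}/{len(totals)}")
```

Under this reading, UART is listed as 90.0% but the mean of its five metrics is about 90.04, which would explain why eight designs count as above 90%.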
The DSL is the flagship architectural contribution. LLMs produce sequence logic as JSON with 7 step types:
| Step Type | Purpose | Example |
|---|---|---|
| `register_write` | Write value to register address | Configure SPI control register |
| `register_read` | Read register, store result | Read status register |
| `poll` | Repeated read until condition | Wait for transfer complete |
| `randomize_send` | Constrained random stimulus | Random data with address constraints |
| `delay` | NOP cycles | Wait for internal processing |
| `memory_write` | Bulk transfer via BFM backdoor | Initialize SDRAM contents |
| `bfm_action` | Slave-side BFM configuration | Set SPI slave response data |
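To make the JSON-to-SystemVerilog step concrete, here is a minimal sketch of how a deterministic code generator might translate a few of these step types. It is not the contents of `dsl/codegen.py`: the JSON field names (`type`, `addr`, `value`, `cycles`, `dest`), the emitted task names (`write_reg`, `read_reg`), and the `vif.clk` handle are all hypothetical, and only three of the seven step types are covered.

```python
import json

# The seven step types listed in the table.
STEP_TYPES = {
    "register_write", "register_read", "poll", "randomize_send",
    "delay", "memory_write", "bfm_action",
}


def emit_step(step):
    """Translate one DSL step into a line of SystemVerilog text.

    Field names and emitted task names are hypothetical placeholders.
    """
    kind = step.get("type")
    if kind not in STEP_TYPES:
        # Anything outside the DSL vocabulary is rejected, so free-form
        # SystemVerilog from the LLM cannot pass through this function.
        raise ValueError(f"unknown step type: {kind!r}")
    if kind == "register_write":
        return f"write_reg(32'h{step['addr']:08x}, 32'h{step['value']:08x});"
    if kind == "register_read":
        return f"read_reg(32'h{step['addr']:08x}, {step['dest']});"
    if kind == "delay":
        return f"repeat ({step['cycles']}) @(posedge vif.clk);"
    raise NotImplementedError(f"{kind} is not covered by this sketch")


def emit_sequence(dsl_json):
    """Parse DSL JSON text and return the generated body lines."""
    doc = json.loads(dsl_json)
    return [emit_step(s) for s in doc["steps"]]


example = json.dumps({"steps": [
    {"type": "register_write", "addr": 0x10, "value": 0x1},
    {"type": "delay", "cycles": 4},
    {"type": "register_read", "addr": 0x14, "dest": "status"},
]})

for line in emit_sequence(example):
    print(line)
```

Because the generator only pattern-matches on a closed set of step types, the protocol-level text it emits is fixed by the generator rather than by the model.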
Safety filters in codegen (Phase 4C):
- Drops constraints on non-rand/output fields (prevents VCS `CNST-XZSVE`)
- Validates field references against the `seq_item` declaration
- Hoists `init_steps` for sequence self-containment
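The filters listed above can be sketched as two small functions. This is one plausible reading, not the actual Phase 4C code: the constraint shape (`field`, `expr`), the `seq_item_fields` mapping, and the choice to drop invalid references rather than raise an error are assumptions.

```python
def filter_constraints(constraints, seq_item_fields):
    """Split constraints into those safe to emit and those to drop.

    constraints:     list of {"field": str, "expr": str} dicts (hypothetical shape).
    seq_item_fields: maps each declared field name to True if it is `rand`.
    Returns (kept, dropped); dropped pairs each constraint with a reason.
    """
    kept, dropped = [], []
    for c in constraints:
        name = c["field"]
        if name not in seq_item_fields:
            dropped.append((c, "unknown field"))      # not declared in seq_item
        elif not seq_item_fields[name]:
            dropped.append((c, "field is not rand"))  # e.g. a DUT output
        else:
            kept.append(c)
    return kept, dropped


def hoist_init_steps(sequence):
    """Prepend init_steps to steps so the sequence can run on its own."""
    steps = list(sequence.get("init_steps", [])) + list(sequence["steps"])
    return {"name": sequence["name"], "steps": steps}


# Hypothetical seq_item: addr and wdata are rand, rdata is a sampled output.
fields = {"addr": True, "wdata": True, "rdata": False}
constraints = [
    {"field": "addr", "expr": "addr inside {[0:15]}"},
    {"field": "rdata", "expr": "rdata == 0"},
    {"field": "burst_len", "expr": "burst_len < 4"},
]
kept, dropped = filter_constraints(constraints, fields)
print([c["field"] for c in kept])
print([(c["field"], reason) for c, reason in dropped])

seq = {"name": "spi_xfer",
       "init_steps": [{"type": "register_write"}],
       "steps": [{"type": "poll"}]}
print([s["type"] for s in hoist_init_steps(seq)["steps"]])
```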
16 designs from OpenCores and Ultra-Embedded, grouped by bus protocol:
| Protocol | Designs | LOC Range | BFMs |
|---|---|---|---|
| Direct | ALU, AES, SHA3 | 180–586 | None |
| Wishbone | SPI, Simple SPI, UART, I2C, GPIO, CAN, ETHMAC, SDRAM | 352–11,154 | spi_slave, i2c_slave, mii_phy, sdram_model, wishbone_slave |
| AXI4-Lite | AXIL RAM, UE GPIO, UE SPI, UE Timer, UE UART | 150–800 | None |
Each design includes DPI-C reference models for scoreboard verification.
haven/
├── src/haven/ # Python package
│ ├── main.py # CLI entry point + pipeline orchestration
│ ├── graph/ # LangGraph state machine
│ │ ├── task_graph.py # Phase nodes + graph construction
│ │ ├── state.py # Pipeline state definition
│ │ └── router.py # Coverage iteration routing
│ ├── phases/ # Phase implementations
│ │ ├── config_generator.py # Phase 0: LLM spec extraction → config
│ │ ├── spec_processor.py # Phase 1: spec → structured JSON
│ │ ├── arch_planner.py # Phase 2: structured spec → UVM blueprint
│ │ ├── protocol_flow_extractor.py # Phase 2B: protocol flows + BFM configs
│ │ ├── testbench_generator.py # Phase 3: blueprint → UVM components
│ │ ├── dsl_generator.py # Phase 4B: operation flows → DSL JSON
│ │ ├── coverage_improver.py # Phase 7: gap classification + improvement
│ │ └── bo_optimizer.py # Bayesian Optimization engine
│ ├── dsl/ # Protocol-aware sequence DSL
│ │ ├── schema.py # 7 step types + BFMConfig
│ │ └── codegen.py # DSL JSON → SystemVerilog codegen
│ ├── templates/ # Jinja2 UVM templates (18 .sv.j2 files)
│ │ ├── driver_wishbone_master.sv.j2 # Wishbone driver template
│ │ ├── driver_direct.sv.j2 # Direct-drive driver template
│ │ ├── monitor_wishbone.sv.j2 # Wishbone monitor
│ │ ├── monitor_direct.sv.j2 # Direct-drive monitor
│ │ ├── bfm_*.sv.j2 # 5 BFM templates
│ │ └── ... # + interface, scoreboard, env, test, etc.
│ ├── eda/ # EDA tool wrappers (VCS, URG, VC Formal)
│ └── utils/ # Shared utilities
│
├── hdl/ # Benchmark suite (16 designs)
│ ├── {design}/rtl/ # Verilog source files
│ ├── {design}/spec/ # spec.md, impl_spec.md, vspec.md
│ └── {design}/tb/ # DPI-C reference model
│
├── output/ # Generated testbenches (timestamped)
├── haven.json # Global config (LLM, simulation, BO)
└── paper/ # ICCAD 2026 paper
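The Phase 7 entries (`coverage_improver.py`, `bo_optimizer.py`) together with the tool table suggest that each coverage gap is classified and then routed to a different tool. The sketch below shows that routing idea only. The gap classes `distribution` and `proven_dead` come from the table; the `other` class, the strategy names, and the gap record shape are assumptions.

```python
from collections import defaultdict

# Gap class -> improvement strategy. The fallback to LLM-generated
# sequences for unclassified gaps is an assumption of this sketch.
STRATEGY = {
    "distribution": "bayesian_optimization",  # tune constraint parameters
    "proven_dead": "exclude_from_coverage",   # proven unreachable by formal
    "other": "llm_sequence_generation",
}


def plan_improvements(gaps):
    """Group gap names by the strategy that should handle them."""
    plan = defaultdict(list)
    for gap in gaps:
        strategy = STRATEGY.get(gap["class"], STRATEGY["other"])
        plan[strategy].append(gap["name"])
    return dict(plan)


# Hypothetical gap records for illustration.
gaps = [
    {"name": "cg_addr.bin_high", "class": "distribution"},
    {"name": "fsm.ERR->IDLE", "class": "proven_dead"},
    {"name": "irq_path", "class": "protocol"},
]
print(plan_improvements(gaps))
```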
Phase 0 auto-extracts per-design settings (clock, reset, static_signals, extra_resets, BFM configs) from the spec using the LLM. The only manual override typically needed is the DPI-C reference model path in hdl/{design}/haven.json.
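One way to picture the manual override is as a recursive merge of the per-design file over the auto-extracted settings. The sketch below assumes that behavior. The helper names, the `reference_model` key, and the example values are hypothetical, and the real merge rules may differ.

```python
import json
from pathlib import Path


def deep_merge(base, override):
    """Return a copy of base updated with override, recursing into dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_design_config(extracted, design_dir):
    """Overlay <design_dir>/haven.json, if present, on the Phase 0 output."""
    override_path = Path(design_dir) / "haven.json"
    if override_path.exists():
        override = json.loads(override_path.read_text())
        return deep_merge(extracted, override)
    return dict(extracted)


# Hypothetical Phase 0 output and manual override.
extracted = {"clock": "clk", "reset": {"name": "rst_n", "active_low": True}}
override = {"reset": {"active_low": False}, "reference_model": "tb/alu_ref.c"}
print(deep_merge(extracted, override))
```

Merging recursively means an override can change a single nested setting without restating the rest of the extracted config.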
Global config (haven.json):
{
"llm": {
"planning_model": "gpt-5.2",
"coding_model": "gpt-5.2-codex",
"temperature": 0.3
},
"simulation": {
"tool": "vcs",
"sim_retry_limit": 5,
"cov_retry_limit": 3,
"timeout_seconds": 300
},
"bayesian_opt": {
"max_iterations": 30,
"acquisition_function": "EI"
}
}

- VCS (Synopsys): simulation + coverage + URG
- VC Formal (Synopsys, optional): dead code detection via FCA
- Shell: `tcsh` for EDA tool invocation

source /usr/cad/synopsys/CIC/vcs.cshrc

- Python 3.12+
- `langgraph`, `langchain-openai` — LLM pipeline
- `scikit-optimize` — Bayesian Optimization
- `jinja2` — UVM template rendering
- `python-dotenv` — Environment config
MIT