A MIPS32 CPU with out-of-order execution, branch prediction, and L1 cache, implemented in SystemVerilog.
- Out-of-Order Execution
  - Tomasulo-style reservation stations
  - Reorder buffer (ROB) for in-order commit
  - Register renaming with physical register file
  - Common data bus (CDB) for result broadcast
- Branch Prediction
  - Branch target buffer (BTB)
  - 2-bit saturating counter predictor
  - Return address stack (RAS)
- Memory Subsystem
  - L1 instruction cache
  - L1 data cache (write-back)
  - Load/store queues with memory disambiguation
  - Store-to-load forwarding
- Recovery
  - Precise exception handling
  - Branch misprediction recovery
  - Memory violation recovery
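
As an illustration of the 2-bit saturating counter predictor listed above, here is a minimal behavioral sketch in Python. It is not the SystemVerilog implementation: the class and function names are hypothetical, and the reset state (weakly not-taken) is an assumption.

```python
class TwoBitCounter:
    """2-bit saturating counter: states 0-1 predict not-taken, 2-3 predict taken."""

    def __init__(self, state=1):
        # Assumed reset value: 1 (weakly not-taken). The RTL may differ.
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)


def mispredictions(outcomes, state=1):
    """Count mispredictions for a sequence of outcomes of one branch."""
    ctr = TwoBitCounter(state)
    misses = 0
    for taken in outcomes:
        if ctr.predict() != taken:
            misses += 1
        ctr.update(taken)
    return misses
```

For a loop branch that is taken nine times and then falls through, run twice, this model mispredicts the very first iteration and each loop exit. The single not-taken outcome at the exit only moves the counter from 3 to 2, so the next run of the loop is still predicted taken.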
On macOS, install the required toolchains using Homebrew:

```sh
brew install verilator llvm@17
echo 'export PATH="/opt/homebrew/opt/llvm@17/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
```

For a better development experience, add the following paths to `includePath` in VSCode (paths may vary based on your installed versions):

```
/opt/homebrew/Cellar/verilator/<version>/share/verilator/include
/opt/homebrew/Cellar/verilator/<version>/share/verilator/include/vltstd
```

On Linux, install the required toolchains with apt:

```sh
sudo apt install verilator llvm-17
sudo apt install gcc-mips-linux-gnu binutils-mips-linux-gnu
```

Build with:

```sh
make all
```

Navigate to `./tests` and run:

```sh
cd tests && make all
```

The test Makefiles automatically detect the platform and use the appropriate toolchain:
- macOS: Uses LLVM/Clang for cross-compilation
- Linux: Uses GCC MIPS cross-compiler
Run all tests with:

```sh
make test
```

Expected output:

```
[OK] TEST1-JUMP
[OK] TEST2-BITWISE
[OK] TEST3-IMM
[OK] TEST4-OPs
[OK] TEST5-LOAD_STORE
[OK] TEST6-BRANCH
[OK] TEST7-LUI
[OK] TEST8-QSORT
[OK] TEST9-OOO_DEPS
[OK] TEST10-OOO_MEM
[OK] TEST11-OOO_BRANCH
[OK] TEST12-MATRIX_MULT
[OK] TEST13-PRIME_SIEVE
[OK] TEST14-FIB_MEMO
[OK] TEST15-BINARY_SEARCH
[OK] TEST16-LINKED_LIST
ACCEPTED.
```

The test suite includes:
- Unit tests (1-7): Basic instruction verification
- Algorithm tests (8, 12-16): Complex workloads including quicksort, matrix multiplication, prime sieve, Fibonacci, binary search, and linked list operations
- OoO-specific tests (9-11): Dependency chains, memory disambiguation, and branch prediction
The test harness reports branch prediction statistics and cycle counts for each test.
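
If you want to check the test output in a script, the following sketch summarizes result lines of the form shown in the expected output. It assumes every result line looks like `[STATUS] TESTn-NAME` and treats any status other than `OK` as a failure. The harness's actual failure tag is not documented here, so the `[FAIL]` tag in the usage note is hypothetical, as is the `summarize` helper.

```python
import re

# Matches lines such as "[OK] TEST4-OPs". The status word is captured so that
# anything other than "OK" can be counted as a failure (an assumption).
RESULT_LINE = re.compile(r"^\[(\w+)\]\s+(TEST\d+-\S+)$")


def summarize(log_text):
    """Return (passed, failed) lists of test names found in harness output."""
    passed, failed = [], []
    for line in log_text.splitlines():
        m = RESULT_LINE.match(line.strip())
        if not m:
            continue  # skip statistics, cycle counts, and the final verdict
        status, name = m.groups()
        (passed if status == "OK" else failed).append(name)
    return passed, failed
```

For example, `summarize("[OK] TEST1-JUMP\n[FAIL] TEST2-BITWISE\n")` returns `(["TEST1-JUMP"], ["TEST2-BITWISE"])`.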
To debug the CPU step by step:

```sh
make run
```

Debug commands:

- `n` - Execute next instruction
- `r` - Run until halt or breakpoint
- `b 0x<addr>` - Set breakpoint
- `p` - Print registers
- `q` - Quit

You can change the target image in the Makefile under the root folder.
The CPU implements a superscalar out-of-order pipeline with the following stages:
- Fetch - Instruction fetch with branch prediction (BTB + RAS)
- Decode - Instruction decode and register renaming
- Issue - Dynamic scheduling via reservation stations
- Execute - Out-of-order execution with multiple functional units
- Memory - Load/store with disambiguation
- Commit - In-order retirement via reorder buffer
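
The relationship between out-of-order execution and in-order commit can be sketched with a small behavioral model of a reorder buffer. This is an illustrative Python sketch, not the RTL: the class, method names, and entry layout are hypothetical. Only the default size of 64 comes from this README.

```python
from collections import deque


class ReorderBuffer:
    """Entries are allocated in program order and retired only from the head."""

    def __init__(self, size=64):
        self.size = size
        self.entries = deque()  # oldest instruction at the left
        self.next_tag = 0

    def allocate(self, dest):
        """Reserve an entry at dispatch; return its tag, or None if full (stall)."""
        if len(self.entries) >= self.size:
            return None
        tag = self.next_tag
        self.next_tag += 1
        self.entries.append({"tag": tag, "dest": dest, "done": False, "value": None})
        return tag

    def complete(self, tag, value):
        """Record a result broadcast by an execution unit, in any order."""
        for entry in self.entries:
            if entry["tag"] == tag:
                entry["done"] = True
                entry["value"] = value
                return
        raise KeyError(tag)

    def commit(self, regfile):
        """Retire finished entries from the head only; stop at the first unfinished one."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            regfile[entry["dest"]] = entry["value"]
            retired.append(entry["tag"])
        return retired
```

If a younger instruction finishes first, `commit` retires nothing until the older one at the head also completes. This is what keeps architectural state precise for exceptions and misprediction recovery.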
| Module | Description |
|---|---|
| `FetchUnit.sv` | Instruction fetch with branch prediction |
| `DecodeUnit.sv` | Decode and dispatch logic |
| `RenameUnit.sv` | Register renaming (RAT + free list) |
| `ReservationStation.sv` | Tomasulo-style issue queue |
| `IssueUnit.sv` | Dynamic instruction scheduling |
| `IntegerUnits.sv` | ALU and branch execution units |
| `ROB.sv` | 64-entry reorder buffer |
| `LoadQueue.sv` / `StoreQueue.sv` | Memory ordering |
| `MemoryDisambiguation.sv` | Load/store forwarding |
| `L1ICache.sv` / `L1DCache.sv` | L1 caches |
| `BranchPredictor.sv` | 2-bit predictor with BTB |
| `RecoveryUnit.sv` | Misprediction/exception recovery |
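
To illustrate register renaming with a register alias table (RAT) and a free list, here is a minimal behavioral sketch in Python. It does not reflect the actual interface of `RenameUnit.sv`. The class and method names, the physical register count of 64, and the identity mapping at reset are all assumptions made for the example.

```python
class RenameUnit:
    """RAT maps architectural to physical registers; the free list holds unused ones."""

    def __init__(self, num_arch=32, num_phys=64):
        # Assumed reset state: architectural register i maps to physical register i.
        self.rat = list(range(num_arch))
        self.free_list = list(range(num_arch, num_phys))

    def rename(self, dest, srcs):
        """Return (new_phys_dest, phys_srcs, old_phys_dest) for one instruction."""
        # Sources are looked up before the destination mapping is updated, so an
        # instruction that reads and writes the same register sees the old value.
        phys_srcs = [self.rat[s] for s in srcs]
        if dest == 0:
            # $zero is hardwired to 0 in MIPS, so it is never renamed.
            return 0, phys_srcs, None
        if not self.free_list:
            raise RuntimeError("no free physical register: stall")
        new_phys = self.free_list.pop(0)
        old_phys = self.rat[dest]
        self.rat[dest] = new_phys
        return new_phys, phys_srcs, old_phys

    def release(self, old_phys):
        """Return the previous mapping to the free list once the instruction commits."""
        if old_phys is not None:
            self.free_list.append(old_phys)
```

Two back-to-back writes to the same architectural register receive different physical registers, which removes the write-after-write dependency between them. The second instruction's source lookup returns the first instruction's new physical register, preserving the true data dependency.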
The base pipeline design is derived from classic MIPS architecture references:
