Aryan810/RISC-V-Processor-CS224-Group-5


5-Stage RISC-V Pipeline Processor (RV32IM)

A fully functional 5-stage pipelined RISC-V processor implementing the RV32I base integer instruction set with RV32M (Multiply/Divide) extension. Designed for simulation and FPGA deployment.

Features

  • 5-Stage Pipeline: IF → ID → EX → MEM → WB
  • RV32I Base ISA: Full support for base integer instructions
  • RV32M Extension: Hardware multiply (4 cycles) and divide (34 cycles) unit
  • Data Forwarding: EX-to-EX and MEM-to-EX forwarding paths
  • Hazard Detection: Load-use hazard detection with pipeline stall
  • Branch Handling: Branches are resolved in EX; a taken branch flushes the wrong-path instruction in IF/ID
  • Integrated Memories: 4KB instruction memory and 4KB data memory
  • FPGA Ready: Includes constraints for Nexys A7/Artix-7 boards

Architecture Overview

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│    IF    │───▶│    ID    │───▶│    EX    │───▶│   MEM    │───▶│    WB    │
│  Fetch   │    │  Decode  │    │ Execute  │    │  Memory  │    │Writeback │
└──────────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘
     │               │               │               │               │
   ┌─┴─┐           ┌─┴─┐           ┌─┴─┐           ┌─┴─┐           ┌─┴─┐
   │IF/│           │ID/│           │EX/│           │MEM│           │Reg│
   │ID │           │EX │           │MEM│           │/WB│           │   │
   └───┘           └───┘           └───┘           └───┘           └───┘
                                     │               │               │
                                     └───────────────┴───────────────┘
                                              Data Forwarding

Pipeline Stages

Stage 1: Instruction Fetch (IF)

  • Fetches instruction from instruction memory
  • Manages Program Counter (PC)
  • Handles branch target address updates
  • File: modules/fetch.v

Stage 2: Instruction Decode (ID)

  • Decodes instruction fields (opcode, funct3, funct7)
  • Reads operands from register file
  • Generates immediate values (I/S/B/U/J types)
  • Produces control signals
  • File: modules/decode.v
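
The immediate formats listed for the ID stage can be illustrated with a small behavioral model. This is a Python sketch of the standard RV32I immediate encodings, not a transcription of modules/decode.v; the function names `sign_extend` and `imm_gen` are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def imm_gen(instr: int, fmt: str) -> int:
    """Return the signed immediate of a 32-bit instruction word.

    fmt is one of "I", "S", "B", "U", "J" (the RV32I immediate formats).
    """
    if fmt == "I":  # imm[11:0] = instr[31:20]
        return sign_extend(instr >> 20, 12)
    if fmt == "S":  # imm[11:5] = instr[31:25], imm[4:0] = instr[11:7]
        imm = ((instr >> 25) << 5) | ((instr >> 7) & 0x1F)
        return sign_extend(imm, 12)
    if fmt == "B":  # imm[12|10:5] = instr[31:25], imm[4:1|11] = instr[11:7]
        imm = (((instr >> 31) & 0x1) << 12) \
            | (((instr >> 7) & 0x1) << 11) \
            | (((instr >> 25) & 0x3F) << 5) \
            | (((instr >> 8) & 0xF) << 1)
        return sign_extend(imm, 13)
    if fmt == "U":  # imm[31:12] = instr[31:12], low 12 bits are zero
        return sign_extend(instr & 0xFFFFF000, 32)
    if fmt == "J":  # imm[20|10:1|11|19:12] = instr[31:12]
        imm = (((instr >> 31) & 0x1) << 20) \
            | (((instr >> 12) & 0xFF) << 12) \
            | (((instr >> 20) & 0x1) << 11) \
            | (((instr >> 21) & 0x3FF) << 1)
        return sign_extend(imm, 21)
    raise ValueError(f"unknown immediate format: {fmt}")
```

For example, `imm_gen(0xFE000EE3, "B")` decodes the offset of `beq x0, x0, -4`.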

Stage 3: Execute (EX)

  • ALU operations (arithmetic, logical, shifts)
  • Branch condition evaluation
  • Branch target calculation
  • Multiply/Divide unit (RV32M)
  • Files: modules/execute.v, modules/mul_div.v

Stage 4: Memory (MEM)

  • Data memory read/write operations
  • Byte/Halfword/Word addressing with alignment
  • Store data formatting
  • File: modules/memory.v
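
Store data formatting can be sketched as follows. The model assumes a 32-bit-wide, little-endian data memory with four byte-enable bits and treats misaligned halfword/word stores as errors; both are assumptions for illustration, and `format_store` is a hypothetical name rather than a signal or module in modules/memory.v.

```python
def format_store(funct3: int, addr: int, rs2: int) -> tuple:
    """Return (write_data, byte_mask) for SB/SH/SW on a 32-bit memory word.

    funct3: 0b000 = SB, 0b001 = SH, 0b010 = SW (RV32I store encodings).
    byte_mask bit i enables byte lane i of the addressed word.
    """
    sizes = {0b000: 1, 0b001: 2, 0b010: 4}
    if funct3 not in sizes:
        raise ValueError("not a store funct3")
    size = sizes[funct3]
    offset = addr & 0x3  # byte position inside the 32-bit word
    if offset % size != 0:
        raise ValueError("misaligned store")
    # Keep only the bytes being stored, then shift them into their lanes.
    data = (rs2 & ((1 << (8 * size)) - 1)) << (8 * offset)
    mask = ((1 << size) - 1) << offset
    return data & 0xFFFFFFFF, mask
```

With this model, `format_store(0b000, 0x101, 0xAABBCCDD)` places the byte `0xDD` in lane 1 and enables only that lane.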

Stage 5: Write Back (WB)

  • Selects result (ALU or memory)
  • Load data sign extension
  • Writes to register file
  • File: modules/writeback.v
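
The load sign-extension step can be modeled in a few lines. This sketch assumes the data memory returns the full 32-bit word containing the address and that the byte offset selects the lane; `extend_load` is a hypothetical helper, not a name taken from modules/writeback.v.

```python
def extend_load(funct3: int, addr: int, word: int) -> int:
    """Return the 32-bit register value for LB/LH/LW/LBU/LHU.

    funct3: 0b000 = LB, 0b001 = LH, 0b010 = LW, 0b100 = LBU, 0b101 = LHU.
    word is the 32-bit memory word that contains addr.
    """
    shifted = (word & 0xFFFFFFFF) >> (8 * (addr & 0x3))
    if funct3 == 0b000:  # LB: sign-extend bit 7
        byte = shifted & 0xFF
        return byte | 0xFFFFFF00 if byte & 0x80 else byte
    if funct3 == 0b001:  # LH: sign-extend bit 15
        half = shifted & 0xFFFF
        return half | 0xFFFF0000 if half & 0x8000 else half
    if funct3 == 0b010:  # LW: full word
        return word & 0xFFFFFFFF
    if funct3 == 0b100:  # LBU: zero-extend
        return shifted & 0xFF
    if funct3 == 0b101:  # LHU: zero-extend
        return shifted & 0xFFFF
    raise ValueError("not a load funct3")
```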

Directory Structure

full_pipeline/
├── README.md                   # This file
├── modules/                    # Verilog source files
│   ├── pipeline.v              # Top-level pipeline module
│   ├── fetch.v                 # IF stage
│   ├── decode.v                # ID stage
│   ├── execute.v               # EX stage (includes ALU)
│   ├── memory.v                # MEM stage
│   ├── writeback.v             # WB stage
│   ├── mul_div.v               # RV32M multiply/divide unit
│   ├── opcode.vh               # Opcode & parameter definitions
│   ├── top_fpga.v              # FPGA top module wrapper
│   └── constraint.xdc          # Xilinx FPGA constraints
├── testBenches/                # Testbench files
│   ├── tb_pipeline.v           # Main pipeline testbench
│   ├── tb_rv32m.v              # RV32M extension testbench
│   ├── imem.hex                # Instruction memory initialization
│   └── dmem.hex                # Data memory initialization
├── sim/                        # Simulation outputs
│   └── sim1/                   # Simulation workspace
└── block_diagrams/             # Mermaid architecture diagrams
    ├── pipeline_top.mermaid    # Top-level pipeline diagram
    ├── fetch_stage.mermaid     # IF stage diagram
    ├── decode_stage.mermaid    # ID stage diagram
    ├── execute_stage.mermaid   # EX stage diagram
    ├── memory_stage.mermaid    # MEM stage diagram
    └── writeback_stage.mermaid # WB stage diagram

Supported Instructions

RV32I Base Instructions

| Type | Instructions |
|------|--------------|
| R-Type | ADD, SUB, SLL, SLT, SLTU, XOR, SRL, SRA, OR, AND |
| I-Type (Arithmetic) | ADDI, SLTI, SLTIU, XORI, ORI, ANDI, SLLI, SRLI, SRAI |
| I-Type (Load) | LB, LH, LW, LBU, LHU |
| I-Type (Jump) | JALR |
| S-Type (Store) | SB, SH, SW |
| B-Type (Branch) | BEQ, BNE, BLT, BGE, BLTU, BGEU |
| U-Type | LUI, AUIPC |
| J-Type | JAL |

RV32M Extension (Multiply/Divide)

| Instruction | Description | Cycles |
|-------------|-------------|--------|
| MUL | Multiply, lower 32 bits | 4 |
| MULH | Multiply high (signed × signed) | 4 |
| MULHSU | Multiply high (signed × unsigned) | 4 |
| MULHU | Multiply high (unsigned × unsigned) | 4 |
| DIV | Signed division (quotient) | 34 |
| DIVU | Unsigned division (quotient) | 34 |
| REM | Signed remainder | 34 |
| REMU | Unsigned remainder | 34 |
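
The four multiply variants differ only in how the operands are interpreted and which half of the 64-bit product is kept. The following Python reference model follows the RV32M specification and can serve as a golden model when checking testbench results; `rv32m_mul` and `to_signed` are hypothetical names, and the model says nothing about the 4-cycle timing.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def rv32m_mul(funct3: int, a: int, b: int) -> int:
    """Return the 32-bit result of MUL/MULH/MULHSU/MULHU.

    funct3: 0b000 = MUL, 0b001 = MULH, 0b010 = MULHSU, 0b011 = MULHU.
    """
    if funct3 == 0b000:  # low 32 bits are the same for signed and unsigned
        return (a * b) & MASK32
    if funct3 == 0b001:  # signed x signed, upper 32 bits
        return ((to_signed(a) * to_signed(b)) >> 32) & MASK32
    if funct3 == 0b010:  # signed x unsigned, upper 32 bits
        return ((to_signed(a) * (b & MASK32)) >> 32) & MASK32
    if funct3 == 0b011:  # unsigned x unsigned, upper 32 bits
        return ((a & MASK32) * (b & MASK32)) >> 32
    raise ValueError("not a multiply funct3")
```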

Block Diagrams

Top-Level Pipeline Architecture

flowchart LR
    subgraph IF[IF Stage]
        PC[[PC]] --> ADD4["+4"]
        PC --> IMEM[["IMEM"]]
        ADD4 --> PC_MUX{{PC MUX}}
        PC_MUX --> PC
    end
    
    subgraph ID[ID Stage]
        CTRL["Control"]
        IMM["Imm Gen"]
        REG[["RegFile"]]
    end
    
    subgraph EX[EX Stage]
        FWD_A{{Fwd A}}
        FWD_B{{Fwd B}}
        ALU[["ALU"]]
        MULDIV[["MUL/DIV"]]
        BRANCH["Branch"]
    end
    
    subgraph MEM[MEM Stage]
        DMEM[["DMEM"]]
    end
    
    subgraph WB[WB Stage]
        LOAD["Load Align"]
        WB_MUX{{WB MUX}}
    end
    
    IF --> ID --> EX --> MEM --> WB
    
    EX -.->|fwd| ID
    MEM -.->|fwd| ID
    WB -->|write| REG
    
    BRANCH -->|taken| PC_MUX

Hazard Handling

┌─────────────────────────────────────────────────────────────────┐
│                      Hazard Detection Unit                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Load-Use Hazard:                                               │
│  ┌──────┐    ┌──────┐                                           │
│  │ LOAD │───▶│ USE  │  → Stall IF/ID for 1 cycle               │
│  │ (EX) │    │ (ID) │                                           │
│  └──────┘    └──────┘                                           │
│                                                                  │
│  Data Forwarding:                                               │
│  ┌──────┐    ┌──────┐                                           │
│  │ ALU  │═══▶│ ALU  │  → Forward from EX/MEM or MEM/WB         │
│  │(MEM) │    │ (EX) │                                           │
│  └──────┘    └──────┘                                           │
│                                                                  │
│  Control Hazard:                                                │
│  ┌────────┐                                                     │
│  │ BRANCH │  → Flush IF/ID on branch taken                     │
│  │ TAKEN  │                                                     │
│  └────────┘                                                     │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Getting Started

Prerequisites

  • Simulation: Icarus Verilog (iverilog), Verilator, or any Verilog simulator
  • Waveform Viewing: GTKWave
  • FPGA Synthesis: Xilinx Vivado (for FPGA deployment)

Quick Start

# Clone the repository
git clone https://github.com/Aryan810/RISC-V-Processor-CS224-Group-5.git
cd RISC-V-Processor-CS224-Group-5/full_pipeline

# Run simulation with Icarus Verilog
cd testBenches
iverilog -o tb_rv32m.vvp -I../modules ../modules/pipeline.v tb_rv32m.v
vvp tb_rv32m.vvp

# View waveforms
gtkwave tb_rv32m.vcd

Simulation

Running the RV32M Test

cd testBenches
iverilog -o tb_rv32m.vvp -I../modules ../modules/pipeline.v tb_rv32m.v
vvp tb_rv32m.vvp

Expected Output:

========================================
RV32M Extension Test Results
========================================

Register File Contents:
  x1 = 5 (expected: 5)
  x2 = 7 (expected: 7)
  x3 = 35 (expected: 35, from MUL 5*7)
  x4 = 100 (expected: 100)
  x5 = 14 (expected: 14, from DIV 100/7)
  x6 = 2 (expected: 2, from REM 100%7)
  x7 = 1 (expected: 1)
  x8 = 2 (expected: 2)

✓ ALL TESTS PASSED!
========================================

Running the General Pipeline Test

cd testBenches
iverilog -o tb_pipeline.vvp -I../modules ../modules/pipeline.v tb_pipeline.v
vvp tb_pipeline.vvp
gtkwave tb_pipeline.vcd

Custom Test Programs

  1. Write RISC-V assembly code
  2. Assemble to machine code (use riscv32-unknown-elf-gcc)
  3. Convert to hex format
  4. Update imem.hex with your instructions
  5. Update dmem.hex with initial data (if needed)
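
Step 3 can be scripted. The sketch below assumes that imem.hex holds one 32-bit word per line as eight hex digits (the layout `$readmemh` expects for a 32-bit-wide memory) and that the input is a raw little-endian binary, for example one produced with `objcopy -O binary`. The actual layout expected by the testbench is not stated here, so check it against the shipped imem.hex before relying on this. `bin_to_hex_words` is a hypothetical helper.

```python
import struct


def bin_to_hex_words(data: bytes) -> list:
    """Convert raw little-endian machine code into 8-digit hex words.

    The input is zero-padded to a multiple of 4 bytes so that every
    output line is a complete 32-bit word.
    """
    padded = data + b"\x00" * (-len(data) % 4)
    return [f"{word:08x}" for (word,) in struct.iter_unpack("<I", padded)]


# Example use (file names are placeholders):
#   with open("program.bin", "rb") as src, open("imem.hex", "w") as dst:
#       dst.write("\n".join(bin_to_hex_words(src.read())) + "\n")
```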

FPGA Deployment

Target Board

  • Primary: Digilent Nexys A7-100T (Artix-7)
  • Clock: 100 MHz
  • Reset: Active-low push button
  • Debug Output: 16 LEDs showing lower 16 bits of CPU-computed result
  • CPU Operation Mode (via buttons/switches):
    • sw[2:0] = opcode (3-bit)
    • sw[15:3] = signed operand (13-bit)
    • btnl/btnc/btnr/btnu = execute selected operation
    • btnd = clear result/status
    • 8-digit 7-segment shows signed decimal result
    • divide-by-zero shows Err0 on 7-seg and 0xDEAD on LEDs
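
The switch layout above can be modeled as a simple bit-field split. This Python sketch only illustrates the documented field positions and the 0xDEAD error pattern; the function names are hypothetical and the mapping from the 3-bit opcode to operations is not covered.

```python
def decode_switches(sw: int) -> tuple:
    """Split the 16 slide switches into (opcode, signed operand).

    sw[2:0]  -> 3-bit opcode
    sw[15:3] -> 13-bit two's-complement operand (-4096 .. 4095)
    """
    opcode = sw & 0x7
    raw = (sw >> 3) & 0x1FFF
    operand = raw - 0x2000 if raw & 0x1000 else raw
    return opcode, operand


def led_pattern(result: int, div_by_zero: bool) -> int:
    """Return the 16-bit LED value: low 16 result bits, or 0xDEAD on error."""
    return 0xDEAD if div_by_zero else result & 0xFFFF
```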

Synthesis Steps (Vivado)

  1. Create a new Vivado project
  2. Add all source files from modules/
  3. Set top_fpga.v as the top module
  4. Add constraint.xdc for pin assignments
  5. Run Synthesis → Implementation → Generate Bitstream
  6. Program the FPGA

Pin Assignments

| Signal | Pin | Description |
|--------|-----|-------------|
| clk | E3 | 100 MHz system clock |
| reset | C12 | Active-low reset (CPU reset button) |
| sw[2:0] | J15/L16/M13 | Opcode select |
| sw[15:3] | Various | 13-bit signed operand |
| btnl | P17 | Execute |
| btnc | N17 | Execute |
| btnr | M17 | Execute |
| btnu | M18 | Execute |
| btnd | P18 | Clear |
| led[0:15] | Various | Result low 16 bits (0xDEAD on div-by-zero) |

Technical Details

Memory Configuration

| Memory | Size | Address Range | Access |
|--------|------|---------------|--------|
| Instruction (IMEM) | 4 KB | 0x000 - 0xFFF | Word-aligned read |
| Data (DMEM) | 4 KB | 0x000 - 0xFFF | Byte/Half/Word R/W |

Register File

  • 32 general-purpose registers (x0-x31)
  • x0 is hardwired to zero
  • Dual read ports, single write port
  • Write-back bypass for same-cycle read-after-write
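
The register file behavior described above (x0 hardwired to zero, two read ports, one write port, write-back bypass) can be captured in a small model. This is an illustrative Python sketch with hypothetical names; the real port and signal names in the Verilog sources may differ.

```python
class RegFile:
    """Behavioral model of a 32 x 32-bit register file with WB bypass."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs1, rs2, wb_we=False, wb_rd=0, wb_data=0):
        """Read both ports; a same-cycle write-back is forwarded to the reader."""
        def port(rs):
            if rs == 0:  # x0 always reads as zero
                return 0
            if wb_we and wb_rd == rs:  # bypass the value being written now
                return wb_data & 0xFFFFFFFF
            return self.regs[rs]
        return port(rs1), port(rs2)

    def write(self, rd, data):
        """Commit a write-back; writes to x0 are discarded."""
        if rd != 0:
            self.regs[rd] = data & 0xFFFFFFFF
```

The bypass lets an instruction in ID see a value that WB writes in the same cycle, which is the case the forwarding paths into EX do not cover.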

Forwarding Paths

| Source | Target | Condition |
|--------|--------|-----------|
| EX/MEM | EX (rs1/rs2) | ALU result available |
| MEM/WB | EX (rs1/rs2) | Memory load or ALU result |
| WB | ID (reg read) | Register file bypass |
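
The priority between the two forwarding sources into EX can be sketched as a selection function. This follows the textbook forwarding-unit formulation (newest producer wins, x0 is never forwarded), which is assumed here rather than read from modules/execute.v; the name `forward_select` and its arguments are hypothetical.

```python
def forward_select(rs, exmem_we, exmem_rd, memwb_we, memwb_rd):
    """Choose the source for one EX operand (rs1 or rs2).

    Returns "EX/MEM", "MEM/WB", or "REG" (value read in ID).
    EX/MEM is checked first because it holds the most recent result.
    """
    if exmem_we and exmem_rd != 0 and exmem_rd == rs:
        return "EX/MEM"
    if memwb_we and memwb_rd != 0 and memwb_rd == rs:
        return "MEM/WB"
    return "REG"
```

The same function would be evaluated once for rs1 and once for rs2 to drive the two forwarding multiplexers.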

Pipeline Stall Conditions

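  1. Load-Use Hazard: When a load is immediately followed by an instruction that reads the loaded register, IF/ID is stalled for one cycle
  2. Multiply/Divide: Multi-cycle operations (4 cycles for multiply, 34 for divide) stall the pipeline front-end until the result is ready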
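
The two stall conditions can be expressed as boolean functions. This is a sketch of the usual load-use check (load in EX, consumer in ID), assuming for simplicity that the instruction in ID reads both source registers; the function and argument names are hypothetical and do not come from the Verilog sources.

```python
def load_use_stall(idex_mem_read, idex_rd, ifid_rs1, ifid_rs2):
    """True when the instruction in ID needs the result of a load in EX."""
    return bool(idex_mem_read) and idex_rd != 0 and idex_rd in (ifid_rs1, ifid_rs2)


def front_end_stall(load_use, muldiv_busy):
    """The front end holds IF/ID on a load-use hazard or a busy MUL/DIV unit."""
    return bool(load_use) or bool(muldiv_busy)
```

A real decoder would also qualify the comparison with whether the instruction in ID actually uses rs1 or rs2, to avoid unnecessary stalls.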

Multiply/Divide Unit

  • Multiply: 4-cycle pipelined operation using 16-bit partial products
  • Divide: 34-cycle iterative restoring division algorithm
  • Division by Zero: Returns 0xFFFFFFFF for quotient, dividend for remainder
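
The algorithms named above can be illustrated with a reference model. The Python sketch below shows a 32-step restoring division, the divide-by-zero results listed above, and a multiply built from 16-bit partial products. It models results only: how the 32 iterations map onto the 34 cycles, and how the partial products map onto the 4 multiply cycles, is not specified here. The signed overflow case (-2^31 / -1) follows the RISC-V specification and has not been confirmed against mul_div.v. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def restoring_divide(dividend, divisor):
    """Unsigned restoring division, one quotient bit per iteration."""
    remainder, quotient = 0, 0
    for i in range(31, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        if trial >= 0:  # subtraction fits: keep it and set the quotient bit
            remainder = trial
            quotient |= 1 << i
        # otherwise the old remainder is kept ("restored")
    return quotient, remainder


def rv32m_div(a, b, signed):
    """Return (quotient, remainder) as 32-bit patterns for DIV/REM or DIVU/REMU."""
    a &= MASK32
    b &= MASK32
    if b == 0:  # division by zero: all-ones quotient, dividend as remainder
        return MASK32, a
    if not signed:
        return restoring_divide(a, b)
    sa, sb = to_signed(a), to_signed(b)
    q, r = restoring_divide(abs(sa), abs(sb))
    if (sa < 0) != (sb < 0):  # quotient is negative when the signs differ
        q = -q
    if sa < 0:  # remainder takes the sign of the dividend
        r = -r
    return q & MASK32, r & MASK32


def mul_16bit_partial_products(a, b):
    """Unsigned 32 x 32 multiply assembled from four 16 x 16 products."""
    a &= MASK32
    b &= MASK32
    al, ah = a & 0xFFFF, a >> 16
    bl, bh = b & 0xFFFF, b >> 16
    return al * bl + ((al * bh + ah * bl) << 16) + ((ah * bh) << 32)
```

Calling `rv32m_div(100, 7, True)` reproduces the quotient and remainder that the RV32M testbench expects in x5 and x6.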

Module Interfaces

Top-Level Pipeline (pipe)

module pipe #(
    parameter [31:0] RESET = 32'h0000_0000
)(
    input               clk,
    input               reset,      // Active-low
    input               stall,      // External stall
    output              exception,
    output [31:0]       pc_out
);

Multiply/Divide Unit (mul_div)

module mul_div (
    input               clk,
    input               reset,
    input               start,      // Start operation
    input               is_mul,     // 1=multiply, 0=divide
    input  [2:0]        funct3,     // Operation subtype
    input  [31:0]       operand_a,
    input  [31:0]       operand_b,
    output [31:0]       result,
    output              busy,
    output              done        // 1-cycle pulse
);

License

This project is part of the CS224 Computer Architecture coursework at IIT Guwahati.


Authors

Group 5 - CS224 Computer Architecture Lab

  • Arkadeb Manna
  • Mehul Raj
  • Ashutosh Kumar
  • Avanish Pandey
  • Aryan Gupta

Acknowledgments

  • IIT Guwahati, Department of Computer Science
  • RISC-V Foundation for the open ISA specification
