This is an exploratory project where I'm trying to create a vector processing unit for a RISC-V core that scales to very large numbers of lanes.
It may be useful for applications that operate on large vectors and whose control flow is relatively independent of the vector data. Applications such as Fully Homomorphic Encryption and Machine Learning often fit into this category.
I've made a start on some docs at benreynwar.github.io/zamlet/.
Lanes are arranged in a two-dimensional grid. If we want the design to scale to large numbers of lanes on a planar die, we don't really have any other choice: a grid keeps each lane's connections short and local.
The lanes are connected with a mesh network. The standard approach for connecting the lanes would be a crossbar, which works really well for small numbers of lanes but becomes impractical as the lane count grows, both because of the crossbar itself and because of the buffering necessary to keep everything synchronous. A mesh network is an alternative that works nicely as long as most of the data movement is fairly local.
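To make the locality argument concrete, here's a rough Python model of lane placement and hop counts. The row-major indexing, grid width and dimension-ordered routing are illustrative assumptions rather than details of the actual design.

```python
# Rough model: lane placement on the grid and hop counts over the mesh.
# Assumes row-major lane indexing and dimension-ordered (XY) routing; both are
# illustrative assumptions, not necessarily how the real design works.

def lane_coords(lane: int, grid_width: int) -> tuple[int, int]:
    """Map a flat lane index to (x, y) coordinates on the grid."""
    return lane % grid_width, lane // grid_width

def mesh_hops(src: int, dst: int, grid_width: int) -> int:
    """Manhattan distance between two lanes; each hop crosses one mesh link."""
    sx, sy = lane_coords(src, grid_width)
    dx, dy = lane_coords(dst, grid_width)
    return abs(sx - dx) + abs(sy - dy)

# Neighbouring lanes are one hop apart; opposite corners of a 16x16 grid are 30.
assert mesh_hops(0, 1, grid_width=16) == 1
assert mesh_hops(0, 255, grid_width=16) == 30
```

A crossbar would make every pair of lanes effectively one hop apart, but its cost grows roughly quadratically with the number of ports; the mesh trades that for a hop count that grows with distance, which is fine when most traffic is local.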
An additional layer of hierarchy is introduced between the lane and the processor. As the number of lanes grows large it becomes useful to group lanes together: each grouping of lanes shares an instruction buffer and other logic that is useful to keep fairly close to the lanes but is too expensive to replicate in each lane.
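As a sketch of how this hierarchy might look, the model below tiles the grid into square lane groups, each of which would share an instruction buffer. The grid and group sizes are hypothetical, chosen only for illustration.

```python
# Sketch of the extra hierarchy level: the grid is tiled into lane groups, and
# each group shares an instruction buffer and other per-group logic. The grid
# and group sizes here are hypothetical, chosen only for illustration.

GRID_WIDTH = 16
GROUP_WIDTH = 4   # hypothetical: each group is a 4x4 tile of lanes

def group_of(lane: int) -> int:
    """Which lane group a flat lane index belongs to (row-major over groups)."""
    x, y = lane % GRID_WIDTH, lane // GRID_WIDTH
    return (x // GROUP_WIDTH) + (y // GROUP_WIDTH) * (GRID_WIDTH // GROUP_WIDTH)

# Lanes 0 and 3 sit in the same group (and share its instruction buffer);
# lane 4 falls into the next group along the row.
assert group_of(0) == group_of(3)
assert group_of(4) == group_of(0) + 1
```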
Data is kept local where possible, with message passing between lanes when that is not possible. Common operations should result in minimal data movement, and we want to minimize the movement of data in and out of the lanes. Both the cache SRAM and the vector register file are distributed throughout the lanes, and ideally an instruction just moves data between a lane's slice of the cache SRAM, its slice of the vector register file, and its ALU. Instructions that do need to move data between lanes do so by message passing. This should be reasonably efficient when data is moving between lanes close to one another, but inefficient (in both latency and throughput) when data is moving large distances.
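The sketch below illustrates which instructions stay local under a simple striping of elements across lanes: elementwise operations need no inter-lane traffic, while a permute turns into messages. The element-to-lane mapping here is a placeholder; the element-width mechanism described next is what actually controls the layout.

```python
# Which operations stay local? Assume, purely for illustration, that element i
# of a vector lives in lane i % NUM_LANES.

NUM_LANES = 4

def home_lane(elem_index: int) -> int:
    return elem_index % NUM_LANES

def messages_needed(src_elems: list[int], dst_elems: list[int]) -> int:
    """Count element moves whose source and destination live in different lanes."""
    return sum(home_lane(s) != home_lane(d) for s, d in zip(src_elems, dst_elems))

elems = list(range(8))
# Elementwise add: result element i only reads source element i, so nothing moves.
assert messages_needed(elems, elems) == 0
# Rotate-by-one permute: every result element is sourced from a neighbouring lane,
# so every element becomes a message over the mesh.
rotated_sources = [(i + 1) % 8 for i in elems]
assert messages_needed(rotated_sources, elems) == 8
```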
Vector memory lines and vector registers have a physical byte ordering that is controlled by an element-width setting. Each vector memory line and each vector register has an 'element-width', which determines the order in which its bytes are stored in the physical memory. If this element-width matches the actual element width of the data, it helps keep data local when vectors with different element-widths interact. The element-width of the lines in each page is stored in a supplemental page table.
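Here's a rough model of why the element-width setting matters, assuming the layout keeps the bytes of an element together and stripes elements round-robin across lanes. That striping rule is an assumption of mine for illustration, not a statement of the actual layout.

```python
# Element-width-aware byte layout: bytes of an element stay together, and
# elements are striped across lanes. The striping rule here is an assumption
# used only to illustrate the locality benefit.

NUM_LANES = 4

def byte_home(byte_index: int, element_width: int) -> int:
    """Lane holding a given byte of a vector stored with this element-width."""
    element = byte_index // element_width   # bytes of one element stay together
    return element % NUM_LANES              # elements are striped across lanes

# Element i of an 8-bit-element vector and element i of a 32-bit-element vector
# land in the same lane, so mixed-width (e.g. widening) ops stay local.
for i in range(8):
    assert byte_home(i * 1, element_width=1) \
        == byte_home(i * 4, element_width=4) \
        == i % NUM_LANES
```

Under this model, if the stored element-width doesn't match the data's real element width, corresponding elements of the two vectors end up in different lanes and the operation needs inter-lane messages.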
Custom hardware synchronizes the lane groupings. Because of the message-passing approach, the lane groupings can often be out of sync with one another. Rather than building synchronization out of the network-based message passing, we add specialized hardware for synchronizing the lane groupings when this is required.
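Behaviourally, this synchronization hardware can be thought of as a barrier across lane groups. Below is a minimal software model of that behaviour; the interface and mechanism are my own sketch, not a description of the actual hardware.

```python
# Behavioural model of a barrier across lane groups: every group signals
# arrival, and the last arrival releases them all. Purely illustrative; the
# real synchronization hardware and its interface are not described here.

class GroupBarrier:
    def __init__(self, num_groups: int):
        self.num_groups = num_groups
        self.arrived = set()

    def arrive(self, group_id: int) -> bool:
        """A lane group signals arrival; returns True once all groups have arrived."""
        self.arrived.add(group_id)
        if len(self.arrived) == self.num_groups:
            self.arrived.clear()   # release everyone and reset for the next barrier
            return True
        return False

barrier = GroupBarrier(num_groups=4)
assert not barrier.arrive(0)
assert not barrier.arrive(1)
assert not barrier.arrive(2)
assert barrier.arrive(3)   # the last group to arrive releases the barrier
```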
Dependencies are installed using nix. The build itself is done using bazel.
- Install nix
- Add the following to `/etc/nix/nix.conf`. This allows nix to use the precompiled FOSSi binaries, which speeds things up a bunch.

      extra-experimental-features = nix-command flakes
      extra-substituters = https://nix-cache.fossi-foundation.org
      extra-trusted-public-keys = nix-cache.fossi-foundation.org:3+K59iFwXqKsL7BNu6Guy0v+uTlwsxYQxjspXzqLYQs=

- Run `nix-shell` in the project directory.
- Use bazel to build a target (an example invocation is shown below).
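For example (the target label below is a placeholder, not an actual target from this repo; `bazel query //...` will list the real ones):

      nix-shell
      bazel build //some/package:some_target   # placeholder label; pick a real target, e.g. via `bazel query //...`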

