@Navaneeth-KunhiPurayil Navaneeth-KunhiPurayil commented Dec 4, 2025

This PR adds support for doubling the VLSU bandwidth to the L1 scratchpad in the Spatz cluster, to accelerate memory-intensive workloads. To enable it, set the double_bw parameter in the Spatz configuration file and set spatz_nports to twice the number of functional units (assuming 64b granularity).
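
The port relationship described above can be sketched in a few lines of C. This is only an illustration of the arithmetic, assuming one 64b (8-byte) access per port per cycle; the helper names `vlsu_nports` and `vlsu_bytes_per_cycle` are hypothetical and do not exist in the Spatz sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of Spatz): number of VLSU ports for a
 * given number of functional units. With double_bw set, the port count
 * is twice the number of functional units, as described in this PR. */
static inline uint32_t vlsu_nports(uint32_t n_fu, bool double_bw) {
    return double_bw ? 2u * n_fu : n_fu;
}

/* Hypothetical helper: peak bytes per cycle, assuming each port moves
 * one 64b (8-byte) word per cycle. */
static inline uint32_t vlsu_bytes_per_cycle(uint32_t n_fu, bool double_bw) {
    return vlsu_nports(n_fu, double_bw) * 8u;
}
```

For example, under these assumptions a core with 4 functional units goes from 4 ports to 8 ports when double_bw is enabled.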

Added

  1. spatz_doublebw_vlsu.sv - implements the load-store unit for this mode. It generates spatz_nports parallel requests across 2 interfaces and commits each interface back to the VRF separately, synchronizing the two only at instruction completion.
  2. All functionality related to this feature in spatz.sv, spatz_controller.sv, and spatz_vrf.sv is guarded by the DOUBLE_BW pragma.
  3. Hardware address scrambling in spatz_tcdm_interconnect.sv, which misaligns VLSU requests to avoid conflicts at the L1 across different LMUL configurations.
  4. Unrolled versions of dp-fdotp and dp-faxpy for higher performance, selectable with the -DUNROLL cmake option.
  5. spatz_cluster.doublebw.dram.hjson - a configuration that adds double-bandwidth support to the default Spatz core configuration.
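
To illustrate what unrolling a dot product means in item 4, here is a minimal scalar sketch in plain C. It is not the dp-fdotp kernel from this PR, which targets the vector unit; the function name `dotp_unrolled` and the unroll factor of 4 are assumptions chosen for illustration. The idea shown is the generic one: independent accumulators let several multiply-adds be in flight at once.

```c
#include <stddef.h>

/* Illustrative only: a 4x unrolled double-precision dot product.
 * The unroll factor and function name are assumptions, not taken
 * from the Spatz kernels. */
double dotp_unrolled(const double *a, const double *b, size_t n) {
    /* Four independent accumulators break the dependency chain. */
    double acc0 = 0.0, acc1 = 0.0, acc2 = 0.0, acc3 = 0.0;
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }

    /* Reduce the partial sums. */
    double acc = (acc0 + acc1) + (acc2 + acc3);

    /* Tail loop: handle the remaining n % 4 elements. */
    for (; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

Note that splitting the sum across accumulators changes the floating-point summation order, so results can differ slightly from a strictly sequential loop.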

@Navaneeth-KunhiPurayil Navaneeth-KunhiPurayil marked this pull request as ready for review December 13, 2025 11:29
