
Alias-aware token threading for better parallelism #1

@maleadt

All memory operations (loads, stores, atomics) are currently threaded through a single global token chain. This is correct but conservative: operations on independent arrays are serialized unnecessarily.

Proposed improvement

Implement alias-aware token threading:

  1. Alias analysis: Compute which pointers may refer to the same memory region (alias sets)
  2. Per-set token chains: Thread tokens only between operations that may alias
  3. Loop parallel stores: Identify stores in loops whose indices do not overlap across iterations; these can skip token dependencies entirely
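
To make the three steps concrete, here is a minimal Python sketch on a toy op list. It is not the cuTile implementation: `MemOp`, `alias_sets`, `thread_tokens`, and `loop_store_is_parallel` are hypothetical names, the may-alias pairs are supplied by hand rather than computed, and the loop check assumes a simple affine store pattern where iteration `i` writes `[i*stride, i*stride + width)`.

```python
from dataclasses import dataclass, field


@dataclass
class MemOp:
    name: str   # label for the operation, e.g. "load a"
    kind: str   # "load", "store", or "atomic"
    ptr: str    # hypothetical base-pointer name
    deps: list = field(default_factory=list)  # ops this one must wait on


def alias_sets(pointers, may_alias_pairs):
    """Step 1: group pointers into alias sets with union-find.

    Merging may-alias pairs transitively is conservative: it can only
    make sets larger, never split pointers that may alias.
    """
    parent = {p: p for p in pointers}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in may_alias_pairs:
        parent[find(a)] = find(b)
    return {p: find(p) for p in pointers}


def thread_tokens(ops, set_of):
    """Step 2: each op waits only on the previous op in its alias set.

    Simplification: loads in the same set are also ordered with each
    other, which is stricter than necessary.
    """
    last = {}  # alias-set representative -> name of most recent op
    for op in ops:
        s = set_of[op.ptr]
        if s in last:
            op.deps.append(last[s])
        last[s] = op.name
    return ops


def loop_store_is_parallel(stride, width):
    """Step 3 (assumed affine model): iteration i stores to
    [i*stride, i*stride + width). Ranges of different iterations are
    disjoint exactly when |stride| >= width."""
    return width > 0 and abs(stride) >= width


# a_view is assumed to possibly overlap a; b is assumed independent.
set_of = alias_sets(["a", "a_view", "b"], [("a", "a_view")])
ops = thread_tokens(
    [
        MemOp("load a", "load", "a"),
        MemOp("store b", "store", "b"),
        MemOp("store a_view", "store", "a_view"),
    ],
    set_of,
)
for op in ops:
    print(op.name, "waits on", op.deps)
```

In this sketch `store b` carries no dependency on `load a`, while `store a_view` still waits on `load a` because the two pointers share an alias set.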

Why

The current sequential approach prevents parallelism between independent memory operations. For example, a load from array a and a store to array b need no ordering constraint if the two arrays are provably disjoint. Alias-aware threading preserves correctness while enabling the hardware to execute independent operations concurrently.
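
As a rough illustration of the difference, the sketch below compares the longest dependency chain under a single global token chain with one chain per array. It is a toy model only: `critical_path` is a hypothetical helper, and arrays `a` and `b` are assumed to be provably disjoint.

```python
def critical_path(ops, same_chain):
    """Longest dependency chain when each op waits on the nearest
    earlier op for which same_chain(earlier, op) is true."""
    depth = []
    for i, op in enumerate(ops):
        d = 1  # an op with no predecessor in its chain can start at once
        for j in range(i - 1, -1, -1):
            if same_chain(ops[j], op):
                d = depth[j] + 1
                break
        depth.append(d)
    return max(depth, default=0)


# (kind, array) pairs; a and b are assumed not to alias.
ops = [("load", "a"), ("store", "b"), ("load", "a"), ("store", "b")]

global_depth = critical_path(ops, lambda p, q: True)         # one global chain
alias_depth = critical_path(ops, lambda p, q: p[1] == q[1])  # one chain per array

print(global_depth, alias_depth)  # prints: 4 2
```

With the global chain all four operations are serialized; with per-array chains the operations on `a` and on `b` form two independent chains of length two.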

Reference implementation

cuTile Python implements this in:
