Follow-up to #448. Lifts the sparse-mario algorithm into a domain-agnostic crate so anyone can point it at their own examples instead of just Mario level slices.
## Plain-language summary
Sparse-Mario was a working demo of "use a sparse attention kernel as a lookup table over a corpus of examples, no training required." It generated Mario levels.
This new crate is the same idea, but corpus-agnostic — you supply any small token alphabet and a few example sequences, and you get back two pipelines:
- Stream mode — produce one token at a time, like writing into a text box. ~12 microseconds per token.
- Fill mode — start from a blank canvas and fill it in everywhere at once over a few rounds. Like content-aware fill, but for tokens. Can also repair partial sequences.
No GPUs, no PyTorch, no model files. The examples are the model.
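To make "the examples are the model" concrete, here is a minimal, self-contained sketch of the core idea (this is illustrative, not the crate's actual code): to produce the next token, find the corpus position whose trailing context best matches what has been generated so far, and copy the token that followed it in the corpus.

```rust
// Toy retrieval-as-generation: the corpus itself is the model.
// `next_token` scans every corpus position, measures how long a suffix of the
// generated text matches the context ending there, and returns the token that
// followed the best match.
fn next_token(corpus: &[Vec<u8>], generated: &[u8], ctx: usize) -> u8 {
    let tail = &generated[generated.len().saturating_sub(ctx)..];
    let mut best = (0usize, corpus[0][0]); // (match length, candidate token)
    for seq in corpus {
        for i in 0..seq.len() - 1 {
            // Length of the longest suffix of `tail` matching the corpus
            // context that ends at position i.
            let mut m = 0;
            while m < tail.len() && m <= i && seq[i - m] == tail[tail.len() - 1 - m] {
                m += 1;
            }
            if m > best.0 {
                best = (m, seq[i + 1]);
            }
        }
    }
    best.1
}

fn main() {
    // Two hand-authored "example sequences" over a 3-token alphabet.
    let corpus = vec![vec![0u8, 1, 2, 0, 1, 2], vec![2, 0, 1, 2, 0, 1]];
    let mut out = vec![0u8, 1]; // seed tokens
    for _ in 0..4 {
        let t = next_token(&corpus, &out, 3);
        out.push(t);
    }
    println!("{:?}", out); // → [0, 1, 2, 0, 1, 2]
}
```

The real crate replaces the brute-force scan with a sparse attention kernel, but the no-training property is the same: adding examples to the corpus *is* the training step.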
## What landed
New crate at `crates/ruvllm_retrieval_diffusion/`:
- `src/lib.rs` (~600 lines) — generic `Retriever` + `Diffuser` + `SamplingConfig`, parameterised by `RetrievalConfig` (vocab_size, head_dim, pos_scale, mask_sentinel, diffusion context weights).
- `examples/drum_patterns.rs` — second-domain proof: 5-token drum-machine vocab, 4 hand-authored 16-step patterns as corpus, generates 4-bar loops via both modes (AR 268 µs, diffusion 5.7 ms on a 9950X).
- `README.md` — public-facing write-up.
- 10 unit tests, all passing.
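For a feel of the configuration surface, here is a hypothetical sketch based only on the field names listed above (vocab_size, head_dim, pos_scale, mask_sentinel); the struct is a local stand-in so the snippet compiles without the crate, and the real `RetrievalConfig` in `src/lib.rs` may differ in shape and defaults.

```rust
// Stand-in mirror of the config described above — NOT the crate's definition.
#[derive(Debug)]
struct RetrievalConfig {
    vocab_size: usize,  // size of the small token alphabet
    head_dim: usize,    // embedding width used by the sparse-attention lookup
    pos_scale: f32,     // how strongly position influences retrieval
    mask_sentinel: u32, // token id reserved for "not filled in yet"
}

fn main() {
    // Illustrative values in the spirit of the drum-patterns example:
    // a 5-token vocab with the mask sentinel just outside it.
    let cfg = RetrievalConfig {
        vocab_size: 5,
        head_dim: 16,
        pos_scale: 1.0,
        mask_sentinel: 5,
    };
    println!("{:?}", cfg);
}
```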
Branch: `sparse-mario` (commit `977479eff`, just pushed).

Public gist (plain-language version of the README): https://gist.github.com/ruvnet/af1638d7db2961f60d732467b4282ad5
## Why this is interesting beyond Mario
The Sparse-Mario benchmark already showed that bidirectional fill mode beats every non-trivial baseline by ~4× on aggregate quality metrics. That win is structural — the Markov-1 baseline has perfect bigram statistics and still loses, because it can't use right-context to inform left-context.
This crate makes that bidirectional fill capability available to any small-vocab token domain in four lines of Rust.
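The structural argument above can be shown with a self-contained toy (again, illustrative code, not the crate's): when the corpus contains both [0, 1, 2] and [0, 3, 4], filling the middle of [0, _, 2] is ambiguous for a left-only Markov-1 model, but unique once the right neighbour is conditioned on too.

```rust
// All middle tokens the corpus has ever placed after `left` — what a
// left-to-right Markov-1 model can condition on.
fn fill_left_only(corpus: &[[u8; 3]], left: u8) -> Vec<u8> {
    corpus.iter().filter(|s| s[0] == left).map(|s| s[1]).collect()
}

// Middle tokens consistent with BOTH neighbours — what bidirectional fill
// can condition on.
fn fill_bidirectional(corpus: &[[u8; 3]], left: u8, right: u8) -> Vec<u8> {
    corpus
        .iter()
        .filter(|s| s[0] == left && s[2] == right)
        .map(|s| s[1])
        .collect()
}

fn main() {
    let corpus = [[0u8, 1, 2], [0, 3, 4]];
    println!("{:?}", fill_left_only(&corpus, 0));        // → [1, 3]  (ambiguous)
    println!("{:?}", fill_bidirectional(&corpus, 0, 2)); // → [1]    (unique)
}
```

No amount of better left-context statistics closes that gap, which is why the Markov-1 baseline loses despite having perfect bigram counts.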
## Suggested follow-up domains

Easy plug-ins (a corpus and a tokenizer are all that's needed):

## Suggested architectural follow-ups (already filed in #448)
## How to access the work
```sh
# fetch the branch, run the example, then the tests
git fetch origin sparse-mario
git checkout sparse-mario
cargo run --release -p ruvllm_retrieval_diffusion --example drum_patterns
cargo test -p ruvllm_retrieval_diffusion
```
Filed with `gh issue create` on behalf of @ruvnet from a Claude Code session that drove the generalisation work.