The inspiration for ALSI came from the intersection of CRSM (decoupling reasoning from generation) and MIT's RLM (Recursive Language Models).
If a State Space Model's recurrent state is effectively its working memory, then editing that state directly is equivalent to editing what the model currently "knows." If we can map external facts to latent state deltas, the model:
- Never Forgets: Key information is periodically re-injected into the state.
- Scales Infinitely: Memory is handled externally but accessed internally.
- Zero-Latency: Information is available during the forward pass, not via slow tool-calls.
Unlike Transformers, whose attention cache grows with context length, Mamba models compress everything into a fixed-size bottleneck. This makes them the perfect candidate for "state surgery"—there is a single, well-defined point of intervention.
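A minimal sketch of that intervention point, using a toy fixed-size state and NumPy. The names (`W_proj`, `embed_fact`, `inject`) and dimensions are illustrative assumptions, not ALSI's actual implementation; a real system would operate on a live SSM layer state with a trained encoder and projection:

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 16   # fixed-size recurrent state (the "bottleneck")
EMBED_DIM = 8    # dimension of the external fact embedding

# Toy recurrent state mid-generation (in a real SSM this lives inside the layer).
h = rng.normal(size=STATE_DIM)

# Hypothetical learned projection from fact embeddings to state deltas.
W_proj = rng.normal(scale=0.1, size=(STATE_DIM, EMBED_DIM))

def embed_fact(fact: str) -> np.ndarray:
    """Placeholder encoder: deterministic pseudo-embedding from the text.
    A real system would use a trained text encoder here."""
    seed = int.from_bytes(fact.encode("utf-8")[:8].ljust(8, b"\0"), "little")
    return np.random.default_rng(seed).normal(size=EMBED_DIM)

def inject(state: np.ndarray, fact: str, alpha: float = 0.5) -> np.ndarray:
    """State surgery: add a scaled delta; alpha bounds the perturbation,
    which matters given the manifold-brittleness risk discussed below."""
    delta = W_proj @ embed_fact(fact)
    return state + alpha * delta

h_new = inject(h, "John lives in Paris")
assert h_new.shape == h.shape  # the intervention preserves the fixed state size
```

Because the state never grows, the injection is O(STATE_DIM) per fact and happens inside the forward pass rather than through a tool call.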
- Manifold Brittleness: The state space might be too sensitive to perturbations (The Coherence Gap).
- Encoding Complexity: Mapping "John lives in Paris" to a vector that actually updates the model's "beliefs" may require a non-linear transformation beyond the capacity of a simple learned projection.