Proposal
Add a TransformerBridge adapter for FalconH1ForCausalLM (TII Falcon-H1), a hybrid that runs attention and Mamba-2 in parallel.
Motivation
In each Falcon-H1 block, attention heads and Mamba-2 heads process the same input side by side, and their outputs are concatenated. Because both paths see the same input, researchers can ablate one path and compare the results, which is a clean way to study what attention adds over a state-space layer. The family ships very small checkpoints (down to 90M and 0.5B parameters), so the adapter is cheap to verify.
This goes hand in hand with #1402 in extending our ability to research Mamba-based state-space models.
Gap scan (2026-06-18): 14 models, ~171K downloads.
Pitch
Expose the attention and Mamba-2 sub-paths in each block as separate hookable streams.
- Claude Code users can scaffold with /add-model-support tiiuae/Falcon-H1-0.5B-Base.
- Register at the four sites listed in contributing.md.
- Verify smallest-first: tiiuae/Falcon-H1-Tiny-90M-Instruct, then tiiuae/Falcon-H1-0.5B-Base.
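To make "separate hookable streams" concrete, here is a minimal pure-Python sketch of the idea, not the TransformerBridge API or the real Falcon-H1 computation. Everything in it is hypothetical and for illustration only: the `HookPoint` and `ToyParallelBlock` classes, the hook names `hook_attn_out` and `hook_ssm_out`, and the two placeholder sub-paths (a scaling and a running sum stand in for attention and Mamba-2). It only shows the shape of the design: both sub-paths read the same input, each output passes through its own hook point, and the results are concatenated.

```python
from typing import Callable, List, Optional

Vector = List[float]
Hook = Callable[[Vector], Optional[Vector]]


class HookPoint:
    """Identity pass-through that lets callers observe or replace a value."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hooks: List[Hook] = []

    def __call__(self, value: Vector) -> Vector:
        for hook in self.hooks:
            result = hook(value)
            if result is not None:  # a hook may return a replacement value
                value = result
        return value


class ToyParallelBlock:
    """Toy stand-in for one hybrid block: two sub-paths read the same input."""

    def __init__(self) -> None:
        # Hypothetical hook names, one per sub-path.
        self.hook_attn_out = HookPoint("hook_attn_out")
        self.hook_ssm_out = HookPoint("hook_ssm_out")

    def attn_path(self, x: Vector) -> Vector:
        # Placeholder only: a simple scaling, NOT real attention.
        return [2.0 * v for v in x]

    def ssm_path(self, x: Vector) -> Vector:
        # Placeholder only: a running sum, NOT a real Mamba-2 layer.
        out: Vector = []
        state = 0.0
        for v in x:
            state += v
            out.append(state)
        return out

    def forward(self, x: Vector) -> Vector:
        attn = self.hook_attn_out(self.attn_path(x))
        ssm = self.hook_ssm_out(self.ssm_path(x))
        return attn + ssm  # list concatenation of the two streams


def zero_ablate(value: Vector) -> Vector:
    """Replace a stream with zeros of the same length."""
    return [0.0] * len(value)


block = ToyParallelBlock()
x = [1.0, 2.0, 3.0]

baseline = block.forward(x)

# Ablate only the state-space stream, then restore the block.
block.hook_ssm_out.hooks.append(zero_ablate)
ablated = block.forward(x)
block.hook_ssm_out.hooks.clear()
```

Because each stream has its own hook point, ablating the state-space path leaves the attention half of the concatenated output untouched, which is the comparison the motivation describes. A real adapter would presumably attach such hook points to the actual attention and Mamba-2 modules rather than to toy functions.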
Additional context
hf_scraper architecture-gaps pass (2026-06-18).
Checklist