Feat/refactoring: migration to Transformers v5, removing custom MoE backends, and misc. improvements #9
Overview
This PR replaces the custom MoE backend system (vLLM, SGLang, FlashInfer, HF; notably, newer versions of these were quite buggy) with the recently released transformers v5's built-in `Qwen3MoeExperts` and `Qwen3MoeTopKRouter`. This drops ~250 lines of backend-specific code and lets transformers handle kernel dispatch automatically. Additionally, several quality-of-life (QOL) improvements, including linting, bug fixes, and documentation, have been added.
The older version that included the backends code will remain available on the `backends` branch for future reference.

Major Changes
Model (rnd/modeling_rnd.py)

- Refactored `RND1SparseMoeBlock` to use `Qwen3MoeExperts` + `Qwen3MoeTopKRouter` instead of manually routing tokens through per-expert MLPs (a minimal sketch of the pattern follows this list)
- Registered `rnd1` in transformers' `_MODEL_TO_CONVERSION_PATTERN` so per-expert checkpoint weights are automatically fused into the 3D tensor format during loading
- `inv_freq` recomputation in `from_pretrained` (these buffers are non-persistent and not stored in safetensors)
- Simplified `RND1DecoderLayer` and `RND1Attention` by removing backend-conditional RMSNorm class selection
- Replaced `_init_weights` with a no-op (weights always come from a checkpoint)
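The exact signatures of `Qwen3MoeExperts` and `Qwen3MoeTopKRouter` live in transformers v5 and are not reproduced here. Purely to illustrate the pattern the refactored block follows, below is a minimal, self-contained PyTorch sketch with hypothetical class names: a top-k router that produces per-token expert weights and indices, and a fused experts module whose weights are 3D tensors indexed by expert instead of a list of per-expert MLP modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyTopKRouter(nn.Module):
    """Minimal top-k router: scores each token against every expert and keeps the best k."""

    def __init__(self, hidden_size: int, num_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)

    def forward(self, hidden_states):
        # hidden_states: (num_tokens, hidden_size)
        routing_weights = F.softmax(self.gate(hidden_states), dim=-1)
        top_k_weights, top_k_index = torch.topk(routing_weights, self.top_k, dim=-1)
        top_k_weights = top_k_weights / top_k_weights.sum(dim=-1, keepdim=True)
        return top_k_weights, top_k_index


class TinyFusedExperts(nn.Module):
    """All experts stored as 3D tensors (num_experts, in, out) instead of per-expert MLPs."""

    def __init__(self, hidden_size: int, intermediate_size: int, num_experts: int):
        super().__init__()
        self.gate_up = nn.Parameter(torch.randn(num_experts, hidden_size, 2 * intermediate_size) * 0.02)
        self.down = nn.Parameter(torch.randn(num_experts, intermediate_size, hidden_size) * 0.02)

    def forward(self, hidden_states, top_k_weights, top_k_index):
        out = torch.zeros_like(hidden_states)
        for expert_id in range(self.gate_up.shape[0]):
            # Tokens (and their top-k slot) routed to this expert.
            token_idx, slot = torch.where(top_k_index == expert_id)
            if token_idx.numel() == 0:
                continue
            x = hidden_states[token_idx]
            gate, up = (x @ self.gate_up[expert_id]).chunk(2, dim=-1)
            expert_out = (F.silu(gate) * up) @ self.down[expert_id]
            out.index_add_(0, token_idx, expert_out * top_k_weights[token_idx, slot].unsqueeze(-1))
        return out


class TinySparseMoeBlock(nn.Module):
    """Router + fused experts, mirroring the shape of the refactored block."""

    def __init__(self, hidden_size=64, intermediate_size=128, num_experts=8, top_k=2):
        super().__init__()
        self.router = TinyTopKRouter(hidden_size, num_experts, top_k)
        self.experts = TinyFusedExperts(hidden_size, intermediate_size, num_experts)

    def forward(self, hidden_states):
        batch, seq_len, hidden = hidden_states.shape
        flat = hidden_states.reshape(-1, hidden)
        weights, index = self.router(flat)
        return self.experts(flat, weights, index).reshape(batch, seq_len, hidden)


block = TinySparseMoeBlock()
y = block(torch.randn(2, 5, 64))
print(y.shape)  # torch.Size([2, 5, 64])
```

Storing experts as stacked 3D tensors is also what makes the `_MODEL_TO_CONVERSION_PATTERN` registration useful: per-expert weights in older checkpoints can be fused into these stacked tensors at load time instead of being kept as separate modules.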
Config (rnd/configuration_rnd.py)

- Removed the `moe_backend` parameter
- Migrated to the `rope_parameters` format
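For illustration only, a hedged before/after of the affected config fields; the actual keys in rnd/configuration_rnd.py may differ, and the `rope_parameters` layout shown assumes the transformers v5 convention rather than quoting the PR diff.

```python
# Hypothetical before/after; values are placeholders.
old_style = {
    "moe_backend": "hf",     # removed: the experts implementation is now chosen by transformers
    "rope_theta": 10000.0,   # previously a standalone field
}

new_style = {
    # RoPE settings grouped into one dict, following the transformers v5
    # rope_parameters format (key names assumed, not taken from the PR diff).
    "rope_parameters": {"rope_type": "default", "rope_theta": 10000.0},
}
```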
Demo script (demo_rnd_generation.py)

- Replaced `--moe_backend` with `--experts-implementation` (optional, auto-detected when not set)
- Renamed CLI flags (`--top-k`, `--num-steps`, etc.) for consistency with other projects
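A sketch of what the argument parsing might look like after the rename; only the flag names `--experts-implementation`, `--top-k`, and `--num-steps` come from this PR, while the defaults and help text below are illustrative assumptions.

```python
import argparse

parser = argparse.ArgumentParser(
    description="RND1 generation demo (flag names per this PR; defaults illustrative)"
)
# Replaces the old --moe_backend flag; when omitted, the experts implementation
# is picked automatically (auto-detection behavior assumed here).
parser.add_argument("--experts-implementation", default=None,
                    help="Optional experts implementation override; auto-detected when not set")
# Kebab-case flags matching the renamed arguments; argparse exposes them as
# args.top_k and args.num_steps.
parser.add_argument("--top-k", type=int, default=2)
parser.add_argument("--num-steps", type=int, default=64)
args = parser.parse_args()
print(args.experts_implementation, args.top_k, args.num_steps)
```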
Project setup

- Bumped requirements to `transformers>=5.0.0` and `torch>=2.8`
- Updated the setup instructions (`uvx` commands)
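An optional sanity check against the new pins (the version numbers come from this PR; the snippet itself is illustrative and not part of the repository):

```python
from importlib.metadata import version

from packaging.version import Version

# Fail fast if the environment predates the pins introduced by this PR.
assert Version(version("transformers")) >= Version("5.0.0"), "transformers>=5.0.0 required"
assert Version(version("torch")) >= Version("2.8"), "torch>=2.8 required"
```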