@philip-essential

This adds support for Rnj-1, which is an 8B model we just released. We've been using llama.cpp to play around with the model internally, and we released a GGUF checkpoint for the instruction-tuned version.

The model architecture is similar enough to Gemma3 that in Transformers/vLLM/SGLang we can reuse the same model file. In llama.cpp, however, we need some small changes, so I've added a new implementation based closely on the Gemma3 one. The changes are:

  • All layers use global attention.
  • Long-context support is via YaRN (a sketch of the usual config shape follows this list).
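
For context, YaRN in a Hugging Face config.json is normally declared through a rope_scaling block. The shape below follows the standard Transformers convention; the values are placeholders for illustration, not the actual Rnj-1 settings.

```python
# Typical Transformers-style YaRN declaration, as it would appear under
# the "rope_scaling" key in config.json (placeholder values, not the
# actual Rnj-1 configuration):
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                             # context-length multiplier
    "original_max_position_embeddings": 8192,  # pre-scaling training context
}
```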

Because our Hugging Face config.json uses "Gemma3ForCausalLM" as the architecture, convert_hf_to_gguf.py cannot tell that these configs are for Rnj-1. The solution I came up with is to manually change the architecture to Rnj1ForCausalLM before converting the checkpoint, and I added a note in convert_hf_to_gguf.py about this. But perhaps there's a better solution?
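
Concretely, the manual workaround amounts to something like this minimal sketch (the checkpoint path is a placeholder for your local copy):

```python
import json

# Rewrite the architecture field so convert_hf_to_gguf.py picks the
# Rnj-1 implementation instead of Gemma3. Run this before conversion.
cfg_path = "path/to/rnj-1/config.json"  # placeholder path
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["architectures"] = ["Rnj1ForCausalLM"]  # was ["Gemma3ForCausalLM"]

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```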

@CISC
Collaborator

CISC commented Dec 6, 2025

Because our Hugging Face config.json uses "Gemma3ForCausalLM" as the architecture, convert_hf_to_gguf.py cannot tell that these configs are for Rnj-1. The solution I came up with is to manually change the architecture to Rnj1ForCausalLM before converting the checkpoint, and I added a note in convert_hf_to_gguf.py about this. But perhaps there's a better solution?

Instead, change llm_build_gemma3_iswa into a templated llm_build_gemma3 (as is done for smallthinker, for example) and add support for YaRN and non-SWA in the Gemma3Model conversion.
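
On the conversion side, that could look roughly like the sketch below: read the YaRN parameters out of rope_scaling in Gemma3Model.set_gguf_parameters and emit the corresponding GGUF metadata via the gguf-py writer. This is a sketch, not the actual patch, and it assumes Rnj-1 uses the standard Transformers rope_scaling keys.

```python
# Hypothetical addition to Gemma3Model.set_gguf_parameters() in
# convert_hf_to_gguf.py (a sketch, not the actual patch):
def set_gguf_parameters(self):
    super().set_gguf_parameters()

    rope_scaling = self.hparams.get("rope_scaling") or {}
    if rope_scaling.get("rope_type", rope_scaling.get("type")) == "yarn":
        # Emit YaRN metadata so llama.cpp can reconstruct the scaled RoPE.
        self.gguf_writer.add_rope_scaling_type(gguf.RopeScalingType.YARN)
        self.gguf_writer.add_rope_scaling_factor(rope_scaling["factor"])
        self.gguf_writer.add_rope_scaling_orig_ctx_len(
            rope_scaling["original_max_position_embeddings"]
        )
    # A checkpoint with all-global attention would also need to skip the
    # sliding-window metadata that the Gemma3 path normally writes.
```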

@faisal-fida

faisal-fida commented Dec 7, 2025

@philip-essential Just following up on PR #17811 (Rnj-1 support).

Currently hitting the error "unknown model architecture: 'rnj1'" when trying to load the GGUF. Any chance we can prioritize merging this so the community can use Rnj-1?

@sirmo

sirmo commented Dec 7, 2025

I tested the current fork of this PR and it works pretty well with the published GGUF Q4 quants. The model follows OpenCode (TUI coding agent) instructions well in my brief testing. Neat model!

This might be a great agentic model for efficient execution, though the 32K context size is a bit limiting for local coding agents. Thank you for all your work!

Hardware tested on: 7900 XTX with the ROCm backend.

@philip-essential
Author

Instead, change llm_build_gemma3_iswa into a templated llm_build_gemma3 (as is done for smallthinker, for example) and add support for YaRN and non-SWA in the Gemma3Model conversion.

That makes sense. I can try to do that soon.


Labels

model (Model specific), python (python script changes)
