[refactor] Simplify and unify the TornadoVM layer planner infrastructure#101

Open
orionpapadakis wants to merge 13 commits into main from refactor/simplify-layerplanner

Conversation


orionpapadakis commented Mar 27, 2026

This PR is a structural refactoring of the tornadovm package — no behavior changes, no new model capabilities. The goal is to eliminate duplication across the FFN layer and layer planner hierarchies and establish a cleaner, more maintainable package structure.

Changes:

FFN layer hierarchy (tornadovm/layers/)

  • Introduced AbstractLogitsLayer to centralize shared setup logic for all logits layers (LogitsFP16Layer, LogitsQ8_0Layer, and Granite variants), replacing duplicated task graph and
    scheduler boilerplate
  • Generalized AbstractFFNLayers<W, C> with type parameters for Weights and Configuration, giving typed access to weights and config in subclasses
  • Unified setupFFNLayerTaskGraphs() across all FFN subclasses; removed redundant fields and methods surfaced during the cleanup
  • Added MistralFP16FFNLayers and MistralQ8_0FFNLayers — Mistral-specific FFN layer implementations using MistralConfiguration, required after the generics tightening
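The generics tightening above can be sketched roughly as follows. All class bodies here are hypothetical stand-ins (field names, the string label, and the simplified `setupFFNLayerTaskGraphs()` are illustrative; the real method builds TornadoVM task graphs):

```java
// Stand-ins for the real tornadovm classes; this sketch shows only the
// typing pattern of AbstractFFNLayers<W, C>, not the actual implementation.
class MistralConfiguration { int hiddenSize = 4096; }   // hypothetical field
class LlamaTornadoWeights { String layout = "FP16"; }   // hypothetical field

abstract class AbstractFFNLayers<W, C> {
    protected final W weights;  // typed access: subclasses need no casts
    protected final C config;

    protected AbstractFFNLayers(W weights, C config) {
        this.weights = weights;
        this.config = config;
    }

    // Unified across all FFN subclasses; returns a label here instead of
    // building real task graphs.
    abstract String setupFFNLayerTaskGraphs();
}

// In the spirit of MistralFP16FFNLayers: the type parameters pin the
// exact weight and configuration classes at compile time.
class MistralFP16FFNLayers
        extends AbstractFFNLayers<LlamaTornadoWeights, MistralConfiguration> {
    MistralFP16FFNLayers(LlamaTornadoWeights w, MistralConfiguration c) {
        super(w, c);
    }

    @Override
    String setupFFNLayerTaskGraphs() {
        // config is statically a MistralConfiguration here.
        return "ffn-" + weights.layout + "-" + config.hiddenSize;
    }
}

public class FFNSketch {
    public static void main(String[] args) {
        var ffn = new MistralFP16FFNLayers(
                new LlamaTornadoWeights(), new MistralConfiguration());
        System.out.println(ffn.setupFFNLayerTaskGraphs()); // prints ffn-FP16-4096
    }
}
```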

Layer planner hierarchy (tornadovm/layerplanner/)

  • Centralized createTornadoInferencePlan(), all shared fields (activationLayer, ffnLayers, logitsLayer, scheduler, task graphs), and the GenericLayerPlanner interface implementations
    into QuantizedLayerPlanner, eliminating near-total duplication between FP16LayerPlanner and Q8_0LayerPlanner
  • Added MistralFP16LayerPlanner and MistralQ8_0LayerPlanner with correct generic types (LlamaState, MistralConfiguration, LlamaTornadoWeights), fixing a ClassCastException caused by
    Mistral being incorrectly routed to the Llama planners
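A minimal sketch of the routing bug and the typed fix, using hypothetical stand-in classes (the real Mistral planners take LlamaState, MistralConfiguration, and LlamaTornadoWeights as type arguments; field names and values below are made up for illustration):

```java
// Hypothetical stand-ins: MistralConfiguration is deliberately NOT a
// LlamaConfiguration, mirroring the mis-routing described above.
class LlamaConfiguration { }
class MistralConfiguration { int contextLength = 32768; } // hypothetical field

// Before: the planner held an untyped configuration and cast internally,
// so routing Mistral to a Llama planner only failed at run time.
class UncheckedLlamaPlanner {
    private final Object config;
    UncheckedLlamaPlanner(Object config) { this.config = config; }
    LlamaConfiguration plan() {
        return (LlamaConfiguration) config; // ClassCastException for Mistral
    }
}

// After, in the spirit of MistralFP16LayerPlanner: the configuration type
// is a generic parameter, so mis-routing no longer compiles.
class TypedPlanner<C> {
    private final C config;
    TypedPlanner(C config) { this.config = config; }
    C plan() { return config; }
}

public class PlannerSketch {
    public static void main(String[] args) {
        boolean castFailed = false;
        try {
            new UncheckedLlamaPlanner(new MistralConfiguration()).plan();
        } catch (ClassCastException e) {
            castFailed = true; // the bug the dedicated Mistral planners remove
        }
        System.out.println("unchecked cast failed: " + castFailed); // true

        TypedPlanner<MistralConfiguration> planner =
                new TypedPlanner<>(new MistralConfiguration());
        System.out.println(planner.plan().contextLength); // prints 32768
    }
}
```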

Package reorganization

  • GenericLayerPlanner and QuantizationPlannerFactory moved to the layerplanner/ root (previously in a base/ subpackage)
  • QuantizedLayerPlanner moved to layerplanner/ root
  • FP16LayerPlanner co-located with FP16 concrete planners in model/fp16/
  • Q8_0LayerPlanner co-located with Q8_0 concrete planners in model/q8_0/

DeepSeek-R1-Distill-Qwen fix

  • Introduced DeepSeekR1Qwen model class (extends Qwen2) that correctly overrides getModelType() and shouldAddBeginOfText(), fixing a repetition loop caused by missing BOS token and
    <think> prefix injection
  • Fixed Qwen3ChatFormat.getBeginOfText() to fall back to startHeader (<|begin▁of▁sentence|>) when no BOS alias is registered — DeepSeek reuses its first role-marker token as BOS
  • Fixed Qwen3Tokenizer.encode() to byte-map non-special text parts before BPE encoding, resolving a NoSuchElementException when encoding "\n" after splitting on special tokens like
    <think>
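The tokenizer fix can be illustrated with a toy encoder. This assumes a GPT-2-style byte-to-unicode mapping (only '\n' → 'Ċ' is modeled) and a two-entry stand-in vocabulary; the real Qwen3Tokenizer vocabulary, token IDs, and BPE merge loop are not reproduced here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class EncodeSketch {
    // Tiny stand-in vocabulary; the plain newline exists only in its
    // byte-mapped form, as in GPT-2-style vocabularies.
    static final Map<String, Integer> VOCAB =
            Map.of("<think>", 151667, "Ċ", 198); // illustrative IDs

    // Minimal stand-in for the byte-to-unicode mapping: only '\n' here.
    static String byteMap(String text) {
        return text.replace("\n", "Ċ");
    }

    static List<Integer> encode(String input) {
        List<Integer> ids = new ArrayList<>();
        // Split on the special token while keeping it as its own part.
        for (String part : input.split("(?<=<think>)|(?=<think>)")) {
            if (part.equals("<think>")) {
                ids.add(VOCAB.get("<think>")); // special token: direct lookup
            } else if (!part.isEmpty()) {
                // The fix: byte-map BEFORE the vocabulary/BPE lookup; a raw
                // "\n" is not in the vocabulary, which is what previously
                // surfaced as a NoSuchElementException.
                ids.add(VOCAB.get(byteMap(part)));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(encode("<think>\n")); // prints [151667, 198]
    }
}
```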

orionpapadakis force-pushed the refactor/simplify-layerplanner branch from 3fffabd to a3f1450 on March 27, 2026 at 10:57