Summary
In the a2a3 tensormap_and_ringbuffer runtime, when an AIV core in a 1C2V cluster is already executing a task, the scheduler should not dispatch a MIX task to that AIV's core group. The MIX task would conflict with the in-flight AIV task because it requires both the AIC and both AIVs of the group.
Motivation / Use Case
Observed while iterating on models/deepseek/v4/moe.py. A MIX task was dispatched to a core group whose AIV slot was already occupied by an earlier AIV-only task, producing a conflict (screenshot to be attached).
The current scheduler appears to admit a MIX task into a group based on AIC availability without checking that all AIVs in the group are free. Because MIX requires the entire 1C2V cluster, partial-occupancy AIV state must be considered before MIX admission.
Without this guard, mixed AIV-only + MIX workloads on the same cluster (e.g. MoE pipelines that interleave vector-only and mix kernels) can hit hard-to-diagnose dispatch conflicts.
Proposed API / Behavior
The MIX-queue admission check should be tightened from "AIC group free" to "AIC free AND every AIV in the same group free." Concretely, when picking a target group for a MIX task, the scheduler must consult the per-AIV occupancy state in addition to the group/AIC state, and skip any group in which an AIV is currently running an AIV-only task.
Equivalently, AIV-only dispatch should reserve the AIV slot in a way that MIX admission can observe before it reserves the cluster.
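The admission rule above can be sketched as follows. This is a minimal illustration, not the runtime's actual code: `CoreGroup`, `can_admit_mix`, `reserve_aiv`, `reserve_mix`, and `pick_group_for_mix` are hypothetical names, and the sketch assumes occupancy can be modeled as one busy flag for the AIC plus one busy flag per AIV.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CoreGroup:
    """Hypothetical model of one 1C2V cluster: one AIC plus two AIVs."""
    aic_busy: bool = False
    aiv_busy: List[bool] = field(default_factory=lambda: [False, False])

    def can_admit_mix(self) -> bool:
        # MIX needs the whole cluster: the AIC and every AIV must be free.
        return not self.aic_busy and not any(self.aiv_busy)

    def reserve_aiv(self) -> Optional[int]:
        # AIV-only dispatch marks a single AIV slot busy so that a later
        # MIX admission check can observe it. Returns the slot, or None.
        for slot, busy in enumerate(self.aiv_busy):
            if not busy:
                self.aiv_busy[slot] = True
                return slot
        return None

    def reserve_mix(self) -> bool:
        # Reserve the AIC and all AIVs together, or reserve nothing.
        if not self.can_admit_mix():
            return False
        self.aic_busy = True
        self.aiv_busy = [True] * len(self.aiv_busy)
        return True


def pick_group_for_mix(groups: List[CoreGroup]) -> Optional[int]:
    """Reserve and return the index of the first group that can take a MIX task."""
    for index, group in enumerate(groups):
        if group.reserve_mix():
            return index
    return None
```

Under these assumptions, if an AIV-only task has already reserved one AIV in group 0, `pick_group_for_mix` skips group 0 and selects the next fully free group, while a second AIV-only task can still use the remaining AIV of group 0.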
Alternatives Considered
- User-side serialization (insert a barrier so AIV-only tasks drain before MIX): pushes scheduler responsibility onto kernel authors and gives up overlap that is otherwise legal.
- Always reserve full cluster for AIV-only tasks: loses AIV parallelism for vector-only workloads.
Additional Context
- models/deepseek/v4/moe.py (current moe_rewrite branch)
- src/a2a3/runtime/tensormap_and_ringbuffer