State-Bottlenecked Reasoning for Verifiable Referee Decision Tracing in Sports Broadcast Video
This repository contains the reference implementation for the StateTrace research paper. It provides the complete state-bottlenecked reasoning architecture, three-stage training pipeline, transition verification framework, and evaluation tools described in the paper.
StateTrace is the third paper in the sports adjudication trilogy:
- RuleGround — Perception to predicates: grounds raw video into structured game-state predicates
- RefTrace — Evidence to traces: generates verifiable reasoning traces over a rule knowledge graph
- StateTrace (this repo) — Traces to state transitions: adds explicit state bottleneck for per-step verification
StateTrace extends RefTrace with explicit state-bottlenecked reasoning for automated referee decision analysis. Where RefTrace uses a HistoryEncoder (a Transformer over action/result pairs) to compress reasoning history, StateTrace introduces a typed state schema s_t = (E_t, V_t, R_t, C_t, D_t) that serves as a compact, verifiable bottleneck between consecutive reasoning steps. This enables deterministic per-transition verification, state-derived action masking, dense per-step reward signals, and a novel third training stage (Stage III, transition RL). We validate on NFL broadcast video using the NFL-MH benchmark.
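State-derived action masking can be pictured with a small example. This is a minimal sketch, not the repository's implementation: the six action names are the ones listed in this README's action space, while the dictionary-based state and the two masking rules in `allowed_actions` (no `STOP` while constraints remain open, no `VERIFY_TEMPORAL` without temporal requirements) are assumptions made purely for illustration.

```python
import math

ACTIONS = ["GET_RULE", "GET_STATE", "GET_EVENTS",
           "GET_VIDEO", "VERIFY_TEMPORAL", "STOP"]


def allowed_actions(state):
    """Hypothetical mask A(s_t): the set of actions considered legal in `state`."""
    allowed = set(ACTIONS)
    if state["open_constraints"]:
        # Assumed rule: the episode may not stop while constraints are open.
        allowed.discard("STOP")
    if not state["temporal_requirements"]:
        # Assumed rule: nothing temporal to verify yet.
        allowed.discard("VERIFY_TEMPORAL")
    return allowed


def masked_softmax(logits, state):
    """Softmax over ACTIONS with masked actions forced to probability 0."""
    allowed = allowed_actions(state)
    masked = [x if a in allowed else -math.inf for a, x in zip(ACTIONS, logits)]
    peak = max(masked)                      # at least four actions stay allowed
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return {a: e / total for a, e in zip(ACTIONS, exps)}


state = {"open_constraints": ["c1"], "temporal_requirements": []}
probs = masked_softmax([0.0] * len(ACTIONS), state)
```

With uniform logits, the probability mass is spread evenly over the four actions that survive the mask, and the masked actions can never be sampled.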
| Configuration | VTA | DA | TV | IAR |
|---|---|---|---|---|
| StateTrace-7B | 74.8 | 90.1 | 96.3 | 2.1% |
| RefTrace-7B (baseline) | 72.1 | 89.2 | -- | -- |
| w/o Stage III (ablation) | 73.2 | 89.8 | 91.7 | 5.4% |
| w/o action masking | 72.9 | 89.5 | 88.2 | 8.9% |
---
config:
  layout: elk
  look: neo
  theme: neo
---
flowchart TB
Q["<b>Query</b><br>(play description)"] --> QE
Video["<b>Video Frames</b><br>[B, T, C, H, W]"] --> VLM
subgraph Policy ["StateTracePolicy"]
VLM["<b>QwenVLBackbone</b><br>Qwen3-VL-8B"] --> GE["<b>GraphEncoder</b><br>Heterogeneous GAT"]
QE["<b>QueryEncoder</b><br>Sentence-T5"]
SE["<b>StateEncoder</b><br>2-layer Transformer"]
GE & QE & SE --> F["<b>FusionLayer</b>"]
F --> AH["<b>ActionHead</b><br>+ action mask A(s_t)"]
end
subgraph Act ["Action Space"]
direction LR
A1["GET_RULE"] & A2["GET_STATE"] & A3["GET_EVENTS"]
A4["GET_VIDEO"] & A5["VERIFY_TEMPORAL"] & A6["STOP"]
end
AH --> Act
Act -->|"execute"| GSTH["<b>GSTH</b><br>Game-State Trace Hypergraph<br>4 node types · 6 edge types"]
GSTH -->|"result"| UPD["<b>U_θ</b><br>State Update"]
UPD -->|"s_{t+1}"| SE
A6 -->|"decision"| D["<b>CALL_CORRECT · CALL_INCORRECT · NO_FOUL</b>"]
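The control flow in the diagram can be read as a loop: the policy picks an action, the GSTH executes it, the update function produces s_{t+1}, and the episode ends at `STOP`. The sketch below illustrates only that loop. The functions `pick_action`, `execute`, and `update_state`, the `max_steps` cap, and the placeholder identifiers `event_0` and `rule_0` are hypothetical stand-ins, not the repository's API.

```python
def run_episode(pick_action, execute, update_state, state, max_steps=8):
    """Roll out one reasoning episode and return (final_state, trace)."""
    trace = []
    for _ in range(max_steps):
        action = pick_action(state)
        if action == "STOP":
            break
        result = execute(action)                      # stand-in for a GSTH query
        state = update_state(state, action, result)   # stand-in for U_theta
        trace.append((action, result))
    return state, trace


def pick_action(state):
    # Scripted toy policy: gather events, then rules, then stop.
    if not state["events_seen"]:
        return "GET_EVENTS"
    if not state["candidate_rule_ids"]:
        return "GET_RULE"
    return "STOP"


def execute(action):
    # Placeholder results; a real system would query the hypergraph here.
    return {"GET_EVENTS": "event_0", "GET_RULE": "rule_0"}[action]


def update_state(state, action, result):
    # Copy the state so every transition yields a fresh s_{t+1}.
    new_state = {key: list(values) for key, values in state.items()}
    key = "events_seen" if action == "GET_EVENTS" else "candidate_rule_ids"
    new_state[key].append(result)
    return new_state


initial = {"events_seen": [], "candidate_rule_ids": []}
final_state, trace = run_episode(pick_action, execute, update_state, initial)
```

Because each step returns a new state rather than mutating the old one, every (s_t, action, s_{t+1}) triple remains available for later inspection.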
| Component | Symbol | Description | Fields |
|---|---|---|---|
| Entities | E_t | Events and candidates | events_seen, candidate_event_ids |
| Evidence | V_t | Retrieved evidence | retrieved_state_features, video_segments, evidence_bindings, justification_pointers |
| Rules | R_t | Candidate rules | candidate_rule_ids |
| Constraints | C_t | Open constraints | open_constraints, temporal_requirements, conflicts |
| Decision | D_t | Decision status | decision_status |
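The schema in the table maps naturally onto a typed record. The following is a minimal sketch that assumes list-valued fields and a string decision status. The class name `StateSketch`, the field types, the `"undecided"` default, and the monotonicity rule in `verify_transition` are illustrative assumptions, not the definitions in `statetrace.state`.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class StateSketch:
    # E_t: events and candidates
    events_seen: List[str] = field(default_factory=list)
    candidate_event_ids: List[str] = field(default_factory=list)
    # V_t: retrieved evidence
    retrieved_state_features: List[str] = field(default_factory=list)
    video_segments: List[str] = field(default_factory=list)
    evidence_bindings: List[str] = field(default_factory=list)
    justification_pointers: List[str] = field(default_factory=list)
    # R_t: candidate rules
    candidate_rule_ids: List[str] = field(default_factory=list)
    # C_t: open constraints
    open_constraints: List[str] = field(default_factory=list)
    temporal_requirements: List[str] = field(default_factory=list)
    conflicts: List[str] = field(default_factory=list)
    # D_t: decision status (default value is an assumption)
    decision_status: str = "undecided"


def verify_transition(prev, nxt):
    """Assumed invariant: evidence gathered so far is never dropped."""
    monotone = ("events_seen", "video_segments", "evidence_bindings")
    return all(set(getattr(prev, name)) <= set(getattr(nxt, name))
               for name in monotone)


s0 = StateSketch()
s1 = replace(s0, events_seen=["event_0"])
forward_ok = verify_transition(s0, s1)    # evidence only grows
backward_ok = verify_transition(s1, s0)   # evidence was dropped
```

A check of this kind is deterministic because it compares two explicit states field by field, without consulting the model that produced them.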
# Clone and install
git clone https://github.com/sreevadde/statetrace.git && cd statetrace
pip install -e ".[dev]"
# Build the Game-State Trace Hypergraph
statetrace build-gsth --config configs/base.yaml
# Stage 1: Supervised Fine-Tuning
statetrace train --config configs/training/sft.yaml
# Stage 2: Terminal RL (GRPO)
statetrace train --config configs/training/grpo.yaml
# Stage 3: Transition RL (novel)
statetrace train --config configs/training/transition_rl.yaml
# Evaluate on NFL-MH-Core
statetrace eval --config configs/base.yaml \
    --checkpoint checkpoints/stage3/final.pt

from statetrace.models import StateTracePolicy
from statetrace.graph import GSTH
from statetrace.state import State, StateSerializer
# Load a pre-built GSTH
gsth = GSTH.load("data/gsth/gsth.pkl")
pyg_data = gsth.to_pyg()
# Build the policy from config
# (cfg is assumed to be a configuration object loaded beforehand, e.g. from configs/base.yaml)
policy = StateTracePolicy.from_config(cfg)
# Initialize state and serialize
state = State()
serialized = StateSerializer().serialize(state)
# Run a single reasoning step
# (video_tensor is assumed to be prepared by the caller)
dist = policy(
    query="Was the defensive pass interference call correct?",
    gsth_data=pyg_data,
    video_frames=video_tensor,        # (B, T, C, H, W)
    serialized_state=serialized,      # structured text
)
# Sample an action with state-derived masking
action = policy.select_action(
    query="Was the defensive pass interference call correct?",
    gsth_data=pyg_data,
    video_frames=video_tensor,
    state=state,                      # used for A(s_t) mask
)

StateTrace uses OmegaConf for hierarchical YAML configuration with CLI overrides.
configs/
├── base.yaml                  # Default hyperparameters for all components
├── model/
│   ├── base.yaml              # Qwen3-VL-8B (default) + LoRA rank 64
│   ├── large.yaml             # Qwen3-VL-32B
│   ├── small.yaml             # Qwen3-VL-2B
│   ├── qwen25.yaml            # Qwen2.5-VL-7B (paper baseline)
│   └── qwen35.yaml            # Qwen3.5-9B (latest, unified VL)
├── training/
│   ├── sft.yaml               # Stage 1: supervised fine-tuning
│   ├── grpo.yaml              # Stage 2: terminal RL with NTR
│   └── transition_rl.yaml     # Stage 3: transition RL (novel)
└── nfl/
    ├── core.yaml              # NFL-MH-Core (frame-accurate, expert-labeled)
    ├── auto.yaml              # NFL-MH-Auto (broadcast-scale, auto-extracted)
    └── combined.yaml          # Combined Core + Auto
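Dotted-key overrides such as `training.lr=1e-5` on top of a hierarchical configuration can be illustrated in plain Python. This is a sketch of the general mechanism only: `apply_overrides` and `parse_scalar` are hypothetical helpers, and the base values for `lr` and `batch_size` are placeholders rather than the repository's defaults. OmegaConf itself provides `OmegaConf.from_dotlist` and `OmegaConf.merge` for this purpose; whether the `statetrace` CLI uses them for `--set` is not stated here.

```python
import copy


def parse_scalar(text):
    """Interpret an override value as int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b.c=value` overrides applied."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})   # create missing sections
        node[leaf] = parse_scalar(raw_value)
    return merged


# Placeholder base values, chosen only for the demonstration.
base = {"training": {"lr": 2e-5, "batch_size": 4, "seed": 42}}
merged = apply_overrides(base, ["training.lr=1e-5", "training.batch_size=8"])
```

The base dictionary is left untouched, so the same defaults can be reused for several runs with different overrides.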
Override any parameter from the command line:
statetrace train --config configs/training/sft.yaml \
    --set training.lr=1e-5 \
    --set training.batch_size=8

- Training: 4x NVIDIA A100 80GB (SFT ~6h, GRPO ~10h, Stage III ~8h)
- Inference: 1x A100 40GB (or 2x A6000)
- GSTH construction: CPU-only, ~15 minutes
# Build GSTH from play-by-play data
statetrace build-gsth --config configs/base.yaml
# Stage 1: SFT on expert reasoning traces
statetrace train --config configs/training/sft.yaml
# Stage 2: Terminal RL with Normalized Trace Reward
statetrace train --config configs/training/grpo.yaml
# Stage 3: Transition RL with dense rewards
statetrace train --config configs/training/transition_rl.yaml
# Evaluate on NFL-MH-Core test set
statetrace eval --config configs/nfl/core.yaml \
    --checkpoint checkpoints/stage3/final.pt
# Evaluate on NFL-MH-Auto test set
statetrace eval --config configs/nfl/auto.yaml \
    --checkpoint checkpoints/stage3/final.pt

Results are deterministic for a fixed seed (training.seed=42). Use --set training.seed=43 and --set training.seed=44 for the additional seeds reported in the paper.
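Repeating a training stage across the three seeds can be scripted. The sketch below only builds the command lines from the flags shown in this README and prints them; the helper `train_command` is hypothetical, and the sketch does not address how checkpoints from different seeds are kept apart, since that is not described here.

```python
import shlex

SEEDS = [42, 43, 44]


def train_command(config_path, seed):
    """Build the argument list for one training run with a given seed."""
    return ["statetrace", "train",
            "--config", config_path,
            "--set", f"training.seed={seed}"]


commands = [train_command("configs/training/sft.yaml", seed) for seed in SEEDS]
for command in commands:
    # Print a shell-safe rendering; nothing is executed here.
    print(shlex.join(command))
```

To actually launch the runs, each argument list could be passed to `subprocess.run(command, check=True)`.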
@article{vadde2026statetrace,
title = {StateTrace: State-Bottlenecked Reasoning for Verifiable
Referee Decision Tracing in Sports Broadcast Video},
author = {Vadde, Sree Krishna},
journal = {arXiv preprint},
year = {2026}
}

License: MIT