sreevadde/statetrace

StateTrace

State-Bottlenecked Reasoning for Verifiable Referee Decision Tracing in Sports Broadcast Video

Python 3.10+ · License: MIT · arXiv

This repository contains the reference implementation for the StateTrace research paper. It provides the complete state-bottlenecked reasoning architecture, three-stage training pipeline, transition verification framework, and evaluation tools described in the paper.

StateTrace is the third paper in the sports adjudication trilogy:

  1. RuleGround (perception to predicates): grounds raw video into structured game-state predicates
  2. RefTrace (evidence to traces): generates verifiable reasoning traces over a rule knowledge graph
  3. StateTrace (this repo; traces to state transitions): adds an explicit state bottleneck for per-step verification

Overview

StateTrace extends RefTrace with explicit state-bottlenecked reasoning for automated referee decision analysis. Where RefTrace uses a HistoryEncoder (a Transformer over action/result pairs) to compress reasoning history, StateTrace introduces a typed state schema s_t = (E_t, V_t, R_t, C_t, D_t) that serves as a compact, verifiable bottleneck between reasoning steps. This enables deterministic per-transition verification, state-derived action masking, dense per-step reward signals, and a novel third training stage (Stage III, transition RL) that uses those dense rewards. We validate on NFL broadcast video using the NFL-MH benchmark.
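
To make "deterministic per-transition verification" and "dense per-step reward" concrete, here is a minimal sketch. It is not the verifier or reward from the paper: the two check rules (accumulated fields may only grow; only STOP may change the decision), the reward values, the "OPEN" status, and the names `verify_transition` and `step_rewards` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A state is modelled as a plain dict here, a simplification of the
# typed schema s_t = (E_t, V_t, R_t, C_t, D_t).
State = Dict[str, object]

# Assumption: these fields accumulate and are never silently shrunk.
GROW_ONLY_FIELDS = ("events_seen", "candidate_rule_ids")


def verify_transition(prev: State, action: str, nxt: State) -> bool:
    """Deterministically check one transition s_t --action--> s_{t+1}."""
    # Assumed rule 1: accumulated fields may only grow.
    for name in GROW_ONLY_FIELDS:
        if not set(prev[name]) <= set(nxt[name]):
            return False
    # Assumed rule 2: only STOP may change the decision status.
    if action != "STOP" and prev["decision_status"] != nxt["decision_status"]:
        return False
    return True


def step_rewards(trace: List[Tuple[State, str, State]]) -> List[float]:
    """Dense reward: +1.0 per verified transition, -1.0 otherwise (assumed values)."""
    return [1.0 if verify_transition(p, a, n) else -1.0 for p, a, n in trace]


s0 = {"events_seen": set(), "candidate_rule_ids": set(), "decision_status": "OPEN"}
s1 = {"events_seen": {"e1"}, "candidate_rule_ids": set(), "decision_status": "OPEN"}
bad = {"events_seen": set(), "candidate_rule_ids": set(), "decision_status": "OPEN"}
s2 = {"events_seen": {"e1"}, "candidate_rule_ids": set(), "decision_status": "CALL_CORRECT"}

trace = [(s0, "GET_EVENTS", s1), (s1, "GET_RULE", bad), (s1, "STOP", s2)]
print(step_rewards(trace))
```

Because every step yields its own pass/fail signal, a trace with one bad transition is penalized at that step rather than only at the final decision.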

Key Results on NFL-MH-Core

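| Configuration            | VTA  | DA   | TV   | IAR  |
|--------------------------|------|------|------|------|
| StateTrace-7B            | 74.8 | 90.1 | 96.3 | 2.1% |
| RefTrace-7B (baseline)   | 72.1 | 89.2 | --   | --   |
| w/o Stage III (ablation) | 73.2 | 89.8 | 91.7 | 5.4% |
| w/o action masking       | 72.9 | 89.5 | 88.2 | 8.9% |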

Architecture

---
config:
  layout: elk
  look: neo
  theme: neo
---
flowchart TB
    Q["<b>Query</b><br>(play description)"] --> QE
    Video["<b>Video Frames</b><br>[B, T, C, H, W]"] --> VLM

    subgraph Policy ["StateTracePolicy"]
        VLM["<b>QwenVLBackbone</b><br>Qwen3-VL-8B"] --> GE["<b>GraphEncoder</b><br>Heterogeneous GAT"]
        QE["<b>QueryEncoder</b><br>Sentence-T5"]
        SE["<b>StateEncoder</b><br>2-layer Transformer"]
        GE & QE & SE --> F["<b>FusionLayer</b>"]
        F --> AH["<b>ActionHead</b><br>+ action mask A(s_t)"]
    end

    subgraph Act ["Action Space"]
        direction LR
        A1["GET_RULE"] & A2["GET_STATE"] & A3["GET_EVENTS"]
        A4["GET_VIDEO"] & A5["VERIFY_TEMPORAL"] & A6["STOP"]
    end

    AH --> Act
    Act -->|"execute"| GSTH["<b>GSTH</b><br>Game-State Trace Hypergraph<br>4 node types · 6 edge types"]
    GSTH -->|"result"| UPD["<b>U_θ</b><br>State Update"]
    UPD -->|"s_{t+1}"| SE
    A6 -->|"decision"| D["<b>CALL_CORRECT · CALL_INCORRECT · NO_FOUL</b>"]
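
The diagram's "ActionHead + action mask A(s_t)" step can be sketched as a masked softmax over the six actions. The masking rules below (for example, that STOP requires at least one seen event and one candidate rule) are assumptions chosen for illustration; the source does not specify how A(s_t) is derived. `action_mask` and `masked_softmax` are hypothetical names.

```python
import math
from typing import Dict, List

ACTIONS = ["GET_RULE", "GET_STATE", "GET_EVENTS", "GET_VIDEO", "VERIFY_TEMPORAL", "STOP"]


def action_mask(state: Dict[str, list]) -> List[bool]:
    """Return A(s_t): True where an action is allowed (rules are assumed)."""
    has_events = bool(state["events_seen"])
    has_rules = bool(state["candidate_rule_ids"])
    return [
        True,                      # GET_RULE: always allowed
        True,                      # GET_STATE: always allowed
        True,                      # GET_EVENTS: always allowed
        has_events,                # GET_VIDEO: needs an event to look at
        has_events,                # VERIFY_TEMPORAL: needs events to order
        has_events and has_rules,  # STOP: needs evidence and a rule
    ]


def masked_softmax(logits: List[float], mask: List[bool]) -> List[float]:
    """Softmax over allowed actions; masked actions get probability 0."""
    allowed = [l for l, m in zip(logits, mask) if m]
    peak = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


empty_state = {"events_seen": [], "candidate_rule_ids": []}
probs = masked_softmax([0.0] * 6, action_mask(empty_state))
print(dict(zip(ACTIONS, probs)))
```

With an empty state, only the three retrieval actions that need no prior evidence receive probability mass, so the policy cannot stop or verify before it has gathered anything.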

State Schema (5 components, 13 fields)

| Component   | Symbol | Description           | Fields |
|-------------|--------|-----------------------|--------|
| Entities    | E_t    | Events and candidates | events_seen, candidate_event_ids |
| Evidence    | V_t    | Retrieved evidence    | retrieved_state_features, video_segments, evidence_bindings, justification_pointers |
| Rules       | R_t    | Candidate rules       | candidate_rule_ids |
| Constraints | C_t    | Open constraints      | open_constraints, temporal_requirements, conflicts |
| Decision    | D_t    | Decision status       | decision_status |
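
As a rough illustration of how such a typed schema can be laid out, the sketch below mirrors the components and field names from the table above using standard-library dataclasses. It is not the repository's `statetrace.state.State` class: the class names, the field types, the "OPEN" default, and the example IDs are assumptions, and only the fields listed in the table are included.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Entities:  # E_t
    events_seen: List[str] = field(default_factory=list)
    candidate_event_ids: List[str] = field(default_factory=list)


@dataclass
class Evidence:  # V_t
    retrieved_state_features: Dict[str, str] = field(default_factory=dict)
    video_segments: List[str] = field(default_factory=list)
    evidence_bindings: Dict[str, str] = field(default_factory=dict)
    justification_pointers: List[str] = field(default_factory=list)


@dataclass
class Rules:  # R_t
    candidate_rule_ids: List[str] = field(default_factory=list)


@dataclass
class Constraints:  # C_t
    open_constraints: List[str] = field(default_factory=list)
    temporal_requirements: List[str] = field(default_factory=list)
    conflicts: List[str] = field(default_factory=list)


@dataclass
class Decision:  # D_t
    decision_status: str = "OPEN"  # assumed placeholder value


@dataclass
class SketchState:
    """Hypothetical container for s_t = (E_t, V_t, R_t, C_t, D_t)."""
    entities: Entities = field(default_factory=Entities)
    evidence: Evidence = field(default_factory=Evidence)
    rules: Rules = field(default_factory=Rules)
    constraints: Constraints = field(default_factory=Constraints)
    decision: Decision = field(default_factory=Decision)


state = SketchState()
state.entities.events_seen.append("e1")            # made-up event ID
state.rules.candidate_rule_ids.append("rule_dpi")  # made-up rule ID
print(json.dumps(asdict(state), indent=2))         # structured text form
```

Keeping every field typed and serializable is what allows a state to be passed to the policy as structured text and checked field by field after each step.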

Quick Start

# Clone and install
git clone https://github.com/sreevadde/statetrace.git && cd statetrace
pip install -e ".[dev]"

# Build the Game-State Trace Hypergraph
statetrace build-gsth --config configs/base.yaml

# Stage 1: Supervised Fine-Tuning
statetrace train --config configs/training/sft.yaml

# Stage 2: Terminal RL (GRPO)
statetrace train --config configs/training/grpo.yaml

# Stage 3: Transition RL (novel)
statetrace train --config configs/training/transition_rl.yaml

# Evaluate on NFL-MH-Core
statetrace eval --config configs/base.yaml \
    --checkpoint checkpoints/stage3/final.pt

Python API

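import torch
from omegaconf import OmegaConf

from statetrace.models import StateTracePolicy
from statetrace.graph import GSTH
from statetrace.state import State, StateSerializer

# Load the configuration (path assumed; use the config you trained with)
cfg = OmegaConf.load("configs/base.yaml")

# Load a pre-built GSTH
gsth = GSTH.load("data/gsth/gsth.pkl")
pyg_data = gsth.to_pyg()

# Build the policy from config
policy = StateTracePolicy.from_config(cfg)

# Placeholder video input of shape (B, T, C, H, W); the sizes are illustrative only
video_tensor = torch.zeros(1, 8, 3, 224, 224)

# Initialize state and serialize
state = State()
serialized = StateSerializer().serialize(state)
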
# Run a single reasoning step
dist = policy(
    query="Was the defensive pass interference call correct?",
    gsth_data=pyg_data,
    video_frames=video_tensor,      # (B, T, C, H, W)
    serialized_state=serialized,    # structured text
)

# Sample an action with state-derived masking
action = policy.select_action(
    query="Was the defensive pass interference call correct?",
    gsth_data=pyg_data,
    video_frames=video_tensor,
    state=state,                    # used for A(s_t) mask
)

Configuration

StateTrace uses OmegaConf for hierarchical YAML configuration with CLI overrides.

configs/
├── base.yaml               # Default hyperparameters for all components
├── model/
│   ├── base.yaml            # Qwen3-VL-8B (default) + LoRA rank 64
│   ├── large.yaml           # Qwen3-VL-32B
│   ├── small.yaml           # Qwen3-VL-2B
│   ├── qwen25.yaml          # Qwen2.5-VL-7B (paper baseline)
│   └── qwen35.yaml          # Qwen3.5-9B (latest, unified VL)
├── training/
│   ├── sft.yaml             # Stage 1: supervised fine-tuning
│   ├── grpo.yaml            # Stage 2: terminal RL with NTR
│   └── transition_rl.yaml   # Stage 3: transition RL (novel)
└── nfl/
    ├── core.yaml            # NFL-MH-Core (frame-accurate, expert-labeled)
    ├── auto.yaml            # NFL-MH-Auto (broadcast-scale, auto-extracted)
    └── combined.yaml        # Combined Core + Auto

Override any parameter from the command line:

statetrace train --config configs/training/sft.yaml \
    --set training.lr=1e-5 \
    --set training.batch_size=8
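
To illustrate what a dotted override such as `training.lr=1e-5` does to a hierarchical config, here is a standard-library sketch. It is not how the `statetrace` CLI implements `--set` (the source does not show that); in OmegaConf itself, `OmegaConf.from_dotlist` combined with `OmegaConf.merge` gives comparable behaviour. The function name `apply_overrides` and the default values in `base` are made up for the example.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply 'a.b.c=value' overrides to a nested dict, in place."""
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        try:
            value = ast.literal_eval(raw_value)  # numbers, lists, quoted strings
        except (ValueError, SyntaxError):
            value = raw_value                    # fall back to a plain string
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})     # create missing sections
        node[leaf] = value
    return config


base = {"training": {"lr": 2e-5, "batch_size": 4, "seed": 42}}  # made-up defaults
merged = apply_overrides(base, ["training.lr=1e-5", "training.batch_size=8"])
print(merged)
```

Only the named leaves change; sibling keys such as `training.seed` keep the values from the YAML file.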

Reproduction

Hardware Requirements

  • Training: 4x NVIDIA A100 80GB (SFT ~6h, GRPO ~10h, Stage III ~8h)
  • Inference: 1x A100 40GB (or 2x A6000)
  • GSTH construction: CPU-only, ~15 minutes

Reproducing Paper Results

# Build GSTH from play-by-play data
statetrace build-gsth --config configs/base.yaml

# Stage 1: SFT on expert reasoning traces
statetrace train --config configs/training/sft.yaml

# Stage 2: Terminal RL with Normalized Trace Reward
statetrace train --config configs/training/grpo.yaml

# Stage 3: Transition RL with dense rewards
statetrace train --config configs/training/transition_rl.yaml

# Evaluate on NFL-MH-Core test set
statetrace eval --config configs/nfl/core.yaml \
    --checkpoint checkpoints/stage3/final.pt

# Evaluate on NFL-MH-Auto test set
statetrace eval --config configs/nfl/auto.yaml \
    --checkpoint checkpoints/stage3/final.pt

Results are deterministic given a fixed seed (training.seed=42). To reproduce the additional seeds reported in the paper, rerun with --set training.seed=43 and again with --set training.seed=44; pass one seed per run.
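
A small helper can list the training commands for all three seeds so that each seed gets its own run. This is a sketch that uses only the commands and flags shown above; it prints the commands rather than executing them. How checkpoints are passed between stages, or kept separate per seed, is not covered by the source and is not handled here. `training_commands` is a hypothetical name.

```python
import shlex
from typing import List

STAGE_CONFIGS = [
    "configs/training/sft.yaml",            # Stage 1
    "configs/training/grpo.yaml",           # Stage 2
    "configs/training/transition_rl.yaml",  # Stage 3
]
SEEDS = [42, 43, 44]


def training_commands(seeds: List[int], configs: List[str]) -> List[List[str]]:
    """Build one argv list per (seed, stage), with stages in order for each seed."""
    return [
        ["statetrace", "train", "--config", cfg, "--set", f"training.seed={seed}"]
        for seed in seeds
        for cfg in configs
    ]


for argv in training_commands(SEEDS, STAGE_CONFIGS):
    print(shlex.join(argv))  # copy-pasteable shell command
```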


Citation

@article{vadde2026statetrace,
  title   = {StateTrace: State-Bottlenecked Reasoning for Verifiable
             Referee Decision Tracing in Sports Broadcast Video},
  author  = {Vadde, Sree Krishna},
  journal = {arXiv preprint},
  year    = {2026}
}

License

MIT

About

Reference implementation for "StateTrace: Explicit State Modeling for Verifiable Multimodal Rule Reasoning." State-bottlenecked reasoning framework with typed state schema, constrained action decoding, transition verification, and dense transition-level rewards for multimodal sports rule adjudication.
