analysis
Evaluation, interpretation, and archetype discovery from the trained model.
Progressive levels of "does it work?", where each level subsumes the previous one.
The model generates token sequences that could describe real gameplay.
Tests:
- Structural validity — tick boundaries, snapshot format, BOS/EOS placement
- Physical validity — movements stay in bounds, entities don't teleport
- Rule validity — collisions produce correct effects, buffs modify outcomes
- Temporal validity — events happen in plausible order (can't collect a coin that was already collected)
Metric: Validity rate = fraction of sampled sequences that pass all checks at temperature T.
Baseline: Random token sampling should yield ~0% validity. A trained model should score well above 50%.
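A minimal sketch of the validity-rate metric, assuming each sampled sequence is a list of string tokens. The token names (`BOS`, `EOS`, `COLLECT_COIN:<id>`) and the two checks are hypothetical stand-ins; the real checks depend on the tokenizer and game rules.

```python
from typing import Callable, Iterable, Sequence

Tokens = Sequence[str]
Check = Callable[[Tokens], bool]


def has_bos_eos(tokens: Tokens) -> bool:
    """Structural check: exactly one BOS at the start and one EOS at the end."""
    return (
        len(tokens) >= 2
        and tokens[0] == "BOS"
        and tokens[-1] == "EOS"
        and list(tokens).count("BOS") == 1
        and list(tokens).count("EOS") == 1
    )


def no_double_collect(tokens: Tokens) -> bool:
    """Temporal check: the same coin token is never collected twice."""
    seen = set()
    for tok in tokens:
        if tok.startswith("COLLECT_COIN:"):
            if tok in seen:
                return False
            seen.add(tok)
    return True


def validity_rate(samples: Iterable[Tokens], checks: Sequence[Check]) -> float:
    """Fraction of samples that pass every check (0.0 for an empty sample set)."""
    samples = list(samples)
    if not samples:
        return 0.0
    passed = sum(all(check(s) for check in checks) for s in samples)
    return passed / len(samples)
```

The rate is only meaningful for a fixed sampling temperature, so it is worth reporting it per temperature rather than pooling samples.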
The model has learned that context changes outcomes.
Tests:
- Seed with buff-active state → model predicts enemy removal on collision (not death)
- Seed with snake length > threshold → model predicts self-collision as possible
- Seed with high score → model predicts faster enemies (if the game has difficulty scaling)
Method: Compare P(outcome | context_A) vs P(outcome | context_B) where the two contexts differ in one rule-relevant variable.
Metric: KL divergence between conditional distributions. High divergence = the model distinguishes contexts.
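As a sketch of the metric, the function below computes KL(P || Q) between two outcome distributions represented as dictionaries. The outcome names and probabilities are made up for illustration, and the small `eps` smoothing is an assumption added so that outcomes missing from one context do not produce an infinite value.

```python
import math
from typing import Mapping


def kl_divergence(p: Mapping[str, float], q: Mapping[str, float],
                  eps: float = 1e-9) -> float:
    """KL(p || q) in nats over the union of outcomes, with eps smoothing."""
    outcomes = set(p) | set(q)
    # Add eps to every outcome, then renormalise so both sum to 1.
    p_s = {o: p.get(o, 0.0) + eps for o in outcomes}
    q_s = {o: q.get(o, 0.0) + eps for o in outcomes}
    p_z = sum(p_s.values())
    q_z = sum(q_s.values())
    return sum(
        (p_s[o] / p_z) * math.log((p_s[o] / p_z) / (q_s[o] / q_z))
        for o in outcomes
    )


# Hypothetical next-event distributions after a collision, read off the model
# for two contexts that differ only in whether the buff is active.
with_buff = {"enemy_removed": 0.9, "player_death": 0.1}
without_buff = {"enemy_removed": 0.05, "player_death": 0.95}
divergence = kl_divergence(with_buff, without_buff)
```

KL divergence is asymmetric, so it helps to fix which context plays the role of P and report that choice alongside the number.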
The model has learned that different behavioral patterns exist as statistical regularities.
Tests:
- Generate long traces and cluster them by behavioral features
- Seed with different "opening moves" and observe if distinct strategies emerge
- Compare generated trace distributions to known agent types used in training
Method:
- Extract features from generated traces (% time near walls, food collection rate, enemy proximity tolerance, buff usage patterns)
- Cluster in feature space (k-means, DBSCAN, or just PCA + visual inspection)
- Compare clusters to ground-truth agent labels (if training data includes labeled agents)
Metric: Cluster purity if labels exist. Silhouette score if not. Qualitative: can a human read two sampled traces and say "these are different play styles"?
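Cluster purity can be sketched as follows, assuming one cluster id and one ground-truth agent label per generated trace. The label names in the usage example are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def cluster_purity(cluster_ids: Sequence[Hashable],
                   labels: Sequence[Hashable]) -> float:
    """Fraction of traces whose label matches the majority label of their cluster."""
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    if not labels:
        return 0.0
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        by_cluster[cluster][label] += 1
    # Each cluster contributes the count of its most common label.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(labels)


clusters = [0, 0, 0, 1, 1, 1]
agents = ["farmer", "farmer", "hunter", "hunter", "hunter", "hunter"]
purity = cluster_purity(clusters, agents)  # 5 of 6 traces match their cluster majority
```

Purity rises trivially as the number of clusters grows, so it is best compared across runs with the same cluster count.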
A "stance" is a behavioral mode that persists for multiple ticks then transitions.
Examples:
- Farming: prioritizing safe resource collection
- Aggression: seeking power-ups then engaging enemies
- Evasion: avoiding all threats, sacrificing score
- Exploration: covering map territory
Stances should appear as clusters in the model's hidden representations over time, not just in aggregate trace statistics.
Method:
- Run a trace through the trained model
- At each tick, extract the residual stream (hidden state after all layers)
- Project hidden states into 2D (PCA/t-SNE/UMAP)
- Color by game context (buff active, near enemy, near food)
- Look for clusters that correspond to behavioral modes
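The projection step can be sketched with plain NumPy. How the residual stream is extracted depends on the model framework and is not shown here; the array below is a synthetic stand-in of shape `(n_ticks, d_model)`, and `buff_active` is a hypothetical per-tick context flag used for coloring.

```python
import numpy as np


def pca_project(hidden_states: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project (n_ticks, d_model) hidden states onto the top principal components."""
    centered = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
# Synthetic stand-in for per-tick hidden states: two well-separated modes.
mode_a = rng.normal(loc=0.0, scale=1.0, size=(25, 16))
mode_b = rng.normal(loc=3.0, scale=1.0, size=(25, 16))
hidden_states = np.vstack([mode_a, mode_b])
buff_active = np.array([False] * 25 + [True] * 25)

points_2d = pca_project(hidden_states)
# points_2d[:, 0] and points_2d[:, 1] can now be scatter-plotted, colored by buff_active.
```

PCA is linear and deterministic, which makes it a reasonable first look before trying t-SNE or UMAP, whose layouts depend on hyperparameters and random seeds.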
If stances are real, you'll see:
- Distinct clusters in hidden space
- Transitions between clusters that correspond to game events (got a buff → shift from evasion to aggression)
- Different agent types occupying different regions
Model the transitions as a discrete process:
farming →(get buff)→ aggression →(buff expires)→ evasion →(safe area)→ farming
If the model has learned this, you can:
- Predict when a player will shift stances
- Identify the triggers (game events that cause transitions)
- Classify players by their transition graphs (some players never go aggressive, some always do after a buff)
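A minimal sketch of the discrete transition model, assuming each tick has already been assigned a stance label (for example from the hidden-state clusters) and optionally an event name. The stance and event names are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Tuple


def transition_probabilities(stances: Sequence[str]) -> Dict[Tuple[str, str], float]:
    """First-order P(next stance | current stance), estimated from one trace."""
    counts = Counter(zip(stances, stances[1:]))
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


def shift_triggers(
    ticks: Sequence[Tuple[str, Optional[str]]]
) -> Counter:
    """Count (from_stance, event, to_stance) for each tick where the stance changes.

    Each tick is (stance_at_tick, event_observed_at_tick or None).
    """
    triggers = Counter()
    for (prev, _), (cur, event) in zip(ticks, ticks[1:]):
        if cur != prev:
            triggers[(prev, event, cur)] += 1
    return triggers


stances = ["farming", "farming", "aggression", "aggression", "evasion", "farming"]
probs = transition_probabilities(stances)

ticks = [
    ("farming", None),
    ("aggression", "got_buff"),
    ("aggression", None),
    ("evasion", "buff_expired"),
    ("farming", "reached_safe_area"),
]
triggers = shift_triggers(ticks)
```

The resulting dictionary is one player's transition graph; comparing these graphs across players is what the archetype definition below builds on.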
An archetype is a characteristic stance transition graph:
- The Farmer: farming → farming → farming (never shifts)
- The Hunter: farming →(buff)→ aggression →(buff expires)→ farming (opportunistic)
- The Berserker: aggression → aggression → aggression (always aggressive)
- The Survivor: evasion → farming → evasion (risk-averse, collects only when safe)
These aren't programmed. They emerge from clustering the transition graphs.
The long-term goal from the premise: identify player archetypes and give them unique passives/skills.
1. Player plays the game normally
2. Event stream records their gameplay
3. Trained model classifies their trace → archetype
4. Game offers skills/passives that complement their natural style
- Farmer → "Harvest": coins within 2 tiles are auto-collected
- Hunter → "Predator": buff duration +50%
- Berserker → "Fury": speed boost when near enemies
- Survivor → "Phantom": brief invulnerability on near-miss
Classification doesn't require the full transformer. Once archetypes are defined:
- Extract behavioral features from a player's event trace
- Assign the nearest archetype in feature space
- Or: feed the trace into the transformer, read the hidden state, and classify from that
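The feature-space route can be sketched as a nearest-centroid lookup. The feature names and centroid values below are invented for illustration; in practice the centroids would come from the clustering step, and features should be scaled to comparable ranges before distances are taken.

```python
import math
from typing import Dict

Features = Dict[str, float]

# Hypothetical archetype centroids in a normalised (0..1) feature space.
CENTROIDS: Dict[str, Features] = {
    "farmer":    {"food_rate": 0.9, "enemy_proximity": 0.10, "buff_usage": 0.10},
    "hunter":    {"food_rate": 0.5, "enemy_proximity": 0.60, "buff_usage": 0.80},
    "berserker": {"food_rate": 0.2, "enemy_proximity": 0.90, "buff_usage": 0.30},
    "survivor":  {"food_rate": 0.4, "enemy_proximity": 0.05, "buff_usage": 0.05},
}


def nearest_archetype(features: Features,
                      centroids: Dict[str, Features] = CENTROIDS) -> str:
    """Return the archetype whose centroid is closest in Euclidean distance."""
    def distance(centroid: Features) -> float:
        return math.sqrt(sum((features[k] - centroid[k]) ** 2 for k in centroid))

    return min(centroids, key=lambda name: distance(centroids[name]))


player = {"food_rate": 0.8, "enemy_proximity": 0.1, "buff_usage": 0.1}
label = nearest_archetype(player)
```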
Players' styles evolve. The system should:
- Track archetype over a sliding window, not lifetime aggregate
- Detect stance shifts and update accordingly
- Offer new skills when the player's style changes (reward experimentation)
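One way to sketch the sliding-window requirement is a tracker that keeps the last few per-episode archetype labels and reports when the windowed majority changes. The class name, window size, and the choice of a majority vote are all assumptions, not part of the design above.

```python
from collections import Counter, deque
from typing import Optional


class ArchetypeTracker:
    """Tracks the majority archetype over the most recent `window` labels."""

    def __init__(self, window: int = 5) -> None:
        self._recent = deque(maxlen=window)  # old labels fall off automatically
        self.current: Optional[str] = None

    def update(self, label: str) -> bool:
        """Record the latest label; return True if the windowed majority changed."""
        self._recent.append(label)
        majority, _count = Counter(self._recent).most_common(1)[0]
        changed = self.current is not None and majority != self.current
        self.current = majority
        return changed


tracker = ArchetypeTracker(window=3)
shifts = [tracker.update(label)
          for label in ["farmer", "farmer", "hunter", "hunter", "hunter"]]
```

A `True` from `update` marks the point where the game could offer a new skill; a larger window makes the tracker slower to react but less sensitive to one-off episodes.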
| Aspect | MuZero / Dreamer | Game Grammar |
|---|---|---|
| State | Learned continuous latent | Explicit symbolic tokens |
| World model | Neural dynamics function | Next-token prediction |
| Interpretability | Opaque | Readable by construction |
| Objective | Maximize reward | Predict what happens |
| Output | Optimal policy | Behavioral grammar |
| Archetypes | Not applicable | Core output |
| Requires reward | Yes | No |
| Data | Self-play | Any player |
Key distinction: MuZero is prescriptive (what should happen); this system is descriptive (what does happen). The Wittgensteinian framing is explicitly anti-prescriptive: grammar describes use, it doesn't dictate it.
See Theory for the philosophical foundation.
| Aspect | Feature Engineering | This System |
|---|---|---|
| Features | Hand-designed (KDA, heatmap) | Learned from event sequences |
| Archetypes | Pre-defined (Bartle types) | Emergent |
| Temporal | Usually aggregate stats | Full sequence modeling |
| Game-specific | Yes (features per game) | No (events per game, model is shared) |
| Discovers new types | No | Yes |
- How many archetypes? Is it 4 (Bartle)? Is it continuous? Is it game-dependent?
- Stance granularity: How long does a "stance" last? One tick? Ten? A full game phase? Probably varies, and that variance itself is informative.
- Online vs. offline: Can we classify in real-time (after each tick) or only after a full episode? Real-time is needed for adaptive progression.
- Cold start: How many events before classification is reliable? First 10 seconds? First minute? First game?
- Transferability: Does a player's archetype in Snake predict their archetype in Pac-Man? If yes, archetypes are about the player, not the game.