Skip to content

tiboitel/poke-vae

Repository files navigation

🧬 PokeVAE — Variational Autoencoder for Pokémon Stats, Types & Abilities

Model Summary

PokeVAE is a Variational Autoencoder (VAE) trained on Pokémon base data to learn a continuous latent representation of Pokémon base stats, types, and abilities.

The model enables:

  • Random Pokémon-like generation

This project is designed for creative exploration, procedural generation, and educational experimentation with VAEs.


Model Details

  • Model type: Variational Autoencoder (VAE)
  • Framework: PyTorch
  • Latent dimension: 16
  • Input modalities:
    • Base stats (standardized)
    • Pokémon types (multi-label)
    • Pokémon abilities (single-label)
  • Outputs:
    • Base stats (continuous)
    • Types (multi-label logits)
    • Ability (categorical logits)

Architecture

Encoder

  • Fully connected MLP
  • Input dimension:
    • 6 (stats) + 18 (types) + 286 (abilities) = 310
    • Hidden dimension: 128
  • Outputs:
    • Mean vector μ ∈ ℝ^16
    • Log-variance vector log σ² ∈ ℝ^16

Decoder

  • MLP with dropout
  • Hidden layers: [128, 64]
  • Output heads:
    • Stats head: linear (regression)
    • Type head: linear (binary logits)
    • Talent head: linear (categorical logits)

Training

Dataset

  • Pokémon dataset loaded from pokemons.json
  • Features:
  • Base stats: HP, Attack, Defense, Sp. Attack, Sp. Defense, Speed
  • Types: multi-hot encoding
  • Abilities: one-hot encoding (first listed ability only)

Preprocessing

  • Base stats are standardized using sklearn.StandardScaler
  • Types are treated as a multi-label classification problem
  • Abilities are treated as categorical classification

Loss Function

The total loss is a weighted sum of reconstruction losses and KL divergence:

L = MSE(stats)

BCE(types)

0.50 × CE(ability)

β × KL

Where:

  • MSE: Mean Squared Error for base stats
  • BCE: Binary Cross-Entropy with logits for types
  • CE: Cross-Entropy for abilities
  • KL: Kullback–Leibler divergence
  • β: linearly annealed during the first 100 epochs
  • β_max: 0.012

KL divergence is clamped to prevent posterior collapse.


Optimization

  • Optimizer: Adam
  • Learning rate: 3e-4
  • Batch size: 64
  • Training epochs: 600

Inference & Generation

Random Sampling

Samples from the latent prior and decodes into Pokémon-like entries:

  • Ensures at least one type is assigned
  • Ability selected via argmax
  • Stats are inverse-transformed for readability

Implemented in inference.py.

Intended Uses

Example of uses

  • Procedural Pokémon-like content generation
  • Latent space exploration
  • Educational demonstrations of VAEs
  • Creative tooling (fusions, variants)

Out-of-Scope Use

  • Competitive Pokémon balancing
  • Canonical or official Pokémon creation
  • Real-world decision making

Evaluation

This model is not quantitatively benchmarked.

Evaluation is qualitative and exploratory:

  • Plausibility of generated stats
  • Smoothness of latent interpolations
  • Diversity of generated forms

Limitations

  • Only the first listed ability is modeled
  • No hard constraints on stat realism beyond BST normalization
  • Type correlations are learned implicitly
  • Dataset biases directly affect generations

Ethical Considerations

This model generates fictional content inspired by Pokémon data.

It is intended strictly for educational and creative use.
All Pokémon-related concepts, names, and data are the property of their respective rights holders.


Reproducibility

Training

You can change training hyperparameters in config.json.

Start training:

python train.py

Saved artifacts:

Model weights
StandardScaler statistics
Type and ability vocabularies
Training configuration

Requirements

See requirements.txt:

PyTorch ≥ 2.1
scikit-learn
NumPy

Model Card Authors

Author: tiboitel Model name: poke-vae

About

poke-vae implements a Variational Autoencoder (VAE) in PyTorch to encode and decode Pokémon statistics. The VAE is structured to learn a compressed representation of Pokémon data which includes base stats and types, enabling the generation of new Pokémon-like data.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages