ChessHacks Bot - Project Documentation

Last Updated: 2025-11-15
Competition: ChessHacks 36-Hour Hackathon
Track: Queen's Crown (Highest ELO Rating)

🎯 Competition Requirements

Core Rules

Neural network must be a critical component (engine fails without it)
Must generate only legal moves (illegal moves = disqualification)
Cannot use pre-trained chess models or existing engines (e.g., Stockfish)
Must train your own models

Platform Constraints

Game time limit: 1 minute per game (dynamic time management required)
Build time limit: 3 minutes (CPU + GPU) when deploying
Bot slots: 3 available (each with separate ELO; best slot counts)
File size: No files > 100MB in git (use HuggingFace for model weights)

Technical Requirements

Bot implementation: /src/main.py with get_move(pgn: str) -> str function
Input format: PGN (Portable Game Notation) string
Output format: UCI notation (e.g., e2e4, e1g1, e7e8q)
Deployment files: /src, serve.py, requirements.txt
Imports: Use relative imports (e.g., from .utils import ...)
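
Because the output format is strict, it can help to sanity-check the string before returning it. The following is a minimal sketch of a syntax-only check for UCI move strings such as e2e4 or e7e8q; the helper name is hypothetical and not part of the ChessHacks platform, and it does not check legality on a board.

```python
import re

# Origin square, destination square, and an optional lowercase
# promotion piece (queen, rook, bishop, knight).
UCI_MOVE_RE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def is_well_formed_uci(move: str) -> bool:
    """Return True if `move` has the shape of a UCI move string.

    Hypothetical helper: this validates syntax only, not legality.
    """
    return UCI_MOVE_RE.fullmatch(move) is not None
```

A check like this only catches formatting mistakes (for example returning SAN such as "O-O" instead of "e1g1"); legality still has to come from the move generator.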

🧠 Strategy: Two Model Approaches

Approach 1: Minimal Leela Chess Zero (PyTorch)

Repository: https://github.com/Rocketknight1/minimal_lczero
Architecture: ResNet-style CNN + MCTS
Key Optimizations: Squeeze-Excitation blocks, WDL value head, illegal move masking
Training: Modal (cloud GPU), 2200+ ELO data only
Weights: HuggingFace (download & cache for inference)
Target: 2000-2200 ELO with priority optimizations
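
A WDL value head predicts three outcome probabilities instead of one scalar, while MCTS backs up a single number. A minimal sketch of the conversion, assuming the head emits raw logits in (win, draw, loss) order from the side to move's perspective; the function name and the ordering are illustrative assumptions, not taken from the minimal_lczero repository:

```python
import math
from typing import Sequence, Tuple


def wdl_to_value(logits: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert (win, draw, loss) logits to probabilities and a scalar value.

    The scalar is P(win) - P(loss), which lies in [-1, 1].
    The (win, draw, loss) ordering is an assumption of this sketch.
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    win, draw, loss = (e / total for e in exps)
    return win, draw, loss, win - loss
```

With equal logits the value is 0 (a balanced position); a higher win logit pushes it toward +1.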

Approach 2: ChessFormers

Repository: https://github.com/Atenrev/chessformers
Architecture: Transformer with relative position bias + MCTS
Key Optimizations: Chess-aware attention, multi-task learning, illegal move masking
Training: Modal (cloud GPU), 2200+ ELO data only
Weights: HuggingFace (download & cache for inference)
Target: 2100-2300 ELO with priority optimizations

Engineering Philosophy

See DESIGN_DECISIONS.md for detailed engineering analysis

Core Principles:

Data Quality > Quantity: 1M clean high-ELO games beat 10M noisy ones
Fast Inference = More Search: Optimize for <100ms forward pass
Illegal Move Masking: +100-200 ELO, prevents disqualification
Dynamic Time Management: Spend more time on complex positions and move quickly in simple ones
Parallel MCTS: 4-8x speedup via virtual loss and batch inference
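
The virtual-loss idea in the last principle can be sketched as follows. While a worker is still evaluating a leaf, every node on its path is temporarily counted as a visit that lost, so other workers are steered toward different branches; the penalty is undone when the real value is backed up. The class and function names and the VIRTUAL_LOSS constant are assumptions for illustration, and a real multi-threaded version would also need locking around these updates.

```python
from dataclasses import dataclass
from typing import List

VIRTUAL_LOSS = 1  # hypothetical penalty per in-flight visit


@dataclass
class NodeStats:
    visits: int = 0
    value_sum: float = 0.0

    def q(self) -> float:
        """Mean value of this node (0.0 if it was never visited)."""
        return self.value_sum / self.visits if self.visits else 0.0


def apply_virtual_loss(path: List[NodeStats]) -> None:
    """Mark every node on the selected path as a pending, pessimistic visit."""
    for node in path:
        node.visits += VIRTUAL_LOSS
        node.value_sum -= VIRTUAL_LOSS


def backup(path: List[NodeStats], value: float) -> None:
    """Undo the virtual loss and record the real value, leaf to root.

    `value` is from the leaf's point of view and flips sign at each ply.
    """
    for node in reversed(path):
        node.visits += 1 - VIRTUAL_LOSS
        node.value_sum += value + VIRTUAL_LOSS
        value = -value
```

After `apply_virtual_loss` followed by `backup`, each node ends with exactly one extra visit and the real value added, as if the virtual loss had never been applied.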

🏗️ Bot Architecture

Pipeline Flow

PGN Input → Neural Network → Move Probabilities + Position Value → MCTS Search → UCI Move Output

Key Components

Neural Network: Policy (move probabilities) + Value (position evaluation)
MCTS Search: Explores promising moves guided by NN predictions
Time Management: Dynamically allocate search time based on remaining game time
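
As a baseline for the time-management component, the remaining clock can be divided over an estimate of the moves still to come. This is a minimal sketch under stated assumptions: the function name, the expected game length of 60 moves, the floor of 10 remaining moves, and the 2-second safety margin are all illustrative values, not tuned or taken from the platform. It does not yet account for position complexity.

```python
def simple_time_budget_ms(
    time_left_ms: int,
    moves_played: int,
    expected_game_length: int = 60,  # assumed typical game length in moves
    min_moves_left: int = 10,        # never assume fewer moves than this remain
    safety_margin_ms: int = 2000,    # reserve kept to avoid losing on time
) -> int:
    """Return a per-move time budget in milliseconds."""
    usable = max(0, time_left_ms - safety_margin_ms)
    moves_left = max(min_moves_left, expected_game_length - moves_played)
    return usable // moves_left
```

With the 1-minute game limit, this gives a little under one second for the first move and shrinks the budget to zero once only the safety margin is left, at which point the bot should fall back to its fastest move choice.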

📊 Training Pipeline

Data Source

Source: Lichess database or similar high-ELO chess matches
Format: PGN files → Preprocessed tensors
Filtering: Min 2200 Elo (both players), matching the 2200+ threshold used for both model approaches
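
The rating filter can be applied directly to the PGN header tags before any preprocessing. A minimal sketch, assuming each game is available as its own PGN text and carries the standard WhiteElo and BlackElo tags; the function name is hypothetical and the threshold is left as a parameter:

```python
import re

# Matches header tags such as [WhiteElo "2310"]; non-numeric ratings
# like "?" do not match and are treated as missing.
ELO_TAG_RE = re.compile(r'\[(WhiteElo|BlackElo) "(\d+)"\]')


def passes_elo_filter(pgn_text: str, min_elo: int) -> bool:
    """Return True only if both players are rated at least `min_elo`."""
    ratings = {tag: int(value) for tag, value in ELO_TAG_RE.findall(pgn_text)}
    return (
        ratings.get("WhiteElo", 0) >= min_elo
        and ratings.get("BlackElo", 0) >= min_elo
    )
```

Games with a missing or unknown rating are rejected, which errs on the side of cleaner data.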

Training Infrastructure

Platform: Modal (cloud GPU)
Model Storage: HuggingFace Hub
Inference: Download and cache weights on first run

Dataset Preprocessing

Follow the respective repository's preprocessing guidelines:

Minimal LCZero: See repo documentation
ChessFormers: See repo documentation

☁️ Modal Training Setup

Quick Start

Modal account: Authenticate with modal token new
HuggingFace token: Get write token from https://huggingface.co/settings/tokens
Create Modal secret: modal secret create huggingface-secret HF_TOKEN=hf_...
Upload training data: modal volume put chess-training-data <local_path> /data/processed
Launch training: Follow repo-specific instructions for Minimal LCZero or ChessFormers

Deployment Strategy

Slot 1: Minimal LCZero model
Slot 2: ChessFormers model
Slot 3: Best performing variant with tuned hyperparameters

Remember: only the highest-rated slot counts toward the final ranking!

📁 Project Structure

chesshacks/
├── CLAUDE.md              # This file
├── requirements.txt       # Python dependencies (REQUIRED for deployment)
├── serve.py              # Backend server (REQUIRED for deployment)
│
├── src/                  # Bot implementation (REQUIRED for deployment)
│   └── main.py          # Contains get_move(pgn: str) -> str function
│
├── training/            # Training code (follows Minimal LCZero / ChessFormers structure)
│   ├── data/           # PGN files and preprocessed tensors
│   ├── scripts/        # Training scripts for Modal
│   └── configs/        # Training configurations
│
└── devtools/           # Local development environment (exclude from deployment)
    └── (Next.js frontend for testing)

Deployment Checklist:

✅ Include: /src, serve.py, requirements.txt
❌ Exclude: /devtools, .venv, .env.local (add to .gitignore)
❌ Exclude: Model weights > 100MB (use HuggingFace instead)

🚀 Development Workflow

Local Testing

cd devtools
npm run dev  # Starts Next.js frontend + Python backend
# Visit http://localhost:3000 to test bot

Training on Modal

# Follow Minimal LCZero or ChessFormers training instructions
# Upload trained weights to HuggingFace

Bot Implementation (`/src/main.py`)

import io

import chess
import chess.pgn

# Helpers such as load_model_from_huggingface, calculate_time_budget, MCTSNode,
# select_leaf, mask_illegal_moves, expand and backup are not shown here.

_model = None  # Loaded lazily on the first call, then reused across moves


def get_move(pgn: str, wtime: int = 60000, btime: int = 60000) -> str:
    """
    Generate best move using NN-guided MCTS.

    Args:
        pgn: Board state in PGN format
        wtime: White time remaining (milliseconds)
        btime: Black time remaining (milliseconds)

    Returns:
        Move in UCI format (e.g., "e2e4", "e1g1", "e7e8q")
    """
    global _model  # Without this, assigning below raises UnboundLocalError

    # 1. Parse PGN to get board state (an empty PGN means the starting position)
    game = chess.pgn.read_game(io.StringIO(pgn))
    board = game.end().board() if game is not None else chess.Board()

    # 2. Load cached NN model (lazy initialization)
    if _model is None:
        _model = load_model_from_huggingface()  # Downloads once, caches locally

    # 3. Calculate time budget for this move (milliseconds)
    time_left = wtime if board.turn == chess.WHITE else btime
    move_time = calculate_time_budget(board, time_left)  # Dynamic allocation

    # 4. Run MCTS search
    # Key optimizations:
    # - Parallel search (4-8 workers)
    # - Transposition table caching
    # - Illegal move masking
    # - Early stopping if move is obvious
    root = MCTSNode(board)
    # Adaptive simulation count; at least 1 so the root always gets expanded
    simulations = max(1, min(800, int(move_time / 50)))

    for _ in range(simulations):
        leaf = select_leaf(root)  # UCB selection
        # Terminal leaves are skipped in this sketch; a full implementation
        # would back up the actual game result for them.
        if not leaf.is_terminal():
            # Batch inference optimization (collect multiple leaves)
            policy, value = _model.evaluate(leaf.board)
            policy = mask_illegal_moves(policy, leaf.board)  # Critical!
            expand(leaf, policy)
            backup(leaf, value)

    # 5. Select best move
    best_move = root.best_child().move

    # 6. Return in UCI format
    return best_move.uci()

Critical Implementation Details:

Illegal move masking: Set policy to 0 for illegal moves, renormalize
Transposition table: Cache NN evaluations by board hash
Time management: Allocate more time for complex/critical positions
Parallel MCTS: Use virtual loss to prevent thread collisions
FP16 inference: 2x speedup with minimal accuracy loss
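
The masking step in the first item can be sketched independently of any model. This minimal example assumes the policy is available as a mapping from UCI strings to probabilities and that the legal moves come from the move generator; in practice the network output is usually a tensor indexed by move, so the dictionary form and the function name here are illustrative assumptions.

```python
from typing import Dict, List


def mask_and_renormalize(
    policy: Dict[str, float], legal_moves: List[str]
) -> Dict[str, float]:
    """Keep only legal moves and rescale their probabilities to sum to 1."""
    if not legal_moves:
        return {}  # no legal moves: the position is terminal
    masked = {move: policy.get(move, 0.0) for move in legal_moves}
    total = sum(masked.values())
    if total <= 0.0:
        # The network put no mass on any legal move: fall back to uniform.
        return {move: 1.0 / len(legal_moves) for move in legal_moves}
    return {move: p / total for move, p in masked.items()}
```

The uniform fallback matters: without it, a policy that assigns all of its mass to illegal moves would cause a division by zero during renormalization.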

Deployment

# Push to GitHub
git add src/ serve.py requirements.txt
git commit -m "Update bot"
git push

# Deploy via ChessHacks dashboard
# - Connect GitHub repo
# - Assign to slot (1, 2, or 3)
# - Monitor build logs and ELO rating

📚 Key Resources

Repositories

Minimal LCZero (https://github.com/Rocketknight1/minimal_lczero) - PyTorch implementation
ChessFormers (https://github.com/Atenrev/chessformers) - Transformer-based approach

Documentation

ChessHacks Docs - Platform documentation
Lichess Database - Training data source
python-chess - Chess library

Training Infrastructure

Modal - Cloud GPU training
HuggingFace Hub - Model weight storage

✅ Implementation Priorities

Goal: Queen's Crown (Highest ELO) - Target 2200+ ELO

Phase 1: Foundation (Hours 0-8)

Priority: Get something working

✅ Platform setup and documentation
Download high-quality dataset (Lichess 2200+ ELO, classical games)
Set up training pipeline (Minimal LCZero OR ChessFormers - pick one first)
Train small baseline model on Modal (~1-2 hours)
Implement basic /src/main.py with NN inference + simple MCTS
Deploy to Slot 1 - verify it works and doesn't crash

Phase 2: Critical Optimizations (Hours 8-20)

Priority: Must-have ELO gains (+500-800 ELO)

Illegal move masking (+100-200 ELO, prevents DQ)
Multi-task learning: Add WDL (result) head (+100-150 ELO)
Enhanced input: 16+ channel representation (+50-100 ELO)
Parallel MCTS: 4-8 workers with virtual loss (4-8x faster)
Dynamic time management: Complex positions get more time (+50-100 ELO)
Transposition table: Cache NN evaluations (+20-50 ELO)
FP16 (half-precision) inference: about 2x faster
Train improved model, deploy to Slot 2
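
For the transposition-table item in this phase, one simple option is a bounded cache in front of the network so that positions reached by different move orders are evaluated only once. A minimal sketch, assuming positions are identified by a hashable key (for example a FEN string or a Zobrist hash) and that `evaluate` is whatever callable wraps the model; the class name and the default size are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class EvalCache:
    """Least-recently-used cache for neural network evaluations."""

    def __init__(self, evaluate: Callable[[Hashable], Any], max_entries: int = 100_000):
        self._evaluate = evaluate
        self._max_entries = max_entries
        self._entries: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0

    def get(self, key: Hashable) -> Any:
        """Return the cached evaluation for `key`, computing it on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        result = self._evaluate(key)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return result
```

Bounding the cache keeps memory predictable over a long game; the right size depends on the deployment limits, which this sketch does not know.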

Phase 3: Advanced Optimizations (Hours 20-30)

Priority: Nice-to-have ELO gains (+200-400 ELO)

Data augmentation: Horizontal flip (2x effective data)
Better value head: Separate win/draw/loss predictions
Opening optimization: Fast moves in opening (save 5-10s)
Model architecture: Tune depth/width for inference speed
Train second approach (whichever we didn't do in Phase 1)
Deploy best variant to Slot 3
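
For the horizontal-flip augmentation listed in this phase, both the board and the target move have to be mirrored left to right (file a becomes h, b becomes g, and so on). A minimal sketch of the move side, with hypothetical helper names. One caveat applies: castling is not left-right symmetric, so this sketch assumes the flip is used only on positions where neither side still has castling rights, and never on castling moves themselves.

```python
def mirror_file(square: str) -> str:
    """Mirror a square such as "e2" across the vertical axis ("d2")."""
    file, rank = square[0], square[1]
    return chr(ord("a") + ord("h") - ord(file)) + rank


def mirror_uci(move: str) -> str:
    """Mirror a UCI move left to right, keeping any promotion suffix."""
    return mirror_file(move[0:2]) + mirror_file(move[2:4]) + move[4:]
```

Applying the flip twice returns the original move, which makes a convenient unit test for the matching board-tensor flip as well.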

Phase 4: Final Tuning (Hours 30-36)

Priority: Squeeze last ELO points

Hyperparameter tuning (c_puct, temperature, simulations)
A/B test different configurations across slots
Monitor ELO ratings, redeploy best performers
Self-play training if time permits (risky but high reward)
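
To make the role of c_puct concrete, here is a minimal sketch of the AlphaZero-style PUCT selection rule, score = Q + c_puct * P * sqrt(N_parent) / (1 + N_child). Whether the bot uses exactly this variant is an assumption; the function names, the tuple layout, and the default of 1.5 are illustrative only.

```python
import math
from typing import Dict, Tuple


def puct_score(
    q: float, prior: float, parent_visits: int, child_visits: int, c_puct: float = 1.5
) -> float:
    """Mean value plus an exploration bonus weighted by the policy prior."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration


def select_child(
    children: Dict[str, Tuple[float, float, int]], parent_visits: int, c_puct: float = 1.5
) -> str:
    """Pick the move with the highest PUCT score.

    `children` maps a move to (q, prior, visits).
    """
    return max(
        children,
        key=lambda move: puct_score(
            children[move][0], children[move][1], parent_visits, children[move][2], c_puct
        ),
    )
```

A larger c_puct favors unexplored moves with high priors; setting it to 0 reduces selection to picking the best mean value seen so far, which is why it is worth tuning together with the simulation count.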

Key Decision Points

Hour 8: If baseline works, proceed to Phase 2. If not, debug.
Hour 20: Evaluate which model (LCZero vs ChessFormers) is stronger, focus effort there.
Hour 30: Lock in best model, only tune hyperparameters.

See DESIGN_DECISIONS.md for detailed optimization strategies
