TestingStrategy
The development of modern chess engines is dominated by the "Fishtest" approach—a distributed testing framework that validates changes through millions of games played at extremely fast time controls.
While this has produced an engine (Stockfish) of incredible strength, it has inadvertently created a "Bullet Monster": an engine optimized for ultra-short time controls, heavily reliant on aggressive pruning to make instant decisions.

ShashChess adopts a fundamentally different philosophy. Our goal is to create the strongest possible engine for non-ultra-rapid time controls (Classical and Rapid).
Stockfish’s development is driven by Short Time Control (STC) and Long Time Control (LTC) tests that, by human standards, are still blitz games. This creates a systemic bias:
- The Pruning Problem: To survive these speeds, Stockfish must aggressively "prune" (discard) search branches. If a move doesn't look promising immediately, it is ignored.
- The Horizon Effect: In complex positions requiring deep strategic understanding or long tactical sequences, this aggressive pruning causes the engine to miss critical variations that only reveal themselves after deeper calculation.
- The Result: An engine that is unbeatable at 10 seconds per game but lacks the "patience" and depth required for the highest quality analysis at 5 or 20 minutes per move.
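The trade-off described above can be made concrete with a toy sketch. This is NOT Stockfish's actual pruning code: it is a plain negamax search over a hand-built tree, with an optional forward-pruning margin (all node values and the margin are invented for illustration). The margin saves work by discarding moves whose static evaluation looks hopeless, but it can throw away a sacrifice whose payoff only appears deeper in the tree, which is exactly the horizon effect.

```python
from dataclasses import dataclass, field

INF = 10 ** 9

@dataclass
class Node:
    static_eval: int                      # cheap score, from the side to move
    children: list = field(default_factory=list)

def negamax(node, depth, alpha=-INF, beta=INF, margin=None):
    if depth == 0 or not node.children:
        return node.static_eval
    best = -INF
    for child in node.children:
        # Forward pruning: if the move's static score (negated into our
        # perspective) plus an optimism margin still cannot reach alpha,
        # skip the branch without searching it at all.
        if margin is not None and -child.static_eval + margin <= alpha:
            continue
        score = -negamax(child, depth - 1, -beta, -alpha, margin)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return node.static_eval if best == -INF else best

# A quiet move worth +10, and a sacrifice that looks like -200 statically
# but wins +500 two plies later (values are illustrative centipawns).
quiet = Node(static_eval=-10)
sacrifice = Node(static_eval=200, children=[Node(static_eval=500)])
root = Node(static_eval=0, children=[quiet, sacrifice])

print(negamax(root, 3))              # full search finds the sacrifice: 500
print(negamax(root, 3, margin=100))  # aggressive pruning settles for: 10
```

The more aggressive the margin, the faster the search, and the larger the class of deep resources it can no longer see.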
ShashChess rejects the reliance on random starting positions, which often lead to high draw rates and low information gain. Instead, we implement the concept of "Nuggets"—a methodology detailed in our research paper "Fewer Draws, More Fun: Searching for Unbalanced Positions in Chess".
A Nugget is a scientifically selected starting position with a win probability of approximately 75% (or 25% for Black). These positions are inherently unbalanced, forcing the engine to navigate complex scenarios rather than steering toward a "safe draw".
This approach not only reduces draw rates but provides a more accurate framework for ranking engine strength in decisive games.
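As a rough sketch of the selection idea (the paper's actual pipeline and tuned constants are not reproduced here), one can map an engine evaluation in centipawns to an approximate win probability via a logistic curve and keep positions whose probability lands near 75% for White or 25% (i.e., 75% for Black). The scale constant of 400 and all candidate evaluations below are assumptions for illustration.

```python
import math

def win_probability(cp, scale=400.0):
    # Logistic mapping from centipawns to a win probability (scale assumed).
    return 1.0 / (1.0 + math.exp(-cp / scale))

def is_nugget(cp, target=0.75, tolerance=0.05):
    # Accept positions near 75% for either side; reject balanced or
    # already-decided ones.
    p = win_probability(cp)
    return abs(p - target) <= tolerance or abs(p - (1.0 - target)) <= tolerance

# Hypothetical (position, evaluation) pairs, for illustration only.
candidates = {"pos_a": 15, "pos_b": 440, "pos_c": -430, "pos_d": 1200}
nuggets = [name for name, cp in candidates.items() if is_nugget(cp)]
print(nuggets)  # balanced and totally winning candidates are both rejected
```

The filter discards both the near-equal position (too drawish) and the overwhelming one (no information gain), keeping only the genuinely unbalanced middle ground.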

This research was awarded First Prize at the Conference on Artificial Intelligence and Cognitive Science (ICAICS).
- Official Publication: Springer Link: Fewer Draws, More Fun
- Presentation: the paper's conference presentation.

To validate ShashChess 41, we utilized two distinct, rigorous testing environments, made possible by the contributions of our community experts.

We conducted a 300-game match starting from 50 selected Nuggets. This ensures the engine is tested on unbalanced, high-stakes positions rather than on drawish opening books.
- Curator: Thanks to Afro Ambanelli for selecting the 50 Nuggets and orchestrating the match parameters.
- Data Source: 📂 View the Match PGN
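A fixed-length match like this is typically summarized by converting win/draw/loss counts into a score fraction, an Elo-difference estimate, and a confidence interval. The sketch below shows the standard arithmetic; the counts in it are hypothetical placeholders, NOT the actual results of the 300-game Nugget match.

```python
import math

def elo_from_score(score):
    # Standard logistic Elo model: a score fraction s maps to an Elo gap.
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_summary(wins, draws, losses, z=1.96):
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Sample variance of the per-game score, treating games as independent.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    return score, elo_from_score(score), (elo_from_score(score - z * stderr),
                                          elo_from_score(score + z * stderr))

# Hypothetical 300-game breakdown, for illustration only.
score, elo, (elo_lo, elo_hi) = match_summary(wins=120, draws=110, losses=70)
print(f"score {score:.3f}, Elo {elo:+.1f} [{elo_lo:+.1f}, {elo_hi:+.1f}]")
```

Because Nuggets lower the draw rate, more games end decisively, which shrinks the interval a given number of games can deliver.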

To test raw tactical aptitude without the noise of game strategy, we utilized a refined suite of 325 complex positions. These positions are specifically chosen to punish engines that prune too aggressively.
- Curator: Thanks to Peter Martan for compiling and verifying this challenging suite.
- Data Source: 📂 View the 325 Positions (EPD)
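Suites like this are distributed in EPD, where each record holds the first four FEN fields plus opcodes such as `bm` (expected best move, possibly several) and `id`. A minimal reader is sketched below; it is an illustration, not the suite's official test harness, and the sample record is hypothetical.

```python
def parse_epd(line):
    # Split off the four FEN fields; everything after them is the
    # semicolon-terminated opcode section.
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    ops = {}
    if len(fields) == 5:
        for op in fields[4].split(";"):
            op = op.strip()
            if op:
                name, _, value = op.partition(" ")
                ops[name] = value.strip().strip('"')
    return position, ops

# Hypothetical test record, for illustration only.
record = ('rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - '
          'bm e4; id "demo.001";')
position, ops = parse_epd(record)
print(position)
print(ops["bm"], ops["id"])
```

Scoring an engine on the suite then reduces to feeding it each `position` and checking its answer against the `bm` opcode.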
This project and its professional validation would not have been possible without the support of:
- Prof. Paolo Ciancarini (University of Bologna) for his academic guidance and contribution to the theoretical framework.
- Intel Italia for the critical hardware support; validating a chess engine with this level of statistical confidence requires enterprise-grade computing power.
- Ing. Massimo Venuto for his technical engineering contributions.
- The "Giants" met at the Santiago 2024 competition, who proved that the chess programming world is not just composed of sects of hypertrophic egos, but of genuine innovators and collaborators.