(image generated with DALL·E 2)
Caissa is a strong, UCI-compatible chess engine written from scratch in C++ and developed since early 2021. It features a custom neural network evaluation system trained on over 17 billion self-play positions and achieves ratings above 3600 Elo on major chess engine rating lists, placing it around the top 10.
The engine is optimized for:
- Regular Chess - Standard chess rules
- FRC (Fischer Random Chess) - Chess960 variant
- DFRC (Double Fischer Random Chess) - Extended FRC variant
Table of contents:
- Playing Strength
- Features
- Quick Start
- Compilation
- Architecture Variants
- UCI Options
- History & Originality
- Project Structure
- License
Caissa consistently ranks among the top chess engines on major rating lists:
| List | Rating | Rank | Version | Notes |
|---|---|---|---|---|
| CCRL 40/2 FRC | 4022 | #6 | 1.23 | Fischer Random Chess |
| CCRL Chess324 | 3770 | #6 | 1.23 | Chess324 variant |
| CCRL 40/15 | 3622 | #9 | 1.23 | 4 CPU |
| CCRL Blitz | 3755 | #10 | 1.22 | 8 CPU |
| List | Rating | Rank | Version |
|---|---|---|---|
| SPCC UHO-Top15 | 3697 | #10 | Caissa 1.24 avx512 |
| List | Rating | Rank | Version | Architecture |
|---|---|---|---|---|
| 10+1 (R9-7945HX) | 3542 | #16 | 1.24 | AVX-512 |
| 10+1 (i9-7980XE) | 3526 | #14 | 1.21 | AVX-512 |
| 10+1 (i9-13700H) | 3544 | #17 | 1.22 | AVX2-BMI2 |
| List | Rating | Rank | Version |
|---|---|---|---|
| CEGT 40/20 | 3576 | #8 | 1.24 |
| CEGT 40/4 | 3614 | #8 | 1.22 |
| CEGT 5+3 | 3618 | #5 | 1.22 |
Note: The rankings above may be outdated.
- ✅ UCI Protocol - Full Universal Chess Interface support
- ✅ Neural Network Evaluation - Custom NNUE-style evaluation
- ✅ Endgame Tablebases - Syzygy and Gaviota support
- ✅ Chess960 Support - Fischer Random Chess (FRC) and Double FRC
- ✅ Negamax with alpha-beta pruning
- ✅ Iterative Deepening with aspiration windows
- ✅ Principal Variation Search (PVS)
- ✅ Quiescence Search for tactical positions
- ✅ Transposition Table with large pages support
- ✅ Multi-PV Search - Analyze multiple lines simultaneously
- ✅ Multithreaded Search - Parallel search with shared TT
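The search features above combine into the classic PVS-within-negamax pattern. The sketch below illustrates only that pattern; the types and helpers (Position, Move, evaluate, generateMoves, makeMove) are hypothetical stubs, not Caissa's actual API, and a real engine adds move ordering, transposition-table probing, and quiescence search on top of this skeleton.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical placeholder types and helpers; Caissa's real interfaces differ.
struct Move { int from = 0, to = 0; };
struct Position { /* board state */ };

int32_t evaluate(const Position&) { return 0; }                      // static evaluation stub
std::vector<Move> generateMoves(const Position&) { return {}; }      // legal move generation stub
Position makeMove(const Position& pos, const Move&) { return pos; }  // apply a move (stub)

// Principal Variation Search in negamax form: the first move gets the full
// [alpha, beta] window, later moves get a null window around alpha and are
// re-searched only if they unexpectedly raise alpha.
int32_t pvs(const Position& pos, int depth, int32_t alpha, int32_t beta)
{
    if (depth == 0)
        return evaluate(pos); // a real engine drops into quiescence search here

    bool isFirstMove = true;
    for (const Move& move : generateMoves(pos))
    {
        const Position child = makeMove(pos, move);
        int32_t score;
        if (isFirstMove)
        {
            score = -pvs(child, depth - 1, -beta, -alpha);
            isFirstMove = false;
        }
        else
        {
            score = -pvs(child, depth - 1, -alpha - 1, -alpha); // null-window probe
            if (score > alpha && score < beta)
                score = -pvs(child, depth - 1, -beta, -alpha);  // full re-search
        }
        alpha = std::max(alpha, score);
        if (alpha >= beta)
            break; // beta cutoff (alpha-beta pruning)
    }
    return alpha;
}
```

The null-window probe is what makes PVS cheap when move ordering is good: most moves fail low immediately and never trigger the expensive full-window re-search.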
- Architecture: (32×768→1024)×2→1
- Incremental Updates - Efficiently updated first layer
- Vectorized Code - Manual SIMD optimization for:
- AVX-512 (fastest)
- AVX2
- SSE2
- ARM NEON
- Activation: Clipped-ReLU
- Variants: 8 variants of last layer weights (piece count dependent)
- Features: Absolute piece coordinates with horizontal symmetry, 32 king buckets
- Special Endgame Routines - Enhanced endgame evaluation
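As a rough illustration of the clipped-ReLU activation and incremental first-layer updates listed above, here is a simplified sketch. The sizes, integer types, and quantization range are assumptions for illustration only; Caissa's actual code uses hand-written SIMD and king-bucketed features.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Illustrative sizes only; Caissa's real network is (32x768 -> 1024)x2 -> 1
// with 32 king buckets and 8 output-weight variants.
constexpr int kInputs = 768;  // piece-square features for one bucket
constexpr int kHidden = 1024; // first-layer (accumulator) size

using Accumulator = std::array<int16_t, kHidden>;

// Clipped ReLU: clamp the pre-activation into a fixed quantization range.
inline int16_t clippedRelu(int16_t x)
{
    return std::clamp<int16_t>(x, 0, 127);
}

// Incremental update: when a move only adds/removes a few features,
// the accumulator is patched instead of recomputing the full matrix product.
void addFeature(Accumulator& acc, const int16_t (&weights)[kInputs][kHidden], int feature)
{
    for (int i = 0; i < kHidden; ++i)
        acc[i] += weights[feature][i];
}

void removeFeature(Accumulator& acc, const int16_t (&weights)[kInputs][kHidden], int feature)
{
    for (int i = 0; i < kHidden; ++i)
        acc[i] -= weights[feature][i];
}

// At evaluation time, the clipped accumulator feeds the small output layer.
int32_t dotWithOutputWeights(const Accumulator& acc, const std::array<int16_t, kHidden>& outWeights)
{
    int32_t sum = 0;
    for (int i = 0; i < kHidden; ++i)
        sum += int32_t(clippedRelu(acc[i])) * outWeights[i];
    return sum;
}
```

Because a single move changes only a handful of input features, patching the accumulator like this is far cheaper than recomputing the full 768×1024 matrix product at every node.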
- Custom CPU-based Trainer using Adam algorithm
- Highly Optimized - Exploits AVX instructions, multithreading, and network sparsity
- Self-Play Training - Trained on 17+ billion positions from self-generated games
- Progressive Training - Older games purged, networks trained on latest engine versions
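For reference, one step of the Adam algorithm mentioned above looks like the generic sketch below; the hyperparameter defaults are the usual textbook values, not necessarily those used by Caissa's trainer, and the real trainer is vectorized and multithreaded.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Optimizer state for one parameter vector.
struct AdamState
{
    std::vector<float> m; // first-moment (mean) estimate
    std::vector<float> v; // second-moment (uncentered variance) estimate
    int t = 0;            // step counter
};

// One Adam update step: bias-corrected moving averages of the gradient
// and squared gradient scale each parameter's learning rate individually.
void adamStep(std::vector<float>& params, const std::vector<float>& grads, AdamState& state,
              float lr = 0.001f, float beta1 = 0.9f, float beta2 = 0.999f, float eps = 1e-8f)
{
    if (state.m.empty())
    {
        state.m.assign(params.size(), 0.0f);
        state.v.assign(params.size(), 0.0f);
    }
    ++state.t;
    for (std::size_t i = 0; i < params.size(); ++i)
    {
        state.m[i] = beta1 * state.m[i] + (1.0f - beta1) * grads[i];
        state.v[i] = beta2 * state.v[i] + (1.0f - beta2) * grads[i] * grads[i];
        const float mHat = state.m[i] / (1.0f - std::pow(beta1, state.t));
        const float vHat = state.v[i] / (1.0f - std::pow(beta2, state.t));
        params[i] -= lr * mHat / (std::sqrt(vHat) + eps);
    }
}
```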
- Magic Bitboards - Efficient move generation
- Large Pages - Transposition table uses large pages for better performance
- Node Caching - Evaluation result caching
- Accumulator Caching - Neural network accumulator caching
- Ultra-Fast - Outstanding performance at ultra-short time controls (sub-second games)
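The magic-bitboard technique mentioned above reduces sliding-piece attack generation to a masked multiply and a table read. The sketch below shows only the lookup; the magic numbers and attack tables are assumed to be precomputed offline, and the struct layout is illustrative rather than Caissa's actual one.

```cpp
#include <cstdint>

// Per-square data for one slider (rook or bishop).
struct MagicEntry
{
    uint64_t mask;           // relevant occupancy squares for this square
    uint64_t magic;          // "magic" multiplier found by offline search
    uint32_t shift;          // 64 minus the number of relevant bits
    const uint64_t* attacks; // attack bitboards indexed by the hashed occupancy
};

// O(1) sliding-piece attack lookup: mask the occupancy, multiply by the
// magic number, and shift so all relevant bits collapse into a small index.
inline uint64_t sliderAttacks(const MagicEntry& entry, uint64_t occupancy)
{
    const uint64_t index = ((occupancy & entry.mask) * entry.magic) >> entry.shift;
    return entry.attacks[index];
}
```

Multiplying by a carefully chosen magic constant packs all relevant occupancy bits into the top of the word, so the final shift yields a dense, collision-free table index.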
- Download the appropriate executable from the Releases page
- Choose the version matching your CPU:
- AVX-512: Latest Intel Xeon/AMD EPYC (fastest)
- BMI2: Most modern CPUs (recommended)
- AVX2: Older CPUs with AVX2 support
- POPCNT: Older CPUs with SSE4.2
- Legacy: Very old x64 CPUs
- Run the engine with any UCI-compatible chess GUI
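If you want to sanity-check the engine without a GUI, you can talk to it directly over standard UCI from a terminal. A typical exchange looks roughly like this (engine responses shown indented and abbreviated; exact output varies by version):

```
uci
    ... engine lists its options and replies "uciok"
isready
    ... engine replies "readyok"
position startpos moves e2e4 e7e5
go movetime 1000
    ... engine prints "info ..." search lines and finally "bestmove ..."
```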
See the Compilation section below for detailed build instructions.
- C++ Compiler with C++20 support:
- GCC 10+ or Clang 12+ (Linux)
- Visual Studio 2022 (Windows)
- CMake 3.15 or later
- Make (Linux) or Visual Studio (Windows)
Using Make (from the src directory):

```
cd src
make -j$(nproc)
```

Note: This compiles the default AVX2/BMI2 version.

Alternatively, build with CMake:

```
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Final ..
make -j$(nproc)
```

Build configurations:
- Final - Production build, no asserts, maximum optimizations
- Release - Development build with asserts, optimizations enabled
- Debug - Development build with asserts, optimizations disabled
Architecture Selection:
To build for a specific architecture, set the TARGET_ARCH variable:
```
# AVX-512 (requires AVX-512 support)
cmake -DTARGET_ARCH=x64-avx512 -DCMAKE_BUILD_TYPE=Final ..

# BMI2 (recommended for modern CPUs)
cmake -DTARGET_ARCH=x64-bmi2 -DCMAKE_BUILD_TYPE=Final ..

# AVX2
cmake -DTARGET_ARCH=x64-avx2 -DCMAKE_BUILD_TYPE=Final ..

# SSE4-POPCNT
cmake -DTARGET_ARCH=x64-sse4-popcnt -DCMAKE_BUILD_TYPE=Final ..

# Legacy (fallback)
cmake -DTARGET_ARCH=x64-legacy -DCMAKE_BUILD_TYPE=Final ..
```

On Windows:
- Run GenerateVisualStudioSolution.bat to generate the Visual Studio solution
- Open build_<arch>/caissa.sln in Visual Studio 2022
- Select the desired configuration (Debug/Release/Final)
- Build the solution (Ctrl+Shift+B)
Note: Visual Studio 2022 is the only tested version; building with CMake directly inside Visual Studio has not been tested.
After compilation, copy the appropriate neural network file from data/neuralNets/ to:
- Linux: build/bin/
- Windows: build\bin\x64\<Configuration>\
| Variant | CPU Requirements | Performance | Recommended For |
|---|---|---|---|
| AVX-512 | AVX-512 instruction set | Fastest | Latest Intel Xeon, AMD EPYC |
| BMI2 | AVX2 + BMI2 | Fast | Most modern CPUs (2015+) |
| AVX2 | AVX2 instruction set | Fast | Intel Haswell, AMD Ryzen |
| POPCNT | SSE4.2 + POPCNT | Moderate | Older CPUs (2008-2014) |
| Legacy | x64 only | Slowest | Very old x64 CPUs |
Tip: If unsure, try BMI2 first. It's supported by most modern CPUs and offers excellent performance.
The engine supports the following UCI options:
- Hash (int) - Transposition table size in megabytes
- Threads (int) - Number of search threads
- MultiPV (int) - Number of principal variation lines to search
- Ponder (bool) - Enable pondering mode
- MoveOverhead (int) - Move overhead in milliseconds (increase if engine loses time)
- EvalFile (string) - Path to the neural network evaluation file (.pnn)
- EvalRandomization (int) - Evaluation randomization range (weakens the engine, introduces non-determinism)
- SyzygyPath (string) - Semicolon-separated paths to Syzygy tablebases
- SyzygyProbeLimit (int) - Maximum number of pieces for tablebase probing
- UCI_AnalyseMode (bool) - Analysis mode (full PV lines, no depth constraints)
- UCI_Chess960 (bool) - Enable Chess960 mode (castling as "king captures rook")
- UCI_ShowWDL (bool) - Show win/draw/loss probabilities with evaluation
- UseSAN (bool) - Use Standard Algebraic Notation (FIDE standard)
- ColorConsoleOutput (bool) - Enable colored console output
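Options are set through the standard UCI setoption command before starting a search. For example (the values and the tablebase path below are placeholders, not recommendations):

```
setoption name Hash value 1024
setoption name Threads value 8
setoption name MultiPV value 3
setoption name SyzygyPath value /path/to/syzygy
setoption name UCI_ShowWDL value true
```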
Caissa has been written from the ground up since early 2021. The development journey:
- Early Versions - Used simple PeSTO evaluation
- Version 0.6 - Temporarily used Stockfish NNUE
- Version 0.7+ - Custom neural network evaluation system
The engine's neural network has evolved significantly:
- Initial Network: Based on Stockfish's architecture, trained on a few million positions
- Current Network (v1.24+): Trained on 17+ billion positions from self-play
- Progressive Training: Older games are purged, ensuring networks are trained only on the latest engine versions
- Runtime Evaluation (PackedNeuralNetwork.cpp)
  - Inspired by nnue.md
  - Highly optimized with manual SIMD vectorization
- Network Trainer (NetworkTrainer.cpp, NeuralNetwork.cpp)
  - Written completely from scratch
  - CPU-based, heavily optimized with AVX and multithreading
  - Exploits network sparsity for performance
- Self-Play Generator (SelfPlay.cpp)
  - Generates games with fixed nodes/depth
  - Custom binary format for efficient storage
  - Uses Stefan Pohl's UHO opening books or DFRC openings
The project is organized into three main modules:
```
src/
├── backend/                  # Core engine library
│   ├── Search.*              # Search algorithms
│   ├── Position.*            # Position representation
│   ├── MoveGen.*             # Move generation
│   ├── PackedNeuralNetwork.* # Neural network evaluation
│   ├── TranspositionTable.*  # Position caching
│   └── ...
│
├── frontend/                 # UCI interface executable
│   ├── Main.cpp              # Entry point
│   └── UCI.*                 # UCI protocol implementation
│
└── utils/                    # Development and training tools
    ├── NetworkTrainer.*      # Neural network training
    ├── SelfPlay.*            # Self-play game generation
    ├── Tests.*               # Unit tests
    └── ...
```
- backend (library) - Engine core: search, evaluation, move generation, position management
- frontend (executable) - UCI wrapper providing command-line interface
- utils (executable) - Utilities: network trainer, self-play generator, unit tests, performance tests
This project is licensed under the MIT License - see the LICENSE file for details.
Author: Michał Witanowski
Started: Early 2021
Language: C++20
License: MIT
