Training, Compiling, Testing and Analysis
Theoria16 version 0.1 is a modified Stockfish 16.1 chess engine paired with a custom NNUE neural network trained from scratch using the official nnue-pytorch trainer. A key modification disables Stockfish's "small net" NNUE, ensuring that only the custom-trained big net (L1=2560, HalfKAv2_hm^) is used for evaluation regardless of position complexity or time constraints. This document outlines the complete development process from training environment through final binary compilation.
Training Environment
Hardware:
- GPU: NVIDIA RTX 3050 (8GB VRAM)
- CPU: Intel i7-7700K
- RAM: 64GB
- Storage: SSD
Software:
- Python: 3.12.3
- PyTorch: 2.2.0+cu121
- PyTorch Lightning: 1.9.5
- CUDA: 12.1
- NumPy: 1.26.4
- TorchVision: 0.17.0+cu121
- TorchAudio: 2.2.0+cu121
- TorchMetrics: 1.8.2
NNUE-PyTorch Trainer
Training was performed using the official nnue-pytorch repository with a specific commit that enables SFNNv8 architecture compatibility:
- Repository: official-stockfish/nnue-pytorch
- Commit: aeeffdb (September 25, 2023)
- Author: Linmiao Xu
- Change: "Increase L1 size to 2560 for SFNNv8 nets"
This modification sets the L1 layer size to 2560, which is the architecture used in Stockfish 16.1's SFNNv8 neural network. This ensures trained networks are compatible with Stockfish 16.1 binaries.
Training Data
The neural network was trained on the Leela96 filtered v2 dataset, a high-quality collection of positions derived from Leela Chess Zero self-play:
- Dataset: leela96-filter-v2.min.binpack
- Source: robotmoon.com/nnue-training-data (June 22, 2023)
- Download: Kaggle - linrock/leela96-filt-v2-min
Training Command
The following command was used to train the neural network for 150 epochs:
python train.py \
/home/user/bins/train/leela96.binpack \
/home/user/bins/validation/leela96.binpack \
--gpus "0," --threads 8 --num-workers 8 --batch-size 8192 \
--random-fen-skipping 3 --features=HalfKAv2_hm^ \
--network-save-period=1 --max_epochs=150 \
--default_root_dir /home/user/output/theoria16_HalfKAv2_hm__leela96
Key parameters:
- --features=HalfKAv2_hm^ — HalfKAv2 feature set with horizontal mirroring
- --batch-size 8192 — Optimized for 8GB VRAM
- --network-save-period=1 — Save a checkpoint every epoch for evaluation
- --random-fen-skipping 3 — Data augmentation via position skipping
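To put the batch size and epoch count in perspective, a rough back-of-the-envelope calculation is possible. Note the assumption: nnue-pytorch defines an "epoch" as a fixed number of sampled positions (100 million by default at the time), so these are estimates, not measured values.

```python
# Rough training arithmetic. EPOCH_SIZE is nnue-pytorch's default
# epoch size (an assumption); BATCH_SIZE and EPOCHS come from the
# training command above.
EPOCH_SIZE = 100_000_000   # positions per epoch (trainer default, assumed)
BATCH_SIZE = 8192
EPOCHS = 150

steps_per_epoch = EPOCH_SIZE // BATCH_SIZE
total_steps = steps_per_epoch * EPOCHS
total_positions = EPOCH_SIZE * EPOCHS

print(f"optimizer steps per epoch: {steps_per_epoch:,}")
print(f"total optimizer steps:     {total_steps:,}")
print(f"total positions sampled:   {total_positions:,}")
```

At these settings each epoch is roughly 12,000 optimizer steps, and the full 150-epoch run samples on the order of 15 billion positions (with repetition) from the binpack.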
Epoch Evaluation Process
All 150 trained epochs were evaluated using cutechess-cli in a multi-stage tournament process to identify the strongest network:
Stage 1 — Initial Gauntlet: 26 candidate epochs were tested against a baseline epoch, playing 100 games each. Top performers were identified by Elo gain.
Stage 2 — Round Robin: The top 8 epochs from Stage 1 competed against each other in a round-robin format, playing approximately 1,750 games each.
Stage 3 — Peak Search: Based on round-robin results, the search narrowed to the best-performing epoch range. Four epochs in this range played 6,000 games each.
Stage 4 — Final Selection: The top 2 epochs were compared in a final SPRT test. Epoch 139 emerged as the winner.
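The Stage 4 stopping rule can be sketched with the generalized SPRT (GSPRT) normal approximation commonly used in engine testing. This is a generic illustration of the statistic, not the exact formula any particular tool reports; the hypothesis bounds elo0/elo1 and the sample counts below are made up for the example.

```python
import math

def expected_score(elo: float) -> float:
    """Expected per-game score for an Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def gsprt_llr(wins: int, draws: int, losses: int,
              elo0: float, elo1: float) -> float:
    """Log-likelihood ratio for H1 (elo >= elo1) vs H0 (elo <= elo0),
    using the normal-approximation GSPRT formula."""
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n      # observed mean score per game
    m2 = (wins + 0.25 * draws) / n    # observed second moment
    var = m2 - s * s                  # per-game score variance
    if var <= 0.0:
        return 0.0
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return (s1 - s0) * (2.0 * s - s0 - s1) * n / (2.0 * var)

# Accept H1 once llr >= log((1-beta)/alpha); accept H0 once
# llr <= log(beta/(1-alpha)). With a draw rate near 86%, as reported
# below, per-game variance is low but many games are still needed to
# resolve small Elo differences.
print(gsprt_llr(120, 860, 20, elo0=0.0, elo1=5.0))
```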
Results:
- Total games played: more than 35,000
- Draw rate: 85-87% (indicating stable, high-quality play)
- Selected epoch: 139
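The Elo gains used to rank epochs in Stages 1-3 follow from the standard logistic inversion of the match score. A minimal sketch of that conversion (the generic formula, not tied to any specific tool's output):

```python
import math

def elo_diff(score: float) -> float:
    """Elo difference implied by an average per-game score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# A 52% match score implies roughly +14 Elo; exactly 50% implies 0.
print(round(elo_diff(0.52), 1))
print(elo_diff(0.50))
```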
Stockfish 16.1 Fork
Theoria16 is built on a modified Stockfish 16.1 codebase. A critical modification disables the "small net" NNUE that Stockfish normally uses for quick evaluation of simple positions. In Theoria16, only the big net (L1=2560) is used for all evaluations, ensuring consistent analytical behavior. The following changes were made:
- c19061e — Initial commit: Theoria16 source code (forked from SF 16.1, disabled small net)
- 98024fe — Changed banner to Theoria 0.1
- 15912f6 — Fix Makefile to remove net target, add NNUE files
- d89ef61 — Remove LTO build artifacts
- 9044504 — Fixed Makefile to not auto-download SF NNUE at build
- 1daa86b — Created set_version.sh for easy version edits
- 2623f5c — Update project files for Theoria fork
The key modification disables the Makefile's automatic download of Stockfish's default NNUE file, allowing the compile scripts to inject the custom-trained Epoch 139 network at build time.
Compilation Process
Custom build scripts handle NNUE injection and compilation for both platforms:
- Linux: compile-theoria-linux.txt — Uses GCC with profile-guided optimization (PGO)
- Windows: compile-theoria-windows.txt — Uses MSYS2 MINGW64 with Clang
Both scripts:
- Copy the selected NNUE epoch to the expected filename in src/
- Build optimized binaries for AVX2 and/or BMI2 architectures
- Clean up temporary files after compilation
- Output named binaries with version and epoch information
Analysis Package
An analysis package demonstrating Theoria16's evaluation capabilities is available:
Contents:
- 100 random games from Lichess (Elo 1450-1550 rated players)
- Original PGN without engine annotations
- Engine-analyzed PGN with evaluations and principal variations
- Python scripts used to perform the automated analysis
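The annotation step can be illustrated with a small stdlib-only helper that splices `[%eval ...]` comments (the PGN convention Lichess and most viewers understand) into SAN movetext. The function name and inputs here are hypothetical; the actual scripts presumably drive the engine over UCI (e.g. via python-chess), which is omitted to keep the sketch self-contained.

```python
def annotate_movetext(san_moves, evals):
    """Interleave SAN moves with [%eval ...] comments, one eval per move.

    san_moves: list of SAN strings, e.g. ["e4", "e5", "Nf3"]
    evals: evaluations in pawns (float) or mate strings like "#3".
    Returns annotated PGN movetext (illustrative helper, not the
    project's actual script).
    """
    parts = []
    for i, (move, ev) in enumerate(zip(san_moves, evals)):
        if i % 2 == 0:                 # White's move: emit the move number
            parts.append(f"{i // 2 + 1}.")
        parts.append(move)
        parts.append(f"{{[%eval {ev}]}}")
    return " ".join(parts)

print(annotate_movetext(["e4", "e5", "Nf3"], [0.3, 0.2, 0.4]))
```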
Download & Source Code
Theoria16 source code is available for download and includes full GPL v3 compliance documentation:
- AUTHORS — Full attribution for Stockfish contributors
- Copying.txt — GNU General Public License v3.0
- changelog.txt — Complete commit history for Theoria16 modifications
Visit the Download page for binary releases and source archives.
Summary
Theoria16 version 0.1 represents a complete ground-up NNUE training effort paired with a modified Stockfish 16.1 engine. Starting with the Leela96 filtered v2 dataset and the official nnue-pytorch trainer configured for SFNNv8 architecture (L1=2560), 150 epochs were trained on consumer hardware (RTX 3050). A rigorous multi-stage evaluation process involving over 35,000 games identified Epoch 139 as the optimal network. The engine modification disables the small net NNUE, ensuring all positions are evaluated exclusively by the custom-trained big net.
The project demonstrates that meaningful NNUE research and development is accessible to individual developers with modest hardware, provided they apply systematic evaluation methodology and leverage the excellent tooling maintained by the open-source community.