Training, Compiling, Testing and Analysis

Theoria is a modified Stockfish 16.1 chess engine paired with a custom NNUE neural network trained from scratch using the official nnue-pytorch trainer. A key modification disables Stockfish's "small net" NNUE, ensuring that only the custom-trained big net (L1=2560, HalfKAv2_hm^) is used for evaluation regardless of position complexity or time constraints. This document outlines the complete development process from training environment through final binary compilation.

Training Environment

Hardware:

Software:

NNUE-PyTorch Trainer

Training was performed using the official nnue-pytorch repository with a specific commit that enables SFNNv8 architecture compatibility:

This modification sets the L1 layer size to 2560, which is the architecture used in Stockfish 16.1's SFNNv8 neural network. This ensures trained networks are compatible with Stockfish 16.1 binaries.

Training Data

The neural network was trained on the Leela96 filtered v2 dataset, a high-quality collection of positions derived from Leela Chess Zero self-play:

Training Command

The following command was used to train the neural network for 150 epochs:

python train.py \
  /home/user/bins/train/leela96.binpack \
  /home/user/bins/validation/leela96.binpack \
  --gpus "0," --threads 8 --num-workers 8 --batch-size 8192 \
  --random-fen-skipping 3 --features=HalfKAv2_hm^ \
  --network-save-period=1 --max_epochs=150 \
  --default_root_dir /home/user/output/theoria16_HalfKAv2_hm__leela96

Key parameters:

Epoch Evaluation Process

All 150 trained epochs were evaluated using cutechess-cli in a multi-stage tournament process to identify the strongest network:

Stage 1 — Initial Gauntlet: 26 candidate epochs were tested against a baseline epoch, playing 100 games each. Top performers were identified by Elo gain.

Stage 2 — Round Robin: The top 8 epochs from Stage 1 competed against each other in a round-robin format, playing approximately 1,750 games each.

Stage 3 — Peak Search: Based on round-robin results, the search narrowed to the best-performing epoch range. Four epochs in this range played 6,000 games each.

Stage 4 — Final Selection: The top 2 epochs were compared head-to-head in a final SPRT match. Epoch 139 emerged as the winner.
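Elo gains in gauntlets like these are conventionally derived from the match score via the standard logistic rating model. As a quick sanity check on reported numbers, the conversion can be computed directly (this is the general formula, not a Theoria-specific tool):

```python
import math

def elo_from_score(score: float) -> float:
    """Convert a match score fraction (0..1) into an Elo difference
    using the standard logistic rating model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

# A candidate scoring 55% over a long match corresponds to roughly +35 Elo.
print(round(elo_from_score(0.55)))  # → 35
```

This is also why the later stages use progressively more games: the error bars on a 100-game gauntlet are far too wide to separate networks that differ by only a few Elo.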

Results:

Stockfish 16.1 Fork

Theoria is built on a modified Stockfish 16.1 codebase. The following sections detail the changes made in each release.

Version 0.1 — Initial Release

The initial release disables the "small net" NNUE that Stockfish normally uses for quick evaluation of simple positions. In Theoria, only the big net (L1=2560) is used for all evaluations, ensuring consistent analytical behavior. The Makefile was also modified to prevent automatic downloading of Stockfish's default NNUE file, allowing the build scripts to inject the custom-trained Epoch 139 network at compile time.

Version 0.2 — Stability-Based Early Termination

Version 0.2 introduces stability-based early termination to the iterative deepening search loop. During timed searches, the engine monitors whether the evaluation has converged across consecutive depth iterations. If the score remains within a scaled threshold for two consecutive iterations after reaching a minimum depth, the engine terminates early and returns the principal variation from the point where the evaluation first stabilized. This avoids spending time on positions the engine has already solved, allowing unused time to be applied to more complex positions later.
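The control flow described above can be sketched as follows. This is a simplified Python model of the logic, not Theoria's actual C++ implementation; the function names, defaults, and the `evaluate_at_depth` callback are all illustrative:

```python
def search_with_early_stop(evaluate_at_depth, max_depth,
                           min_depth=8, threshold=15, required_stable=2):
    """Sketch of stability-based early termination in iterative deepening.

    evaluate_at_depth(d) -> (score_cp, pv) for iteration depth d.
    Defaults here are illustrative, not Theoria's actual UCI values.
    """
    prev_score = None
    stable_count = 0
    stable_result = None
    score, pv = None, None
    for depth in range(1, max_depth + 1):
        score, pv = evaluate_at_depth(depth)
        if prev_score is not None and depth >= min_depth:
            delta = abs(score - prev_score)
            # Scale the score change by the evaluation magnitude,
            # flooring the denominator at 25 cp.
            scaled = delta * 100 // max(abs(score), 25)
            if scaled <= threshold:
                stable_count += 1
                if stable_result is None:
                    # Remember the PV from the point where the
                    # evaluation first stabilized.
                    stable_result = (score, pv)
                if stable_count >= required_stable:
                    return stable_result  # terminate early
            else:
                stable_count = 0
                stable_result = None
        prev_score = score
    return score, pv  # ran to max depth without converging
```

A real engine would also gate this on the search being a timed one, since infinite and pondering searches must run until told to stop.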

Three new UCI options control this behavior:

The stability check uses a scaled comparison: delta * 100 / max(abs(eval), 25), where delta is the absolute difference between the current and previous iteration scores. This scaling ensures the threshold adapts proportionally to the magnitude of the evaluation — a 15-centipawn shift matters more in a near-equal position than in one that is already decisive.
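Plugging numbers into the formula makes the effect of the scaling concrete. The sketch below assumes Stockfish-style integer (truncating) division; the helper name is illustrative:

```python
def scaled_delta(curr_eval: int, prev_eval: int) -> int:
    """The stability metric described above: the score change as a
    percentage of the current evaluation magnitude, with the
    denominator floored at 25 centipawns."""
    delta = abs(curr_eval - prev_eval)
    return delta * 100 // max(abs(curr_eval), 25)

# A 15 cp shift in a near-equal position is a large relative change...
print(scaled_delta(20, 5))     # 15 * 100 // 25  = 60
# ...but the same 15 cp shift in a decisive position barely registers.
print(scaled_delta(300, 285))  # 15 * 100 // 300 = 5
```

With a threshold on the order of 15, the first case resets the stability counter while the second counts as converged, which is exactly the adaptive behavior the scaling is meant to produce.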

Compilation Process

Custom build scripts handle NNUE injection and compilation for both platforms:

Both scripts:
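The NNUE injection itself relies on a standard feature of Stockfish's Makefile: passing `EVALFILE=<path>` embeds the given .nnue file instead of downloading the default network. A minimal sketch of how a build script might assemble such an invocation is shown below; the paths, ARCH value, and helper name are illustrative, not Theoria's actual scripts:

```python
import subprocess
from pathlib import Path

def build_command(src_dir: str, nnue_path: str,
                  arch: str = "x86-64-avx2", jobs: int = 8) -> list[str]:
    """Construct a `make` invocation that injects a custom network.

    Stockfish's Makefile accepts EVALFILE=<path> to embed a specific
    .nnue file at compile time. All concrete values here are examples.
    """
    nnue = Path(nnue_path).resolve()
    return [
        "make", "-C", src_dir, f"-j{jobs}",
        "profile-build",        # Stockfish's PGO build target
        f"ARCH={arch}",
        f"EVALFILE={nnue}",
    ]

# Example usage (hypothetical paths):
# subprocess.run(build_command("Stockfish/src", "epoch139.nnue"), check=True)
```

Building this way means the shipped binary needs no external network file: the Epoch 139 weights travel inside the executable.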

Analysis Package

An analysis package is available demonstrating Theoria's evaluation capabilities:

Download analysis.zip

Contents:

Download & Source Code

Theoria source code is available for download and includes full GPL v3 compliance documentation:

Visit the Download page for binary releases and source archives.

Summary

Theoria represents a complete ground-up NNUE training effort paired with a modified Stockfish 16.1 engine. Starting with the Leela96 filtered v2 dataset and the official nnue-pytorch trainer configured for SFNNv8 architecture (L1=2560), 150 epochs were trained on consumer hardware (RTX 3050). A rigorous multi-stage evaluation process involving over 35,000 games identified Epoch 139 as the optimal network. The engine modification disables the small net NNUE, ensuring all positions are evaluated exclusively by the custom-trained big net.

Version 0.2 adds stability-based early termination to the search, allowing the engine to stop iterating when evaluation has converged. Three configurable UCI options control the behavior, and the implementation preserves UCI protocol compliance by excluding infinite and pondering searches.

The project demonstrates that meaningful NNUE research and development is accessible to individual developers with modest hardware, provided they apply systematic evaluation methodology and leverage the excellent tooling maintained by the open-source community.