Training, Compiling, Testing and Analysis

Theoria16 version 0.1 is a modified Stockfish 16.1 chess engine paired with a custom NNUE neural network trained from scratch using the official nnue-pytorch trainer. A key modification disables Stockfish's "small net" NNUE, ensuring that only the custom-trained big net (L1=2560, HalfKAv2_hm^) is used for evaluation regardless of position complexity or time constraints. This document outlines the complete development process from training environment through final binary compilation.

Training Environment

Hardware:

- NVIDIA GeForce RTX 3050 GPU (consumer-grade)

Software:

- Python with PyTorch (required by the nnue-pytorch trainer)
- nnue-pytorch (official trainer, pinned to an SFNNv8-compatible commit)
- cutechess-cli (tournament management for epoch evaluation)
- Stockfish 16.1 source code (modified fork)

NNUE-PyTorch Trainer

Training was performed using the official nnue-pytorch repository, pinned to a specific commit that enables SFNNv8 architecture compatibility.
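
A typical way to pin the trainer to that commit is shown below; the hash itself is not reproduced in this document, so <commit> is a placeholder:

git clone https://github.com/official-stockfish/nnue-pytorch.git
cd nnue-pytorch
git checkout <commit>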

That commit sets the L1 layer size to 2560, the architecture used in Stockfish 16.1's SFNNv8 neural network, ensuring that trained networks are compatible with Stockfish 16.1 binaries.

Training Data

The neural network was trained on the Leela96 filtered v2 dataset, a high-quality collection of positions derived from Leela Chess Zero self-play.

Training Command

The following command was used to train the neural network for 150 epochs:

python train.py \
  /home/user/bins/train/leela96.binpack \
  /home/user/bins/validation/leela96.binpack \
  --gpus "0," --threads 8 --num-workers 8 --batch-size 8192 \
  --random-fen-skipping 3 --features=HalfKAv2_hm^ \
  --network-save-period=1 --max_epochs=150 \
  --default_root_dir /home/user/output/theoria16_HalfKAv2_hm__leela96

Key parameters:

- --features=HalfKAv2_hm^ selects the HalfKAv2_hm feature set with the factorizer enabled (the trailing ^), matching SFNNv8
- --batch-size 8192 sets the number of positions per training batch
- --random-fen-skipping 3 randomly skips positions while reading the binpack, decorrelating consecutive training samples
- --network-save-period=1 saves a checkpoint after every epoch, which makes the per-epoch evaluation below possible
- --max_epochs=150 caps training at 150 epochs

Epoch Evaluation Process

All 150 trained epochs were evaluated using cutechess-cli in a multi-stage tournament process to identify the strongest network; a representative invocation is sketched after the stage descriptions:

Stage 1 — Initial Gauntlet: 26 candidate epochs were tested against a baseline epoch, playing 100 games each. Top performers were identified by Elo gain.

Stage 2 — Round Robin: The top 8 epochs from Stage 1 competed against each other in a round-robin format, playing approximately 1,750 games each.

Stage 3 — Peak Search: Based on round-robin results, the search narrowed to the best-performing epoch range. Four epochs in this range played 6,000 games each.

Stage 4 — Final Selection: The top 2 epochs were compared in a final SPRT run. Epoch 139 emerged as the winner.
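
The exact tournament settings are not recorded above, so the following Stage 1 style gauntlet is only a sketch; engine paths, time control, opening book, and concurrency are assumptions:

cutechess-cli \
  -tournament gauntlet \
  -engine cmd=./theoria16-baseline name=baseline \
  -engine cmd=./theoria16-epoch100 name=epoch100 \
  -each proto=uci tc=10+0.1 option.Hash=64 \
  -rounds 50 -games 2 -repeat \
  -openings file=book.epd format=epd order=random \
  -concurrency 4 -pgnout stage1.pgn

In a real Stage 1 run, all 26 candidate engines would be listed after the baseline. For the Stage 4 comparison, cutechess-cli's -sprt option (for example, -sprt elo0=0 elo1=5 alpha=0.05 beta=0.05) stops the match as soon as the sequential probability ratio test reaches a decision.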

Results:

More than 35,000 games were played across the four evaluation stages. Epoch 139 was selected as the final network and is the net embedded in the released binaries.
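
The trainer saves each epoch as a .ckpt checkpoint; nnue-pytorch's serialize.py converts a checkpoint into the .nnue file the engine embeds. The checkpoint path and output name below are illustrative:

python serialize.py \
  /home/user/output/theoria16_HalfKAv2_hm__leela96/<checkpoint-dir>/epoch=139.ckpt \
  theoria16_epoch139.nnue \
  --features=HalfKAv2_hm^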

Stockfish 16.1 Fork

Theoria16 is built on a modified Stockfish 16.1 codebase. A critical modification disables the "small net" NNUE that Stockfish normally uses for quick evaluation of simple positions; in Theoria16, only the big net (L1=2560) is used for all evaluations, ensuring consistent analytical behavior. Two changes were made:

- The small-net code path in the evaluation logic is disabled, so every position is routed to the big net.
- The Makefile's automatic download of Stockfish's default NNUE file is disabled, allowing the compile scripts to inject the custom-trained Epoch 139 network at build time.

Compilation Process

Custom build scripts handle NNUE injection and compilation for both target platforms. Both scripts inject the Epoch 139 network and compile an optimized binary for their platform.
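
Stockfish's Makefile supports overriding the embedded network through its EVALFILE variable, so the injection step can be as simple as the sketch below; the network file name, architecture flag, and build target are assumptions, and the actual scripts may differ:

cd src
cp /path/to/theoria16_epoch139.nnue .
make -j profile-build ARCH=x86-64-avx2 EVALFILE=theoria16_epoch139.nnue

Here profile-build enables Stockfish's profile-guided optimization; the plain build target also works.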

Analysis Package

An analysis package is available demonstrating Theoria16's evaluation capabilities:

Download analysis.zip

Contents:

Download & Source Code

Theoria16 source code is available for download and includes full GPL v3 compliance documentation:

Visit the Download page for binary releases and source archives.

Summary

Theoria16 version 0.1 represents a complete ground-up NNUE training effort paired with a modified Stockfish 16.1 engine. Starting with the Leela96 filtered v2 dataset and the official nnue-pytorch trainer configured for SFNNv8 architecture (L1=2560), 150 epochs were trained on consumer hardware (RTX 3050). A rigorous multi-stage evaluation process involving over 35,000 games identified Epoch 139 as the optimal network. The engine modification disables the small net NNUE, ensuring all positions are evaluated exclusively by the custom-trained big net.

The project demonstrates that meaningful NNUE research and development is accessible to individual developers with modest hardware, provided they apply systematic evaluation methodology and leverage the excellent tooling maintained by the open-source community.