alexxtaurus83/AttaxTensorflow

This is my first attempt at building a project that uses a neural network based on TensorFlow.NET. The project is at an early alpha stage of development. About 80% of the code was generated by Gemini AI through its chat interface, and the project was developed in the Windsurf AI IDE (https://windsurf.com). Please use it with caution: there are many issues and bugs, although the project compiles and runs. The main problem areas are:

  • Compatibility between TensorFlow.NET and its dependent packages.
  • The code needs to be tested to prevent RAM and GPU memory leaks.
  • The initial training data file can be ~20 GB and needs to be optimized.

1. High-Level Summary

This document provides a technical overview of the Ataxx AI Training Solution, a distributed multi-project system designed to train a high-performance game-playing agent for the game of Ataxx.

The architecture follows a modern, AlphaZero-like approach, creating a self-improvement "flywheel." In this loop, the AI generates its own training data through self-play, a trainer consumes this data to produce stronger models, and an evaluator promotes the best-performing models. This process allows the AI to continuously improve its strategic understanding of the game.

The system is composed of four distinct but interconnected projects, communicating through a central web API and a shared file system, enabling it to scale across multiple machines for efficient, parallelized training.


2. System Architecture & Training Workflow

The solution operates on a distributed Controller/Worker pattern. The core components work in a continuous cycle:

  1. Self-Play: SelfPlayWorker instances fetch the current best model from the Controller. They play thousands of games against themselves, using a Monte Carlo Tree Search (MCTS) guided by the model's predictions.
  2. Data Aggregation: The results of every game (each move, the board state, the MCTS policy, and the final game outcome) are logged as (State, Policy, Value) tuples into a central training_data.jsonl file on a shared drive.
  3. Training: The Trainer application continuously monitors the shared drive. It loads the new game data, preprocesses it into tensors, and uses it to train the neural network, producing a new candidate model.
  4. Evaluation: The Controller detects the new candidate model. It orchestrates a head-to-head match of ~100 games between the current best model and the new candidate.
  5. Promotion: If the candidate model wins the evaluation match by a statistically significant margin, the Controller promotes it to become the new best model.

This cycle then repeats, with the now-stronger model generating higher-quality data for the next round of training.
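
The promotion step depends on what counts as a "statistically significant margin", which is not defined here. The following is a minimal sketch in Python (for illustration only; the project itself is written in C#), assuming a one-sided exact binomial test against a 50% win rate with draws ignored. The function names and the 0.05 threshold are hypothetical, not taken from the Controller's code.

```python
from math import comb


def binomial_p_value(wins: int, games: int) -> float:
    """One-sided p-value: chance of at least `wins` wins in `games` fair coin flips."""
    return sum(comb(games, k) for k in range(wins, games + 1)) / 2 ** games


def should_promote(candidate_wins: int, best_wins: int, alpha: float = 0.05) -> bool:
    """Promote only if the candidate's win count is unlikely under equal strength.

    Assumption: draws are ignored, so only decisive games are counted.
    """
    decisive = candidate_wins + best_wins
    if decisive == 0:
        return False  # no decisive games, nothing to conclude
    return binomial_p_value(candidate_wins, decisive) < alpha


print(should_promote(60, 40))  # 60 wins out of 100 decisive games
print(should_promote(55, 45))  # 55 wins out of 100 decisive games
```

Under these assumptions, 60 wins out of 100 decisive games gives a p-value of roughly 0.03 and would trigger promotion, while 55 wins would not. A simpler fixed win-rate threshold is another common choice in AlphaZero-style pipelines.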


3. Project Breakdown

3.1. Ataxx.Core (The Foundation)

This is the foundational class library shared by all other projects in the solution. It contains the essential logic and data structures for the game and the AI.

  • Purpose: To provide a single, reusable engine for game mechanics, AI search, and neural network interaction.
  • Key Components: AtaxxLogic, BitboardState, MctsEngine, MCTSNode, PredictionService, TrainingGameLog.
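
To illustrate how a search engine such as MctsEngine can be guided by network predictions, here is a minimal Python sketch of child selection (illustrative only; the project is C#). It assumes an AlphaZero-style PUCT rule. This document does not state which formula MctsEngine or MCTSNode actually use, so the `Node` fields, `select_child`, and the `c_puct` value are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float              # P(s, a): probability assigned by the policy head
    visit_count: int = 0      # N(s, a): number of simulations through this child
    value_sum: float = 0.0    # backed-up values, assumed to be from the parent's side
    children: dict = field(default_factory=dict)  # move -> Node

    def mean_value(self) -> float:
        """Q(s, a); unvisited children are treated as 0."""
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def select_child(parent: Node, c_puct: float = 1.5):
    """Return the (move, child) pair that maximises Q + U."""
    total_visits = sum(ch.visit_count for ch in parent.children.values())

    def score(child: Node) -> float:
        # The +1 under the square root keeps the prior active on the first visit.
        u = c_puct * child.prior * math.sqrt(total_visits + 1) / (1 + child.visit_count)
        return child.mean_value() + u

    return max(parent.children.items(), key=lambda kv: score(kv[1]))


root = Node(prior=1.0, children={"a": Node(prior=0.8), "b": Node(prior=0.2)})
move, _ = select_child(root)  # with no visits yet, the higher prior wins
```

The exploration term shrinks as a child accumulates visits, so the search gradually shifts from trusting the policy prior to trusting the measured values.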

3.2. Ataxx.SelfPlayWorker (The Data Generator)

This is a console application responsible for the "Self-Play" phase of the training loop. Multiple instances can be run in parallel.

  • Purpose: To generate high-quality training data by playing games using the current best AI model.
  • Key Components: SelfPlayJob, GameSimulator.
  • Interactions: Calls the Ataxx.Controller API to get the latest model and writes game logs to the shared drive.
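
As a rough picture of the data a worker produces, the Python sketch below (illustrative only; the project is C#) labels each position of a finished game with the final outcome and appends the records as JSON Lines. The field names `state`, `policy`, and `value`, the helper names, and the sign convention for the value are assumptions. The actual TrainingGameLog layout is not shown in this document.

```python
import json


def label_game(positions, winner):
    """Turn one finished game into (State, Policy, Value) records.

    positions: list of (state, policy, player_to_move) collected during play
    winner:    +1 or -1 for the winning player, 0 for a draw
    """
    records = []
    for state, policy, player in positions:
        # Assumed convention: value is from the side of the player to move.
        value = 0.0 if winner == 0 else (1.0 if winner == player else -1.0)
        records.append({"state": state, "policy": policy, "value": value})
    return records


def append_records(path, records):
    """Append records in JSON Lines format: one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

Several workers appending to one file on a network share may need locking or per-worker files. The sketch does not address that.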

3.3. Ataxx.Trainer (The Learner)

This console application is the heart of the learning process, designed to run on a machine with a powerful GPU.

  • Purpose: To train new, improved neural network models from the data generated by the SelfPlayWorkers.
  • Key Components: ModelTrainer, DataPreprocessor, TrainingJob.
  • Interactions: Reads game logs from the shared drive and writes new _candidate models back to it.
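
Because the training data file can reach tens of gigabytes, reading it in one pass into memory is risky. The Python sketch below (illustrative only; the project is C#) streams a JSON Lines file one record at a time and groups records into fixed-size batches. `iter_records` and `iter_batches` are hypothetical helpers, not part of DataPreprocessor.

```python
import json
from itertools import islice


def iter_records(path):
    """Yield one parsed record at a time, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def iter_batches(records, batch_size):
    """Group any iterable of records into lists of at most batch_size items."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Usage (the path is a placeholder):
# for batch in iter_batches(iter_records("training_data.jsonl"), 256):
#     ...convert the batch to tensors and run one training step...
```

Only one batch is held in memory at a time, so peak memory depends on the batch size, not on the file size.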

3.4. Ataxx.Controller (The Coordinator)

This is an ASP.NET Core web application that acts as the central coordinator for the entire distributed system.

  • Purpose: To manage the model registry and orchestrate the evaluation process.
  • Key Components: ModelController, ModelRegistryService, EvaluationJob.
  • Interactions: Manages model files on the shared drive and responds to API requests from workers.

4. Neural Network Architecture

The "brain" of the AI is a deep neural network, implemented in Ataxx.Trainer, with an architecture inspired by AlphaZero. This design allows the network to learn complex spatial patterns and game strategies directly from the board state.

  • Input Tensor: The network takes a 7x7x4 tensor as input, representing the complete state of the game from the current player's perspective.

    • Channel 1: A plane with 1s representing the current player's pieces, 0s otherwise.
    • Channel 2: A plane with 1s representing the opponent's pieces, 0s otherwise.
    • Channel 3: A plane indicating the positions of permanently blocked squares.
    • Channel 4: A plane filled entirely with a constant value indicating whose turn it is, providing the model with context.
  • Network Body: The core of the network consists of several convolutional layers (Conv2D). These layers are exceptionally effective at recognizing spatial patterns and relationships between pieces on the 7x7 game board.

  • Dual-Output Heads: The network has two distinct outputs, which are trained simultaneously:

    1. Policy Head: A vector of 1176 probabilities (49 'from' squares × 24 possible moves per square), produced by a Softmax activation. This head predicts a probability distribution over moves from the current state. It is used by the MCTS engine to guide its search towards more promising actions.
    2. Value Head: A single scalar value, processed through a Tanh activation to be between -1 and 1. This head predicts the expected outcome of the game from the current state (-1 = likely loss, +1 = likely win). This is used to evaluate leaf nodes in the MCTS, replacing the need for random rollouts.
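
The input and policy layouts described above can be sketched in plain Python (illustrative only; the project is C#). The channel order follows the list above. The ordering of the 24 move offsets and the `[row][col][channel]` indexing are assumptions, and `encode_state` and `move_to_index` are hypothetical names.

```python
BOARD = 7

# The 24 relative moves: every offset within distance 2, except staying put.
# (8 adjacent squares for clone moves plus 16 squares at distance 2 for jumps.)
OFFSETS = [(dr, dc) for dr in range(-2, 3) for dc in range(-2, 3) if (dr, dc) != (0, 0)]


def encode_state(own, opp, blocked, turn_value):
    """Build a 7x7x4 nested list indexed as [row][col][channel].

    own, opp, blocked are sets of (row, col) squares; turn_value fills channel 4.
    """
    return [[[1.0 if (r, c) in own else 0.0,
              1.0 if (r, c) in opp else 0.0,
              1.0 if (r, c) in blocked else 0.0,
              float(turn_value)]
             for c in range(BOARD)] for r in range(BOARD)]


def move_to_index(from_sq, to_sq):
    """Map a move to a slot in the 1176-long policy vector (49 * 24)."""
    (fr, fc), (tr, tc) = from_sq, to_sq
    offset = OFFSETS.index((tr - fr, tc - fc))  # ValueError if farther than 2 squares
    return (fr * BOARD + fc) * len(OFFSETS) + offset


state = encode_state(own={(0, 0)}, opp={(6, 6)}, blocked={(3, 3)}, turn_value=1)
index = move_to_index((0, 0), (0, 1))  # slot for a clone move one square to the right
```

Some slots correspond to moves that leave the board or land on occupied squares, so an implementation would typically mask illegal moves before sampling from the policy.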

5. Distributed Deployment Scenarios

The system is designed for flexible deployment across multiple machines to maximize training efficiency. Communication is handled via the ASP.NET Core API for control and a shared network drive (e.g., a Samba share) for high-volume data transfer.

5.1. Three-Machine Deployment (Maximum Performance)

This is the ideal configuration, assigning specialized roles to each machine.

  • Machine #1 (24-core Xeon CPU): Controller & Data Hub
    • Responsibilities: Hosts the Ataxx.Controller API, manages the shared data drive, and runs CPU-based instances of the Ataxx.SelfPlayWorker to contribute to data generation.
  • Machine #2 (Mid-level GPU): Self-Play & Evaluation Worker
    • Responsibilities: Its primary role is to run the Ataxx.SelfPlayWorker, leveraging its GPU for fast network inference during MCTS. Its secondary role is to perform the evaluation matches between candidate and best models when tasked by the Controller.
  • Machine #3 (High-spec GPU): Primary Training Worker
    • Responsibilities: This machine's sole focus is running the Ataxx.Trainer application. It continuously ingests data from the shared drive and uses its powerful GPU for the heavy lifting of network training.

5.2. Two-Machine Deployment (Simplified & Flexible)

The system is fully functional in a two-machine setup, consolidating roles effectively.

  • Machine #1 (24-core Xeon CPU): The "Controller & Thinker"
    • Responsibilities: Runs the Ataxx.Controller API, hosts the shared drive, runs CPU-based SelfPlayWorker instances, and takes on the role of the Evaluation Machine to compare models.
  • Machine #2 (High-spec GPU): The "Trainer & Power-Player"
    • Responsibilities: Runs the Ataxx.Trainer application to handle all network training. It also runs the Ataxx.SelfPlayWorker, using its GPU to generate high-quality game data at high speed.

6. How to Run the System

To start the full AI training pipeline, the applications should be launched in the following order:

  1. Start the Controller: Run the Ataxx.Controller web application on the designated machine.
  2. Start the Trainer: On the primary GPU machine, run the Ataxx.Trainer console application.
  3. Start the Self-Play Workers: On all participating machines, run instances of the Ataxx.SelfPlayWorker console application.

Once all components are running, the system is fully operational and will autonomously work to improve the Ataxx AI.

About

Ataxx game clone using a neural network based on TensorFlow.NET.
