A from-scratch neural network library implemented in C++ with CUDA support. This project demonstrates a Multi-Layer Perceptron applied to the XOR problem and MNIST digit classification, built without external ML frameworks.
A detailed discussion of the mathematical derivations and implementation can be found on my blog.
- Pure C++23 implementation with no dependencies (except CUDA for GPU support)
- Custom tensor library with broadcasting and matrix operations
- Manual backpropagation: Hand-coded gradient computation for all operations
- Neural network modules: Linear layers, ReLU activation, MLP
- Loss functions: Cross-entropy and binary cross-entropy
- Python prototype for validation
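To make "manual backpropagation" concrete, here is a minimal sketch of a linear layer and ReLU with hand-written backward passes. It is not the repository's code: the `Vec` alias, the `Linear` struct, and the `relu` / `relu_backward` helpers are hypothetical names chosen for illustration, and a plain `std::vector<double>` stands in for the custom tensor type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the project's tensor type.
using Vec = std::vector<double>;

// Illustrative linear layer y = W x + b with a hand-coded backward pass.
struct Linear {
    std::size_t in, out;
    Vec W;        // weights, row-major, shape out x in
    Vec b;        // bias, shape out
    Vec dW, db;   // gradients, same shapes as W and b
    Vec x_cache;  // input saved by forward() for use in backward()

    Linear(std::size_t in_, std::size_t out_)
        : in(in_), out(out_), W(in_ * out_, 0.0), b(out_, 0.0),
          dW(in_ * out_, 0.0), db(out_, 0.0) {}

    Vec forward(const Vec& x) {
        assert(x.size() == in);
        x_cache = x;
        Vec y(b);  // start from the bias
        for (std::size_t i = 0; i < out; ++i)
            for (std::size_t j = 0; j < in; ++j)
                y[i] += W[i * in + j] * x[j];
        return y;
    }

    // Takes dL/dy, stores dL/dW = dy * x^T and dL/db = dy,
    // and returns dL/dx = W^T * dy.
    Vec backward(const Vec& dy) {
        assert(dy.size() == out);
        Vec dx(in, 0.0);
        for (std::size_t i = 0; i < out; ++i) {
            db[i] = dy[i];
            for (std::size_t j = 0; j < in; ++j) {
                dW[i * in + j] = dy[i] * x_cache[j];
                dx[j] += W[i * in + j] * dy[i];
            }
        }
        return dx;
    }
};

// ReLU forward: max(0, x) element-wise.
Vec relu(const Vec& x) {
    Vec y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] > 0.0 ? x[i] : 0.0;
    return y;
}

// ReLU backward: the gradient passes through only where the input was positive.
Vec relu_backward(const Vec& x, const Vec& dy) {
    assert(x.size() == dy.size());
    Vec dx(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) dx[i] = x[i] > 0.0 ? dy[i] : 0.0;
    return dx;
}
```

Chaining `forward` calls from the first layer to the last and `backward` calls in reverse order gives the full gradient of an MLP without any automatic differentiation.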
torchless-xor/
├── cpp/ # C++ implementation (primary)
├── python/ # Python reference implementation
└── data/ # Datasets (MNIST)
Requirements:
- C++23 compatible compiler (GCC 12+, Clang 15+)
- CMake 3.18+
- CUDA Toolkit (optional, for GPU support)
Build and run:
cd cpp
mkdir build && cd build
cmake ..
make
# Run examples
./xor
./mnist
# Run tests
./tests

Build with CUDA:
cmake -DUSE_CUDA=ON ..
make

Python prototype requirements:
- Python 3.13+
- Dependencies managed via uv
Run:
cd prototype
uv sync
# Execute training scripts
python -m src.xor
python -m src.mnist

XOR: trains a simple MLP to learn the XOR function with noise tolerance.
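One plausible reading of "noise tolerance" is that the inputs are the four XOR corners with some jitter added. The sketch below generates such data under that assumption. `Sample`, `make_noisy_xor`, and the Gaussian noise model are hypothetical and may differ from what the repository actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sample type: two real-valued inputs and a 0/1 label.
struct Sample {
    double x0, x1;
    int label;
};

// Draws n points around the corners of the unit square and labels each
// with the XOR of the clean corner bits. sigma is the standard deviation
// of the assumed Gaussian jitter and must be positive.
std::vector<Sample> make_noisy_xor(std::size_t n, double sigma, unsigned seed) {
    assert(sigma > 0.0);  // required by std::normal_distribution
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> bit(0, 1);
    std::normal_distribution<double> noise(0.0, sigma);

    std::vector<Sample> data;
    data.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const int a = bit(rng);
        const int b = bit(rng);
        data.push_back({a + noise(rng), b + noise(rng), a ^ b});
    }
    return data;
}
```

With a small `sigma` the four clusters stay well separated, so a network that fits the clean XOR table should also classify the jittered points correctly.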
MNIST: trains a neural network on the MNIST handwritten digit dataset.
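For ten-class digit classification, the cross-entropy loss is typically applied to softmax outputs, and its gradient with respect to the logits has the simple closed form softmax(z) minus the one-hot target. The following is a minimal sketch of that computation. The function names are hypothetical and the repository's tensor-based version will look different.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtract the max logit before exponentiating.
std::vector<double> softmax(const std::vector<double>& logits) {
    assert(!logits.empty());
    const double m = *std::max_element(logits.begin(), logits.end());
    std::vector<double> p(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        p[i] = std::exp(logits[i] - m);
        sum += p[i];
    }
    for (double& v : p) v /= sum;
    return p;
}

// Cross-entropy loss for a single example with an integer class label.
double cross_entropy(const std::vector<double>& logits, std::size_t target) {
    assert(target < logits.size());
    return -std::log(softmax(logits)[target]);
}

// Gradient of the loss with respect to the logits: softmax(z) - one_hot(target).
std::vector<double> cross_entropy_grad(const std::vector<double>& logits,
                                       std::size_t target) {
    assert(target < logits.size());
    std::vector<double> g = softmax(logits);
    g[target] -= 1.0;
    return g;
}
```

For an untrained network that outputs equal logits for all ten digits, the loss is ln(10), roughly 2.30, which is a useful sanity check at the start of training.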
For a detailed walkthrough of the implementation, including mathematical derivations and design decisions, visit: jakobkaiser.com/blog/torchless-xor-mnist