This project implements a simple Multilayer Perceptron (MLP), a fundamental type of neural network commonly used in machine learning for classification tasks. Built from scratch using NumPy, it demonstrates core concepts like forward/backward propagation, gradient descent, and various activation functions—without relying on high-level frameworks like TensorFlow or PyTorch.
- Custom Neural Network: Fully-connected layers with configurable architecture
- Multiple Activation Functions: ReLU, Sigmoid, Softmax
- Optimizers: Gradient Descent, Adam
- Regularization: L1 (Lasso) and L2 (Ridge) regularization
- Training Features: Mini-batch training, early stopping, gradient clipping
- Configuration: Model setup via `.pkl` configuration files
- Visualization: Automatic metric plotting (loss, accuracy, precision, recall)
- Fast Setup: Uses `uv` for lightning-fast dependency management
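The three activation functions listed above can be sketched in NumPy as follows (a minimal illustration, not the project's actual implementation):

```python
import numpy as np

def relu(z):
    # ReLU: max(0, z), applied element-wise
    return np.maximum(0.0, z)

def sigmoid(z):
    # Logistic sigmoid: squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability;
    # each row of the result sums to 1 and can be read as class probabilities.
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)
```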
- Python 3.10+
- uv (recommended) or pip
```shell
git clone https://github.com/lucas-ht/multilayer-perceptron.git
cd multilayer-perceptron
uv sync
```

For detailed configuration options and advanced usage, see the complete documentation.
```shell
uv run mlp.py split data/data.csv --test-size 0.2
uv run mlp.py train data/train.csv model.test.pkl
uv run mlp.py predict data/test.csv models/model.json
```

The MLP consists of:
- Input Layer: Accepts feature vectors from the dataset
- Hidden Layers: Learns non-linear representations using activation functions
- Output Layer: Produces predictions (typically with softmax for classification)
Each layer performs:
- Linear transformation: $z = Wx + b$
- Activation: $a = \sigma(z)$
- Backpropagation: Gradients computed via chain rule
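A single fully-connected layer's forward and backward steps can be sketched in NumPy as below. The `DenseLayer` class and its names are illustrative assumptions, not the project's actual API; ReLU is used as the example activation.

```python
import numpy as np

class DenseLayer:
    """Illustrative fully-connected layer: z = x @ W + b, a = relu(z)."""

    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights; biases start at zero
        self.W = rng.standard_normal((n_in, n_out)) * 0.01
        self.b = np.zeros(n_out)

    def forward(self, x):
        self.x = x                          # cache input for backprop
        self.z = x @ self.W + self.b        # linear transformation
        return np.maximum(0.0, self.z)      # ReLU activation

    def backward(self, grad_a):
        # Chain rule: dL/dz = dL/da * relu'(z)
        grad_z = grad_a * (self.z > 0)
        self.grad_W = self.x.T @ grad_z     # dL/dW, used by the optimizer
        self.grad_b = grad_z.sum(axis=0)    # dL/db
        return grad_z @ self.W.T            # dL/dx, passed to the previous layer
```

Stacking several such layers and chaining their `backward` outputs is all backpropagation requires.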
- Forward propagation: Pass inputs through layers to compute predictions
- Loss calculation: Measure error using cross-entropy loss
- Backpropagation: Compute gradients of loss w.r.t. weights
- Weight update: Apply optimizer (GD or Adam) to minimize loss
- Regularization: Optional L1/L2 penalties to prevent overfitting
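The loop above can be sketched for a single softmax layer trained with plain gradient descent. Shapes, the learning rate, and the function name are illustrative assumptions:

```python
import numpy as np

def train_step(W, b, x, y_onehot, lr=0.1):
    """One forward/backward/update pass for a softmax classifier (sketch)."""
    # Forward propagation
    logits = x @ W + b
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

    # Cross-entropy loss, averaged over the mini-batch
    n = x.shape[0]
    loss = -np.sum(y_onehot * np.log(probs + 1e-12)) / n

    # Backpropagation: for softmax + cross-entropy, dL/dlogits = probs - y
    grad_logits = (probs - y_onehot) / n
    grad_W = x.T @ grad_logits
    grad_b = grad_logits.sum(axis=0)

    # Weight update (vanilla gradient descent; Adam would track moments too)
    W -= lr * grad_W
    b -= lr * grad_b
    return loss
```

An L2 penalty would add `lam * W` to `grad_W` before the update; early stopping monitors a validation loss and halts when it stops improving.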
Key hyperparameters (configurable in `.pkl` files):
- Learning rate
- Batch size
- Number of epochs
- Layer sizes and activation functions
- Regularization strength
- Early stopping patience
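A configuration covering these hyperparameters might look like the following. The key names and values are assumptions for illustration, not the project's actual schema; only the `.pkl` (pickle) format comes from the source.

```python
import pickle

# Hypothetical hyperparameter configuration; key names are illustrative.
config = {
    "layers": [(30, "relu"), (16, "relu"), (2, "softmax")],  # (size, activation)
    "learning_rate": 0.01,
    "batch_size": 32,
    "epochs": 100,
    "l2_lambda": 1e-4,             # regularization strength
    "early_stopping_patience": 10,
}

# Serialize the configuration to a pickle file
with open("model.config.pkl", "wb") as f:
    pickle.dump(config, f)
```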
This project is part of the 42 School curriculum. Special thanks to the 42 School community for their support and resources.