
Implementation of a multilayer perceptron for classification tasks, as part of the 42 School multilayer-perceptron project.


multilayer-perceptron
multilayer-perceptron: simple neural network

This project implements a simple Multilayer Perceptron (MLP), a fundamental type of neural network commonly used in machine learning for classification tasks. Built from scratch using NumPy, it demonstrates core concepts like forward/backward propagation, gradient descent, and various activation functions—without relying on high-level frameworks like TensorFlow or PyTorch.

Features

  • Custom Neural Network: Fully-connected layers with configurable architecture
  • Multiple Activation Functions: ReLU, Sigmoid, Softmax
  • Optimizers: Gradient Descent, Adam
  • Regularization: L1 (Lasso) and L2 (Ridge) regularization
  • Training Features: Mini-batch training, early stopping, gradient clipping
  • Configuration: Model setup via .pkl configuration files
  • Visualization: Automatic metric plotting (loss, accuracy, precision, recall)
  • Fast Setup: Uses uv for lightning-fast dependency management
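As a quick illustration of the three activation functions listed above, here is a minimal NumPy sketch (the function names are illustrative; the project's actual module layout may differ):

```python
import numpy as np

def relu(z):
    # Zero out negative inputs, pass positives through unchanged.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squash inputs into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)
```

Softmax rows sum to one, which is why it is the usual choice for the output layer of a classifier.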

Installation & Usage

Prerequisites

  • Python 3.10+
  • uv (recommended) or pip

Quick Install

git clone https://github.com/lucas-ht/multilayer-perceptron.git
cd multilayer-perceptron
uv sync

For detailed configuration options and advanced usage, see the complete documentation.

Quick Start

1. Split your dataset

uv run mlp.py split data/data.csv --test-size 0.2

2. Train the model

uv run mlp.py train data/train.csv model.test.pkl

3. Make predictions

uv run mlp.py predict data/test.csv models/model.json

Architecture

The MLP consists of:

  • Input Layer: Accepts feature vectors from the dataset
  • Hidden Layers: Learn non-linear representations using activation functions
  • Output Layer: Produces predictions (typically with softmax for classification)

Each layer performs:

  1. Linear transformation: $z = Wx + b$
  2. Activation: $a = \sigma(z)$
  3. Backpropagation: Gradients computed via chain rule
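The three steps above can be sketched as a single fully-connected layer in NumPy. This is a hedged illustration, not the project's actual class (a ReLU activation and a plain gradient-descent update are assumed for concreteness):

```python
import numpy as np

class DenseLayer:
    """One fully-connected layer: z = x @ W + b, then a = ReLU(z)."""

    def __init__(self, n_in, n_out, rng=None):
        rng = rng or np.random.default_rng(0)
        # He-style initialization, a common choice for ReLU layers.
        self.W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        self.x = x                          # cache input for backprop
        self.z = x @ self.W + self.b        # 1. linear transformation
        return np.maximum(0.0, self.z)      # 2. activation

    def backward(self, grad_a, lr=0.01):
        # 3. chain rule: d(loss)/dz = d(loss)/da * ReLU'(z)
        grad_z = grad_a * (self.z > 0)
        grad_W = self.x.T @ grad_z
        grad_b = grad_z.sum(axis=0)
        grad_x = grad_z @ self.W.T          # gradient passed to the previous layer
        self.W -= lr * grad_W               # plain gradient-descent update
        self.b -= lr * grad_b
        return grad_x
```

Caching `x` and `z` during the forward pass is what makes the backward pass cheap: each gradient reuses values already computed.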

Training Process

  1. Forward propagation: Pass inputs through layers to compute predictions
  2. Loss calculation: Measure error using cross-entropy loss
  3. Backpropagation: Compute gradients of loss w.r.t. weights
  4. Weight update: Apply optimizer (GD or Adam) to minimize loss
  5. Regularization: Optional L1/L2 penalties to prevent overfitting
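The five steps above can be condensed into one epoch of mini-batch training for a one-hidden-layer network. This is a self-contained sketch (ReLU hidden layer, softmax output, cross-entropy loss, optional L2 penalty), not the project's actual training code:

```python
import numpy as np

def train_epoch(W1, b1, W2, b2, X, Y, lr=0.1, batch_size=32, l2=0.0):
    """One epoch of mini-batch gradient descent. Y is one-hot encoded.
    Weights are updated in place; returns the last batch's loss."""
    n = X.shape[0]
    order = np.random.default_rng(0).permutation(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx], Y[idx]
        m = len(idx)

        # 1. Forward propagation
        z1 = xb @ W1 + b1
        a1 = np.maximum(0.0, z1)                 # ReLU hidden layer
        z2 = a1 @ W2 + b2
        z2 -= z2.max(axis=1, keepdims=True)      # stabilise softmax
        probs = np.exp(z2) / np.exp(z2).sum(axis=1, keepdims=True)

        # 2. Cross-entropy loss (tracked for monitoring)
        loss = -np.log(probs[np.arange(m), yb.argmax(axis=1)] + 1e-12).mean()

        # 3. Backpropagation: softmax + cross-entropy combine
        #    into the simple initial gradient (probs - targets) / m
        dz2 = (probs - yb) / m
        dW2 = a1.T @ dz2 + l2 * W2               # 5. optional L2 penalty
        db2 = dz2.sum(axis=0)
        da1 = dz2 @ W2.T
        dz1 = da1 * (z1 > 0)                     # ReLU derivative
        dW1 = xb.T @ dz1 + l2 * W1
        db1 = dz1.sum(axis=0)

        # 4. Weight update (plain gradient descent; Adam would
        #    additionally track first and second moment estimates)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return loss
```

Early stopping and gradient clipping slot naturally into this loop: clip each gradient before the update, and stop when the validation loss has not improved for a set number of epochs.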

Key hyperparameters (configurable in .pkl files):

  • Learning rate
  • Batch size
  • Number of epochs
  • Layer sizes and activation functions
  • Regularization strength
  • Early stopping patience
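To show the kind of hyperparameters such a .pkl file might carry, here is a hypothetical example using Python's pickle module. All keys and values are illustrative; the project defines its own configuration schema:

```python
import pickle

# Illustrative hyperparameter set; key names are assumptions,
# not the project's actual schema.
config = {
    "layers": [30, 24, 24, 2],                   # input, hidden, output sizes
    "activations": ["relu", "relu", "softmax"],  # one per non-input layer
    "learning_rate": 0.01,
    "batch_size": 32,
    "epochs": 100,
    "l2": 1e-4,                                  # regularization strength
    "early_stopping_patience": 10,               # epochs without improvement
}

with open("model.config.pkl", "wb") as f:
    pickle.dump(config, f)
```

Pickle stores the dictionary as-is, so loading it back gives the exact same Python object to configure the model with.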

Acknowledgements

This project is part of the 42 School curriculum. Special thanks to the 42 School community for their support and resources.
