
AI-Spring2022-CA5-p1-Feed-Forward-NN-from-Scratch

[Feed Forward Neural Networks] - Artificial Intelligence Course - University of Tehran - Dr. Yaghoubzade

In the first part of this project, we implement a Feed Forward Neural Network from scratch using the NumPy library. In the second part, building on the code from the first part, we train several neural networks and examine how different hyperparameters affect the learning process.

Dataset and Preprocessing

  • Inputs: pickled arrays stored under dataset/data.pkl (RGB images) and dataset/labels.pkl (class ids). The notebook expects both files to be present relative to its directory.
  • Split: train_test_split with an 80/20 split (random_state=40).
  • Resizing: every image is resized to 25x25 using OpenCV to keep the fully connected network compact.
  • Normalization: flattened pixel values are scaled from [0, 255] to [0, 1] before being wrapped into DataFrames for compatibility with the custom dataloader.
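The normalization and split steps above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the random stand-in array replaces the real images (which the notebook loads from dataset/data.pkl and resizes with OpenCV), and the variable names are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the unpickled dataset: 10 flattened 25x25 images with raw
# [0, 255] pixel values. In the notebook the images come from
# dataset/data.pkl and are resized to 25x25 with OpenCV first.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(10, 25 * 25)).astype(float)
labels = np.arange(10)

# Normalize pixel values from [0, 255] to [0, 1].
X = raw / 255.0

# 80/20 split with the notebook's fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=40
)
print(X_train.shape)  # (8, 625)
```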

Implementation Highlights

The notebook implements the full training stack instead of relying on deep-learning frameworks:

  1. Dataloader – converts labels to one-hot vectors (via sklearn.preprocessing.OneHotEncoder), supports optional shuffling, and yields NumPy matrices in configurable batch sizes.
  2. Activation functions – Identical, ReLU, Leaky ReLU, Sigmoid, Softmax, and Tanh are coded with both value and derivative methods so they can plug directly into backpropagation.
  3. CrossEntropy loss – computes softmax cross-entropy and its gradient to be used during training and evaluation.
  4. Layer – handles affine transformations, activation calls, caches, and weight/bias updates with either uniform or normal initialization.
  5. FeedForwardNN – manages the ordered layer stack, orchestrates forward passes, training loops, accuracy calculations, and the backward sweep across layers.
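To make the CrossEntropy item concrete, here is a minimal sketch of softmax cross-entropy with its gradient, using the standard identity d(loss)/d(logits) = softmax(logits) − one_hot(target). The function names and shapes are assumptions for illustration, not the notebook's exact API.

```python
import numpy as np

def softmax(z):
    # Shift by the row max for numerical stability before exponentiating.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, one_hot):
    # Mean negative log-likelihood over the batch, plus the gradient
    # with respect to the logits (softmax probabilities minus targets).
    p = softmax(logits)
    loss = -np.mean(np.sum(one_hot * np.log(p + 1e-12), axis=1))
    grad = (p - one_hot) / logits.shape[0]
    return loss, grad

logits = np.array([[2.0, 1.0, 0.1]])
target = np.array([[1.0, 0.0, 0.0]])
loss, grad = cross_entropy(logits, target)
```

Because each softmax row and each one-hot row both sum to 1, every row of the gradient sums to zero, which is a quick sanity check for a hand-rolled implementation.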

Training Workflow

  1. Initialize the train and test loaders:
    INPUT_SHAPE = 25 * 25
    TRAINLOADER = []
    for batch in Dataloader(X_train, y_train, 10, batch_size=32):
        TRAINLOADER.append(batch)
    # TESTLOADER is built the same way from X_test and y_test.
  2. Build the model:
    network = FeedForwardNN(INPUT_SHAPE)
    network.add_layer(n_neurons=20, activation=Relu())
    network.add_layer(10, activation=Identical())
    network.set_training_param(loss=CrossEntropy(), lr=0.001)
  3. Fit and log metrics:
    log = network.fit(epochs=15, trainloader=TRAINLOADER, testloader=TESTLOADER)

The log dictionary captures train/test accuracy and loss per epoch, enabling comparisons between experiments.
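Given such a log, the per-epoch curves can be plotted with a few lines of matplotlib. The key names below are assumptions about the log's shape (the notebook may use different ones), and the values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe outside Jupyter
import matplotlib.pyplot as plt

# Hypothetical log in the shape fit() is described to return:
# per-epoch train/test accuracies (key names are assumptions).
log = {
    "train_accuracy": [0.31, 0.45, 0.52],
    "test_accuracy":  [0.30, 0.42, 0.48],
}

epochs = range(1, len(log["train_accuracy"]) + 1)
plt.plot(epochs, log["train_accuracy"], label="train")
plt.plot(epochs, log["test_accuracy"], label="test")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("accuracy.png")
```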

Hyperparameter Experiments

  • Weight initialization – the notebook explains why zero initialization collapses learning (symmetry and zero gradients) and instead opts for uniform/normal sampling.
  • Learning-rate sweep – tests [0.0005, 0.0055, 0.0105, 0.0155, 0.0205] for five epochs each. The best compromise for this dataset was 0.0005; pushing it as high as 15 quickly destabilizes accuracy despite fast loss updates (see 2022-06-11-21-44-04.png for the plotted comparison).
  • Activation swaps – repeats training with Sigmoid, Tanh, and Leaky ReLU hidden layers. The discussion covers saturation/vanishing gradients for Sigmoid/Tanh and the stability benefits of Leaky ReLU.
  • Batch-size comparison – contrasts mini-batch sizes 16 and 256, showing the trade-off between noisy gradients (small batch) and sluggish, memory-heavy updates (large batch). The resulting accuracies dictionary stores train/test values for each size.
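The zero-initialization point in the first bullet can be demonstrated in a few lines of NumPy. This is an illustrative sketch, not the notebook's code: with all-zero weights every hidden neuron computes the same (zero) pre-activation, so under ReLU every activation and every gradient is identical across neurons and the symmetry never breaks, whereas uniform sampling gives each neuron a distinct starting point.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))  # a small batch: 4 samples, 5 features

# Zero initialization: every hidden neuron produces identical output,
# so every neuron would also receive an identical gradient.
W_zero = np.zeros((5, 3))
h_zero = np.maximum(0.0, x @ W_zero)  # ReLU activations: all exactly zero

# Uniform initialization (the notebook's fix): neurons start distinct.
W_uniform = rng.uniform(-0.1, 0.1, size=(5, 3))
h_uniform = np.maximum(0.0, x @ W_uniform)
```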

Reproducing the Notebook

  1. Install dependencies
    python3 -m venv .venv
    source .venv/bin/activate
    pip install numpy pandas matplotlib scikit-learn opencv-python
  2. Place the dataset under dataset/ so that data.pkl and labels.pkl are available.
  3. Launch Jupyter
    pip install notebook
    jupyter notebook AI-CA5-P1-TODO.ipynb
  4. Run all cells sequentially to regenerate figures, metrics, and the final experiments.

Repository Layout

  • AI-CA5-P1-TODO.ipynb – the full narrative, code, and plots.
  • dataset/ – expected location for the pickled data files (not tracked in version control).
  • 2022-06-11-21-44-04.png – captured learning-rate comparison figure referenced in the notebook.

Next Steps

  • Extend FeedForwardNN with regularization (dropout, weight decay) to reduce overfitting risk.
  • Replace the manual accuracy loop with vectorized comparisons for speed.
  • Add automated evaluation scripts (e.g., plotting utilities) so results can be reproduced without re-running the entire notebook in an interactive environment.
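The vectorized-accuracy idea in the second bullet amounts to replacing the per-sample loop with a single argmax comparison. A minimal sketch (array contents are made up for illustration):

```python
import numpy as np

# Compare the predicted class (argmax over network outputs) against the
# true class (argmax over one-hot labels) for the whole batch at once.
predictions = np.array([[0.1, 0.7, 0.2],
                        [0.6, 0.3, 0.1],
                        [0.2, 0.2, 0.6]])
one_hot_labels = np.array([[0, 1, 0],
                           [0, 0, 1],
                           [0, 0, 1]])

accuracy = np.mean(
    predictions.argmax(axis=1) == one_hot_labels.argmax(axis=1)
)
print(accuracy)  # 2 of 3 rows match -> 0.666...
```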

Feel free to open the notebook to see each experiment in context—the markdown cells narrate every stage of preprocessing, modeling, and analysis.
