# Feed Forward Neural Networks – Artificial Intelligence Course, University of Tehran, Dr. Yaghoubzade

In the first part of this project, we implement a feed-forward neural network from scratch using the NumPy library. In the second part, building on the code from the first part, we train several neural networks and examine the effect of various parameters on the learning process.
- Inputs: pickled arrays stored under `dataset/data.pkl` (RGB images) and `dataset/labels.pkl` (class ids). The notebook expects both files to be present relative to its directory.
- Split: `train_test_split` with an 80/20 split (`random_state=40`).
- Resizing: every image is resized to `25x25` using OpenCV to keep the fully connected network compact.
- Normalization: flattened pixel values are scaled from `[0, 255]` to `[0, 1]` before being wrapped into DataFrames for compatibility with the custom dataloader.
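The resize-and-flatten-and-scale step can be sketched in plain NumPy. This is a minimal illustration, not the notebook's code: the notebook uses `cv2.resize`, which is replaced here by a nearest-neighbour index trick so the sketch stays dependency-free, and single-channel images are used so the flattened size matches `INPUT_SHAPE = 25 * 25`.

```python
import numpy as np

def resize_nearest(img, size=(25, 25)):
    # Nearest-neighbour stand-in for the notebook's cv2.resize call
    h, w = img.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols]

def preprocess(images):
    # Resize to 25x25, flatten, and scale pixel values from [0, 255] to [0, 1]
    return np.stack([resize_nearest(im).astype(np.float32).ravel() / 255.0
                     for im in images])

# Four fake images of arbitrary input size
batch = [np.random.randint(0, 256, (64, 48), dtype=np.uint8) for _ in range(4)]
X = preprocess(batch)
print(X.shape)  # (4, 625) -- 25 * 25 features per image
```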
The notebook implements the full training stack instead of relying on deep-learning frameworks:
- `Dataloader` – converts labels to one-hot vectors (via `sklearn.preprocessing.OneHotEncoder`), supports optional shuffling, and yields NumPy matrices in configurable batch sizes.
- Activation functions – `Identical`, ReLU, Leaky ReLU, Sigmoid, Softmax, and Tanh are coded with both value and derivative methods so they can plug directly into backpropagation.
- `CrossEntropy` loss – computes softmax cross-entropy and its gradient, used during training and evaluation.
- `Layer` – handles affine transformations, activation calls, caches, and weight/bias updates with either uniform or normal initialization.
- `FeedForwardNN` – manages the ordered layer stack and orchestrates forward passes, training loops, accuracy calculations, and the backward sweep across layers.
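As an illustration of the value-plus-derivative pattern these classes follow, here is a minimal sketch of a ReLU activation and a softmax cross-entropy loss. Method names and signatures are assumptions for illustration, not the notebook's exact API:

```python
import numpy as np

class Relu:
    def value(self, x):
        return np.maximum(0.0, x)

    def derivative(self, x):
        # Subgradient: 1 where x > 0, else 0
        return (x > 0).astype(x.dtype)

class CrossEntropy:
    def value(self, logits, y_onehot):
        # Numerically stable softmax cross-entropy, averaged over the batch
        z = logits - logits.max(axis=1, keepdims=True)
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -(y_onehot * log_probs).sum(axis=1).mean()

    def derivative(self, logits, y_onehot):
        # Gradient w.r.t. the logits: softmax(logits) - y, scaled by batch size
        z = logits - logits.max(axis=1, keepdims=True)
        probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
        return (probs - y_onehot) / logits.shape[0]

act = Relu()
print(act.value(np.array([-1.0, 2.0])))  # [0. 2.]
loss = CrossEntropy()
logits = np.array([[2.0, 0.0], [0.0, 3.0]])
y = np.eye(2)
print(loss.value(logits, y))
```

Pairing each activation and loss with its own derivative is what lets backpropagation walk the layer stack generically, without hard-coding gradients per architecture.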
- Initialize the train and test loaders:
```python
TRAINLOADER = []  # filled batch by batch below
INPUT_SHAPE = 25 * 25

for batch in Dataloader(X_train, y_train, 10, batch_size=32):
    TRAINLOADER.append(batch)
```
- Build the model:
```python
network = FeedForwardNN(INPUT_SHAPE)
network.add_layer(n_neurons=20, activation=Relu())
network.add_layer(10, activation=Identical())
network.set_training_param(loss=CrossEntropy(), lr=0.001)
```
- Fit and log metrics:
```python
log = network.fit(epochs=15, trainloader=TRAINLOADER, testloader=TESTLOADER)
```
The log dictionary captures train/test accuracy and loss per epoch, enabling comparisons between experiments.
- Weight initialization – the notebook explains why zero initialization collapses learning (symmetry and zero gradients) and instead opts for uniform/normal sampling.
- Learning-rate sweep – tests `[0.0005, 0.0055, 0.0105, 0.0155, 0.0205]` for five epochs each. The best compromise for this dataset was `0.0005`; pushing it as high as `15` quickly destabilizes accuracy despite fast loss updates (see `2022-06-11-21-44-04.png` for the plotted comparison).
- Activation swaps – repeats training with Sigmoid, Tanh, and Leaky ReLU hidden layers. The discussion covers saturation/vanishing gradients for Sigmoid/Tanh and the stability benefits of Leaky ReLU.
- Batch-size comparison – contrasts mini-batch sizes `16` and `256`, showing the trade-off between noisy gradients (small batch) and sluggish, memory-heavy updates (large batch). The resulting `accuracies` dictionary stores train/test values for each size.
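The symmetry argument against zero initialization mentioned above can be demonstrated in a few lines of NumPy: with all-zero weights, a ReLU hidden layer outputs all zeros, so both weight gradients vanish and learning never starts. A minimal sketch (shapes and names chosen for illustration, not taken from the notebook):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))               # batch of 8 inputs, 5 features
y = np.eye(3)[rng.integers(0, 3, 8)]      # one-hot targets, 3 classes

W1 = np.zeros((5, 4)); b1 = np.zeros(4)   # zero-initialized hidden layer
W2 = np.zeros((4, 3)); b2 = np.zeros(3)   # zero-initialized output layer

# Forward pass: ReLU hidden layer, softmax output
h = np.maximum(0.0, X @ W1 + b1)          # all zeros -> perfect symmetry
logits = h @ W2 + b2
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Backward pass (softmax cross-entropy gradient)
dlogits = (probs - y) / len(X)
dW2 = h.T @ dlogits                       # zero, because h is zero
dh = dlogits @ W2.T                       # zero, because W2 is zero
dW1 = X.T @ (dh * (h > 0))                # zero -> no learning signal at all

print(np.abs(dW1).max(), np.abs(dW2).max())  # 0.0 0.0
```

Sampling the initial weights from a uniform or normal distribution breaks this symmetry, which is exactly why the notebook opts for those schemes.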
- Install dependencies:

  ```bash
  python3 -m venv .venv
  source .venv/bin/activate
  pip install numpy pandas matplotlib scikit-learn opencv-python
  ```

- Place the dataset under `dataset/` so that `data.pkl` and `labels.pkl` are available.
- Launch Jupyter:

  ```bash
  pip install notebook
  jupyter notebook AI-CA5-P1-TODO.ipynb
  ```
- Run all cells sequentially to regenerate figures, metrics, and the final experiments.
- `AI-CA5-P1-TODO.ipynb` – the full narrative, code, and plots.
- `dataset/` – expected location for the pickled data files (not tracked in version control).
- `2022-06-11-21-44-04.png` – captured learning-rate comparison figure referenced in the notebook.
- Extend `FeedForwardNN` with regularization (dropout, weight decay) to reduce overfitting risk.
- Replace the manual accuracy loop with vectorized comparisons for speed.
- Add automated evaluation scripts (e.g., plotting utilities) so results can be reproduced without re-running the entire notebook in an interactive environment.
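The vectorized accuracy suggested above is essentially a one-liner in NumPy. A sketch, assuming `logits` are raw network outputs and `y_onehot` the one-hot labels (function name is ours, not the notebook's):

```python
import numpy as np

def accuracy(logits, y_onehot):
    # Compare predicted class indices against one-hot labels in one vectorized
    # operation instead of looping over individual samples
    return float(np.mean(np.argmax(logits, axis=1) == np.argmax(y_onehot, axis=1)))

logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.9, 0.2, 0.4],   # misclassified: predicts class 0, label is 2
                   [0.1, 0.3, 2.2]])
y = np.eye(3)[[0, 1, 2, 2]]
print(accuracy(logits, y))  # 0.75
```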
Feel free to open the notebook to see each experiment in context; the markdown cells narrate every stage of preprocessing, modeling, and analysis.