
AI-Spring2022-CA5-p1-Feed-Forward-NN-from-Scratch

[Feed Forward Neural Networks] - Artificial Intelligence Course - University of Tehran - Dr. Yaghoubzade

In the first part of this project, we implement a Feed Forward Neural Network from scratch using the NumPy library. In the second part, building on the code from the first part, we train several neural networks and examine how different hyperparameters affect the learning process.

Dataset and Preprocessing

  • Inputs: pickled arrays stored under dataset/data.pkl (RGB images) and dataset/labels.pkl (class ids). The notebook expects both files to be present relative to its directory.
  • Split: train_test_split with an 80/20 split (random_state=40).
  • Resizing: every image is resized to 25x25 using OpenCV to keep the fully connected network compact.
  • Normalization: flattened pixel values are scaled from [0, 255] to [0, 1] before being wrapped into DataFrames for compatibility with the custom dataloader.
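The normalization and split steps above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the random stand-in array replaces the real images (which the notebook loads from dataset/data.pkl and resizes with OpenCV), and the variable names are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the unpickled dataset: 10 flattened 25x25 images with raw
# [0, 255] pixel values. In the notebook the images come from
# dataset/data.pkl and are resized to 25x25 with OpenCV first.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(10, 25 * 25)).astype(float)
labels = np.arange(10)

# Normalize pixel values from [0, 255] to [0, 1].
X = raw / 255.0

# 80/20 split with the notebook's fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=40
)
print(X_train.shape)  # (8, 625)
```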

Implementation Highlights

The notebook implements the full training stack instead of relying on deep-learning frameworks:

  1. Dataloader – converts labels to one-hot vectors (via sklearn.preprocessing.OneHotEncoder), supports optional shuffling, and yields NumPy matrices in configurable batch sizes.
  2. Activation functions – Identical, ReLU, Leaky ReLU, Sigmoid, Softmax, and Tanh are coded with both value and derivative methods so they can plug directly into backpropagation.
  3. CrossEntropy loss – computes softmax cross-entropy and its gradient to be used during training and evaluation.
  4. Layer – handles affine transformations, activation calls, caches, and weight/bias updates with either uniform or normal initialization.
  5. FeedForwardNN – manages the ordered layer stack, orchestrates forward passes, training loops, accuracy calculations, and the backward sweep across layers.
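To make the CrossEntropy item concrete, here is a minimal sketch of softmax cross-entropy with its gradient, using the standard identity d(loss)/d(logits) = softmax(logits) − one_hot(target). The function names and shapes are assumptions for illustration, not the notebook's exact API.

```python
import numpy as np

def softmax(z):
    # Shift by the row max for numerical stability before exponentiating.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, one_hot):
    # Mean negative log-likelihood over the batch, plus the gradient
    # with respect to the logits (softmax probabilities minus targets).
    p = softmax(logits)
    loss = -np.mean(np.sum(one_hot * np.log(p + 1e-12), axis=1))
    grad = (p - one_hot) / logits.shape[0]
    return loss, grad

logits = np.array([[2.0, 1.0, 0.1]])
target = np.array([[1.0, 0.0, 0.0]])
loss, grad = cross_entropy(logits, target)
```

Because each softmax row and each one-hot row both sum to 1, every row of the gradient sums to zero, which is a quick sanity check for a hand-rolled implementation.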

Training Workflow

  1. Initialize the train and test loaders:
    INPUT_SHAPE = 25 * 25
    TRAINLOADER = []
    for batch in Dataloader(X_train, y_train, 10, batch_size=32):
        TRAINLOADER.append(batch)
    # TESTLOADER is built the same way from X_test and y_test.
  2. Build the model:
    network = FeedForwardNN(INPUT_SHAPE)
    network.add_layer(n_neurons=20, activation=Relu())
    network.add_layer(10, activation=Identical())
    network.set_training_param(loss=CrossEntropy(), lr=0.001)
  3. Fit and log metrics:
    log = network.fit(epochs=15, trainloader=TRAINLOADER, testloader=TESTLOADER)

The log dictionary captures train/test accuracy and loss per epoch, enabling comparisons between experiments.
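Given such a log, the per-epoch curves can be plotted with a few lines of matplotlib. The key names below are assumptions about the log's shape (the notebook may use different ones), and the values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe outside Jupyter
import matplotlib.pyplot as plt

# Hypothetical log in the shape fit() is described to return:
# per-epoch train/test accuracies (key names are assumptions).
log = {
    "train_accuracy": [0.31, 0.45, 0.52],
    "test_accuracy":  [0.30, 0.42, 0.48],
}

epochs = range(1, len(log["train_accuracy"]) + 1)
plt.plot(epochs, log["train_accuracy"], label="train")
plt.plot(epochs, log["test_accuracy"], label="test")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("accuracy.png")
```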

Hyperparameter Experiments

  • Weight initialization – the notebook explains why zero initialization collapses learning (symmetry and zero gradients) and instead opts for uniform/normal sampling.
  • Learning-rate sweep – tests [0.0005, 0.0055, 0.0105, 0.0155, 0.0205] for five epochs each. The best compromise for this dataset was 0.0005; pushing it as high as 15 quickly destabilizes accuracy despite fast loss updates (see 2022-06-11-21-44-04.png for the plotted comparison).
  • Activation swaps – repeats training with Sigmoid, Tanh, and Leaky ReLU hidden layers. The discussion covers saturation/vanishing gradients for Sigmoid/Tanh and the stability benefits of Leaky ReLU.
  • Batch-size comparison – contrasts mini-batch sizes 16 and 256, showing the trade-off between noisy gradients (small batch) and sluggish, memory-heavy updates (large batch). The resulting accuracies dictionary stores train/test values for each size.
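The zero-initialization point in the first bullet can be demonstrated in a few lines of NumPy. This is an illustrative sketch, not the notebook's code: with all-zero weights every hidden neuron computes the same (zero) pre-activation, so under ReLU every activation and every gradient is identical across neurons and the symmetry never breaks, whereas uniform sampling gives each neuron a distinct starting point.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))  # a small batch: 4 samples, 5 features

# Zero initialization: every hidden neuron produces identical output,
# so every neuron would also receive an identical gradient.
W_zero = np.zeros((5, 3))
h_zero = np.maximum(0.0, x @ W_zero)  # ReLU activations: all exactly zero

# Uniform initialization (the notebook's fix): neurons start distinct.
W_uniform = rng.uniform(-0.1, 0.1, size=(5, 3))
h_uniform = np.maximum(0.0, x @ W_uniform)
```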

Reproducing the Notebook

  1. Install dependencies
    python3 -m venv .venv
    source .venv/bin/activate
    pip install numpy pandas matplotlib scikit-learn opencv-python
  2. Place the dataset under dataset/ so that data.pkl and labels.pkl are available.
  3. Launch Jupyter
    pip install notebook
    jupyter notebook AI-CA5-P1-TODO.ipynb
  4. Run all cells sequentially to regenerate figures, metrics, and the final experiments.

Repository Layout

  • AI-CA5-P1-TODO.ipynb – the full narrative, code, and plots.
  • dataset/ – expected location for the pickled data files (not tracked in version control).
  • 2022-06-11-21-44-04.png – captured learning-rate comparison figure referenced in the notebook.

Next Steps

  • Extend FeedForwardNN with regularization (dropout, weight decay) to reduce overfitting risk.
  • Replace the manual accuracy loop with vectorized comparisons for speed.
  • Add automated evaluation scripts (e.g., plotting utilities) so results can be reproduced without re-running the entire notebook in an interactive environment.
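The vectorized-accuracy idea in the second bullet amounts to replacing the per-sample loop with a single argmax comparison. A minimal sketch (array contents are made up for illustration):

```python
import numpy as np

# Compare the predicted class (argmax over network outputs) against the
# true class (argmax over one-hot labels) for the whole batch at once.
predictions = np.array([[0.1, 0.7, 0.2],
                        [0.6, 0.3, 0.1],
                        [0.2, 0.2, 0.6]])
one_hot_labels = np.array([[0, 1, 0],
                           [0, 0, 1],
                           [0, 0, 1]])

accuracy = np.mean(
    predictions.argmax(axis=1) == one_hot_labels.argmax(axis=1)
)
print(accuracy)  # 2 of 3 rows match -> 0.666...
```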

Feel free to open the notebook to see each experiment in context—the markdown cells narrate every stage of preprocessing, modeling, and analysis.
