This repository studies CNNs in two complementary ways:
- Analytical / educational view (NumPy, manual backprop) in src
- Computational / practical view (PyTorch checkpoint inference) in notebooks/predict_cifar10.ipynb
The idea is simple: understand the math deeply, then use an optimized stack for fast experimentation.
A convolutional network is a function-approximation pipeline that maps an image to a vector of class probabilities.
At a high level:
- Convolution learns local pattern detectors (edges, textures, shapes)
- Nonlinearity (ReLU) increases expressive power
- Pooling / downsampling trades spatial resolution for invariance
- Dense classifier maps learned features to logits
- Softmax + cross-entropy defines the training objective
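The first four stages above can be traced end to end in a few lines of NumPy. This is a minimal sketch for following tensor shapes, not the code in src: the function names (`conv2d`, `relu`, `max_pool2x2`), the filter count, and the random weights are all assumptions made for the example. As in most deep learning code, "convolution" here is implemented as cross-correlation.

```python
import numpy as np


def conv2d(x, w, b):
    """Valid cross-correlation. x: (C_in, H, W), w: (C_out, C_in, k, k), b: (C_out,)."""
    c_out, _, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]  # (C_in, k, k) window under the kernel
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2])) + b
    return out


def relu(x):
    return np.maximum(x, 0.0)


def max_pool2x2(x):
    """2x2 max pooling with stride 2; assumes even height and width."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


rng = np.random.default_rng(0)
image = rng.standard_normal((3, 32, 32))           # one CIFAR-10-sized input
kernels = rng.standard_normal((8, 3, 3, 3)) * 0.1  # 8 hypothetical 3x3 filters
bias = np.zeros(8)

features = max_pool2x2(relu(conv2d(image, kernels, bias)))  # (8, 15, 15)
flat = features.reshape(-1)                                  # (1800,)
w_fc = rng.standard_normal((10, flat.size)) * 0.01           # dense classifier weights
logits = w_fc @ flat                                         # (10,) class scores
print(features.shape, logits.shape)
```

The explicit double loop is slow but keeps the sliding-window structure visible, which is the point of the educational path.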
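For one layer, the core operation is:

$$
Y = \mathrm{ReLU}(W * X + b)
$$

And for classification:

$$
\hat{p} = \mathrm{softmax}(z), \qquad \mathcal{L} = -\sum_{k} y_k \log \hat{p}_k
$$

where $X$ is the layer input, $W$ and $b$ are the learned kernels and biases, $*$ denotes convolution, $z$ is the logit vector produced by the dense classifier, and $y$ is the one-hot label.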
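The classification objective can be sketched in NumPy as follows. This is an illustration under assumed names (`softmax`, `cross_entropy`) and a batch layout of logits with shape (N, K) and integer labels with shape (N,); the implementation in src may be organized differently. The gradient of the mean loss with respect to the logits is the softmax output minus the one-hot label, divided by the batch size.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean cross-entropy and its gradient w.r.t. logits (N, K), labels (N,)."""
    n = logits.shape[0]
    p = softmax(logits)
    loss = -np.log(p[np.arange(n), labels]).mean()
    grad = p.copy()
    grad[np.arange(n), labels] -= 1.0  # softmax output minus one-hot label
    return loss, grad / n


logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
loss, grad = cross_entropy(logits, labels)
print(loss, grad.shape)
```

Each row of the gradient sums to zero, which is a quick sanity check when debugging a manual backward pass.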
- src: CNN components built manually (Conv2D, pooling, flatten, dense, activations, loss, optimizer)
- data: CIFAR-10 data files
- model/cifar10_cnn.pt: trained PyTorch checkpoint used for prediction
- notebooks/predict_cifar10.ipynb: inference notebook (loads checkpoint, evaluates, visualizes)
- note/NOTE.md: detailed learning notes and progress log
- Explicit forward/backward logic in NumPy
- Useful for verifying gradient flow and tensor shapes
- Best for learning internals, not for high-speed training
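One common way to verify a hand-written backward pass is a finite-difference gradient check. The sketch below is an assumption about how such a check could look, not the repository's own test code: `numerical_grad` is a hypothetical helper, and the layer under test is a plain dense layer with a squared-norm loss chosen because its analytic gradient is easy to state.

```python
import numpy as np


def numerical_grad(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of scalar f at array x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + eps
        f_plus = f(x)
        x[idx] = old - eps
        f_minus = f(x)
        x[idx] = old  # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
x = rng.standard_normal(4)
W = rng.standard_normal((3, 4))
b = rng.standard_normal(3)


def loss_fn(weights):
    y = weights @ x + b
    return 0.5 * np.sum(y ** 2)


y = W @ x + b
dW_analytic = np.outer(y, x)  # dL/dW = (dL/dy) x^T, with dL/dy = y for this loss
dW_numeric = numerical_grad(loss_fn, W)
rel_err = np.abs(dW_analytic - dW_numeric).max() / np.abs(dW_analytic).max()
print(dW_analytic.shape, rel_err)
```

The same helper can be pointed at a convolution or pooling layer; a large relative error usually indicates an indexing or shape mistake in the backward pass.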
- GPU-accelerated training/inference
- Practical for reaching stronger CIFAR-10 results quickly
- Checkpoint currently used by the notebook: model/cifar10_cnn.pt

Note: PyTorch checkpoints (model_state, optimizer_state, etc.) are not in the same format as the NumPy scratch checkpoints.
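A checkpoint of this kind is typically a dictionary of state dicts. The sketch below assumes the `model_state` and `optimizer_state` keys named in the note; everything else is hypothetical. In particular, `build_model` is a stand-in architecture (the real network behind cifar10_cnn.pt is not described here), and the file is written to a temporary directory rather than to the repository's model folder.

```python
import os
import tempfile

import torch
from torch import nn


def build_model() -> nn.Module:
    # Hypothetical stand-in; not the architecture stored in cifar10_cnn.pt.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, 10),
    )


model = build_model()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Write a demo checkpoint to a temporary file (not the repository's real checkpoint).
path = os.path.join(tempfile.mkdtemp(), "demo_checkpoint.pt")
torch.save(
    {"model_state": model.state_dict(), "optimizer_state": optimizer.state_dict()},
    path,
)

# Load on CPU so the sketch also runs without a GPU, then restore the weights.
checkpoint = torch.load(path, map_location="cpu", weights_only=True)
restored = build_model()
restored.load_state_dict(checkpoint["model_state"])
restored.eval()  # inference mode: affects layers such as dropout and batch norm

print(sorted(checkpoint.keys()))
```

Loading requires constructing the same architecture first, because a state dict stores tensors by parameter name and not the model code itself.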
- CUDA-enabled environment validated
- Best reported test accuracy during training: 91.58%
- Open notebooks/predict_cifar10.ipynb
- Run cells from top to bottom
- The notebook will:
  - load model/cifar10_cnn.pt
  - evaluate accuracy
  - show sample predictions with ground-truth labels
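The evaluation step can be sketched as a top-1 accuracy loop. This is an assumption about what such a step might look like, not the notebook's actual code: `evaluate`, `demo_model`, and `demo_batches` are hypothetical names, and random tensors stand in for the CIFAR-10 test set.

```python
import torch
from torch import nn


@torch.no_grad()
def evaluate(model: nn.Module, batches, device: str = "cpu") -> float:
    """Return top-1 accuracy over an iterable of (images, labels) batches."""
    model.to(device).eval()
    correct, total = 0, 0
    for images, labels in batches:
        logits = model(images.to(device))
        preds = logits.argmax(dim=1).cpu()  # predicted class per image
        correct += (preds == labels.cpu()).sum().item()
        total += labels.numel()
    return correct / total


# Stand-in model and random CIFAR-10-shaped batches, for illustration only.
torch.manual_seed(0)
demo_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
demo_batches = [
    (torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))) for _ in range(4)
]
accuracy = evaluate(demo_model, demo_batches)
print(f"accuracy: {accuracy:.2%}")
```

With a real checkpoint, `batches` would be a `torch.utils.data.DataLoader` over the test split and `device` would be "cuda" when a GPU is available.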
- Python 3.12
- PyTorch (CUDA build)
- torchvision
- numpy
- matplotlib
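A quick way to confirm that these requirements are installed is to query the package metadata. The snippet below is a convenience sketch, not part of the repository; `report_environment` is a hypothetical helper, and the CUDA check only runs if PyTorch can be imported.

```python
import sys
from importlib import metadata

REQUIRED = ["torch", "torchvision", "numpy", "matplotlib"]


def report_environment():
    """Map each required package to its installed version, or None if missing."""
    info = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in REQUIRED:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None
    return info


env = report_environment()
for name, version in env.items():
    print(f"{name}: {version or 'not installed'}")

try:
    import torch

    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("CUDA available: unknown (torch is not installed)")
```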
Build intuition for CNN mechanics from first principles, then bridge that understanding to practical, high-performance inference workflows.
This project is licensed under CC BY-SA 4.0.
