Mini Inception Network — Custom CNN Architecture in TensorFlow/Keras
This project implements a compact version of the Inception architecture in TensorFlow/Keras for educational purposes. The goal is to:
- Understand and implement parallel convolutional branches.
- Learn how to design multi-path CNN modules.
- Train a miniaturized Inception-style model on CIFAR-10.
The notebook builds a Mini Inception Network from scratch:
- Inception Module:
  - Parallel paths with:
    - 1×1 convolutions
    - 3×3 convolutions
    - 5×5 convolutions
    - 3×3 max pooling followed by a 1×1 convolution
  - Concatenation of feature maps along the channel axis.
- Model Architecture:
  - Input layer for 32×32×3 images.
  - One or more Mini Inception modules.
  - Global Average Pooling → Dense softmax output.
- Training:
  - Dataset: CIFAR-10 (via `tf.keras.datasets`).
  - Normalization by dividing pixel values by 255.0.
  - One-hot encoded labels with `tf.keras.utils.to_categorical`.
  - Loss: `categorical_crossentropy`
  - Optimizer: `adam`
- Evaluation:
  - Model compiled and trained for a limited number of epochs for demonstration.
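The module and model described above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the helper name `inception_module` and the per-branch filter counts are assumptions chosen for demonstration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def inception_module(x, f1, f3, f5, fpool):
    """Four parallel branches concatenated along the channel axis."""
    b1 = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)      # 1x1 conv
    b3 = layers.Conv2D(f3, 3, padding="same", activation="relu")(x)      # 3x3 conv
    b5 = layers.Conv2D(f5, 5, padding="same", activation="relu")(x)      # 5x5 conv
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)            # 3x3 max pool...
    bp = layers.Conv2D(fpool, 1, padding="same", activation="relu")(bp)  # ...then 1x1 conv
    return layers.Concatenate(axis=-1)([b1, b3, b5, bp])

# Assemble the Mini Inception model: input -> modules -> GAP -> softmax.
inputs = layers.Input(shape=(32, 32, 3))
x = inception_module(inputs, 32, 64, 16, 16)   # output channels: 32+64+16+16 = 128
x = inception_module(x, 64, 96, 32, 32)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Because every branch uses `padding="same"` (and the pooling branch uses `strides=1`), all four outputs keep the input's spatial size, which is what makes the channel-axis concatenation valid.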
Technologies Used
- Python
- TensorFlow / Keras
- NumPy
- Matplotlib (for plotting training history)
CIFAR-10:
- 60,000 images (32×32 RGB), 10 classes.
- Training set: 50,000 images.
- Test set: 10,000 images.
- Classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.

Preprocessing in code:
- Normalization to [0, 1].
- One-hot encoding of labels.
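The preprocessing steps above amount to a few lines of Keras. A minimal sketch (note that `tf.keras.datasets.cifar10` downloads the dataset on first use):

```python
import tensorflow as tf

# Load CIFAR-10: 50,000 training and 10,000 test images of shape 32x32x3.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Scale pixel values from [0, 255] down to [0, 1].
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# One-hot encode the integer labels into 10-dimensional vectors.
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
```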
Prerequisites
- Python 3.x
- TensorFlow
Install (pip)
pip install tensorflow numpy matplotlib

Run
- Open `Mini Inception Network.ipynb` in Jupyter/VS Code.
- Run all cells sequentially.
The notebook will:
- Load and preprocess CIFAR-10.
- Build the Mini Inception model.
- Train and evaluate the model.
- Optionally plot training/validation accuracy and loss.
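The limited training setup can be sketched as below. To keep the example fast and self-contained it uses a tiny stand-in model and random data; the notebook trains the Mini Inception model on the real CIFAR-10 arrays instead, and the epoch/batch values here are illustrative only.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Tiny stand-in model with the same compile settings as the project.
model = tf.keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in for the preprocessed CIFAR-10 arrays.
x = np.random.rand(64, 32, 32, 3).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 10, 64), 10)

# Train for a small number of epochs, as the notebook does for demonstration.
history = model.fit(x, y, epochs=2, batch_size=32, verbose=0)
```

The returned `history.history` dictionary holds per-epoch `loss` and `accuracy` lists, which is what the optional Matplotlib plots draw from.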
- Performance metrics: not provided. This notebook focuses on demonstrating architecture design and implementation rather than achieving state-of-the-art accuracy.
- Sample outputs: not included in the repository. Running the notebook will generate a model summary and training history plots.
- This project reinforces the concept of parallel convolutional branches from Google’s Inception networks.
- Demonstrates how to implement modular blocks that can be reused in deeper architectures.
- Shows how even simplified versions of complex networks can be valuable for learning and prototyping.
- The limited training setup is intentional to prioritize architecture demonstration over exhaustive training.
Author: Mehran Asgari
Email: [email protected]
GitHub: https://github.com/imehranasgari
This project is licensed under the Apache 2.0 License – see the LICENSE file for details.
💡 Some interactive outputs (e.g., plots, widgets) may not display correctly on GitHub. If so, please view this notebook via nbviewer.org for full rendering.