This Flask application is an interactive visualization tool for understanding the key differences between Convolutional Neural Networks (CNNs) and fully connected (FC) layers, using pre-trained models to produce realistic, meaningful results.
- Side-by-side comparison of CNN vs FC layer processing
- Interactive image upload to analyze both architectures
- Detailed visualizations of each layer's output
- Educational explanations of key differences
- Pre-trained models for realistic feature detection
- Comprehensive CNN visualization, including:
  - Convolution layers with meaningful feature maps
  - ReLU activation functions
  - Max pooling operations
  - Fully connected layers
  - Final predictions with actual digit recognition
- Spatial Structure Preservation: Maintains 2D spatial relationships
- Local Feature Detection: Detects edges, textures, and patterns in specific regions
- Weight Sharing: Uses fewer parameters through filter reuse
- Translation Invariance: Robust to input translations
- Feature Maps: Visual representation of detected features
- Spatial Information Loss: Flattens 2D images into 1D vectors
- Global Feature Combination: Connects every input to every neuron
- More Parameters: No weight sharing, requires more parameters
- Position Sensitivity: Sensitive to input position changes
- Neuron Activations: Bar charts showing activation values
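The spatial-information loss described above comes from the flattening step. A minimal illustration (not taken from `app.py`) of how a 2D image becomes a 1D vector, discarding which pixels were neighbors:

```python
import torch

# A 28x28 "image" flattened into a 784-element vector:
# after flattening, adjacent rows are no longer adjacent in memory,
# so the FC network cannot exploit 2D neighborhood structure.
image = torch.arange(28 * 28, dtype=torch.float32).reshape(1, 28, 28)
flat = image.flatten(start_dim=1)
print(image.shape)  # torch.Size([1, 28, 28])
print(flat.shape)   # torch.Size([1, 784])
```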
1. Install dependencies:

   ```bash
   pip install flask torch torchvision pillow matplotlib numpy
   ```

2. Run the application:

   ```bash
   python app.py
   ```

   Note: On first run, the application will:
   - Download the MNIST dataset (~11MB)
   - Train both the CNN and FC models (takes 2-3 minutes)
   - Save the trained models for future use

3. Open a browser and navigate to `http://localhost:5000`
CNN: Input (28x28) → Conv1 (8 filters) → ReLU → Pool → Conv2 (16 filters) → ReLU → Pool → FC1 (128) → FC2 (10)

FC: Input (28x28) → Flatten (784) → FC1 (512) → ReLU → FC2 (256) → ReLU → FC3 (128) → FC4 (10)
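The two architectures above can be sketched in PyTorch roughly as follows. This is an illustrative reconstruction, not the exact code in `app.py`; kernel size and padding are assumptions chosen so the layer shapes match the diagram:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)   # 8 filters
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)  # 16 filters
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(16 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 28x28 -> 14x14
        x = self.pool(torch.relu(self.conv2(x)))  # 14x14 -> 7x7
        x = x.flatten(1)
        return self.fc2(torch.relu(self.fc1(x)))

class SimpleFC(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                       # 28x28 -> 784
            nn.Linear(28 * 28, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.net(x)

out = SimpleCNN()(torch.zeros(1, 1, 28, 28))
print(out.shape)  # torch.Size([1, 10])
```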
- Feature Maps: Show detected patterns at each convolution layer
- Activation Maps: Display ReLU activations
- Pooled Features: Reduced spatial dimensions after pooling
- Neuron Activations: Bar charts for fully connected layers
- Flattened Input: 1D representation of the image
- Layer Activations: Bar charts showing neuron activations
- ReLU Effects: Comparison of pre and post-activation values
- Final Predictions: Probability distribution across classes
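The final-predictions chart is a probability distribution over the ten digit classes. A small sketch (with made-up logits) of how raw network outputs become such a distribution via softmax:

```python
import torch

# Hypothetical 10-class logits; the largest value is at index 7.
logits = torch.tensor([[1.2, 0.3, 2.5, -1.0, 0.0, 0.7, -0.5, 3.1, 0.2, 0.4]])
probs = torch.softmax(logits, dim=1)  # non-negative, sums to 1
pred = probs.argmax(dim=1)
print(pred.item())          # index of the most likely digit
print(probs.sum().item())   # probabilities sum to 1
```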
1. Spatial Information:
   - CNN preserves spatial relationships through feature maps
   - FC loses spatial structure by flattening
2. Feature Detection:
   - CNN detects local patterns (edges, textures)
   - FC combines all input information globally
3. Parameter Efficiency:
   - CNN uses weight sharing (fewer parameters)
   - FC requires more parameters (no sharing)
4. Translation Invariance:
   - CNN is robust to input translations
   - FC is sensitive to position changes
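The parameter-efficiency point is easy to verify directly. A quick comparison (illustrative layer sizes, not the app's exact models) of one convolution layer against one fully connected layer:

```python
import torch.nn as nn

conv = nn.Conv2d(1, 8, kernel_size=3)  # 8 small filters shared across all positions
fc = nn.Linear(28 * 28, 512)           # every pixel connects to every neuron

conv_params = sum(p.numel() for p in conv.parameters())
fc_params = sum(p.numel() for p in fc.parameters())
print(conv_params)  # 8 * (3*3*1) weights + 8 biases = 80
print(fc_params)    # 784 * 512 weights + 512 biases = 401920
```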
```
ERAv4S2/
├── app.py                 # Main Flask application
├── models/                # Pre-trained model weights
│   ├── cnn_model.pth      # Trained CNN model
│   └── fc_model.pth       # Trained FC model
├── data/                  # MNIST dataset (auto-downloaded)
├── templates/
│   ├── index.html         # Home page with navigation
│   ├── compare.html       # CNN vs FC comparison interface
│   └── digit.html         # CNN step-by-step analysis
├── static/
│   ├── uploads/           # User-uploaded images
│   └── results/           # Generated visualizations
└── README.md              # This file
```
After using this tool, you should understand:
- How CNNs preserve spatial information through convolution operations
- How FC layers flatten and combine features globally
- The visual differences between feature maps and neuron activations
- Why CNNs are better suited for image processing tasks
- The trade-offs between parameter efficiency and spatial awareness
- Framework: Flask (Python web framework)
- Deep Learning: PyTorch
- Image Processing: PIL (Python Imaging Library)
- Visualization: Matplotlib
- Dataset: MNIST (handwritten digits)
- Training: Adam optimizer, CrossEntropyLoss
- Frontend: HTML/CSS with responsive design
- Upload clear digit images for best results
- Compare different digits to see how patterns change
- Focus on the feature maps in CNN layers to understand pattern detection
- Observe the flattening process in FC layers
- Compare parameter counts between the two architectures
- Notice the meaningful predictions from pre-trained models
- Realistic Results: Models trained on MNIST dataset
- Meaningful Feature Maps: Show actual edge and texture detection
- Accurate Predictions: Can recognize handwritten digits
- Faster Loading: Models cached after first training
- Dataset: MNIST (60,000 training images)
- Epochs: 3 epochs for quick training
- Batch Size: 64 images per batch
- Optimizer: Adam with learning rate 0.001
- Loss Function: CrossEntropyLoss
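The training configuration above corresponds to a loop roughly like the following. This is a self-contained sketch: a placeholder model and random tensors stand in for the real networks and the MNIST data loader, but the optimizer, learning rate, loss function, batch size, and epoch count match the settings listed:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)   # Adam, lr=0.001
criterion = nn.CrossEntropyLoss()

for epoch in range(3):                         # 3 epochs
    images = torch.randn(64, 1, 28, 28)        # one stand-in batch of 64 images
    labels = torch.randint(0, 10, (64,))
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```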
- First Run: Models are trained and saved to the `models/` directory
- Subsequent Runs: Pre-trained models are loaded instantly
- Model Files: `cnn_model.pth` (~50KB), `fc_model.pth` (~2MB)
Note: This tool now uses pre-trained models trained on the MNIST dataset, providing realistic and meaningful visualizations that demonstrate actual feature detection and digit recognition capabilities.