
CNN vs Fully Connected Layer Comparison Tool

This Flask application is an interactive visualization tool for understanding the key differences between Convolutional Neural Networks (CNNs) and Fully Connected (FC) layers. It uses pre-trained models so the visualizations reflect realistic, meaningful feature detection.

🎯 Features

1. Layer Comparison Demo (/compare)

  • Side-by-side comparison of CNN vs FC layer processing
  • Interactive image upload to analyze both architectures
  • Detailed visualizations of each layer's output
  • Educational explanations of key differences
  • Pre-trained models for realistic feature detection

2. CNN Step-by-Step Analysis (/digit)

  • Comprehensive CNN visualization including:
    • Convolution layers with meaningful feature maps
    • ReLU activation functions
    • Max pooling operations
    • Fully connected layers
    • Final predictions with actual digit recognition

🔬 Key Differences Demonstrated

Convolutional Neural Networks (CNN)

  • Spatial Structure Preservation: Maintains 2D spatial relationships
  • Local Feature Detection: Detects edges, textures, and patterns in specific regions
  • Weight Sharing: Uses fewer parameters through filter reuse
  • Translation Invariance: Pooling makes features robust to small shifts of the input
  • Feature Maps: Visual representation of detected features
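
The feature maps listed above can be captured directly from a convolution layer with a forward hook. This is a minimal sketch, not the app's actual code; the single `Conv2d` layer stands in for the app's first convolution:

```python
import torch
import torch.nn as nn

# A stand-in for the app's first convolution: 8 filters over a 1-channel input.
# Kernel size and padding are assumptions (3x3, padding 1).
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)

# A forward hook records the layer's output each time the layer runs.
feature_maps = {}
conv.register_forward_hook(lambda module, inp, out: feature_maps.update(conv1=out))

image = torch.randn(1, 1, 28, 28)  # one MNIST-sized input
_ = conv(image)
print(feature_maps["conv1"].shape)  # one 28x28 map per filter: (1, 8, 28, 28)
```

Each of the 8 channels in the hooked output is one feature map, which is what the tool renders as an image grid.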

Fully Connected Networks (FC)

  • Spatial Information Loss: Flattens 2D images into 1D vectors
  • Global Feature Combination: Connects every input to every neuron
  • More Parameters: Without weight sharing, every connection has its own weight
  • Position Sensitivity: Sensitive to input position changes
  • Neuron Activations: Bar charts showing activation values

🚀 How to Run

  1. Install Dependencies:

    pip install flask torch torchvision pillow matplotlib numpy
  2. Run the Application:

    python app.py

    Note: On first run, the application will:

    • Download the MNIST dataset (~11MB)
    • Train both CNN and FC models (takes 2-3 minutes)
    • Save the trained models for future use
  3. Open Browser: Navigate to http://localhost:5000

📊 Model Architectures

CNN Model

Input (28x28) → Conv1 (8 filters) → ReLU → Pool → Conv2 (16 filters) → ReLU → Pool → FC1 (128) → FC2 (10)
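
The diagram above can be sketched in PyTorch as follows. Kernel sizes and padding are not stated in this README; 3x3 kernels with padding 1 are assumed, which gives 16 7x7 feature maps after two 2x2 poolings:

```python
import torch
import torch.nn as nn

class CNNModel(nn.Module):
    """Sketch of the CNN diagrammed above (kernel size/padding are assumptions)."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)   # 8 filters
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)  # 16 filters
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(16 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 28x28 -> 14x14
        x = self.pool(torch.relu(self.conv2(x)))  # 14x14 -> 7x7
        x = x.flatten(1)                          # (N, 16*7*7)
        x = torch.relu(self.fc1(x))
        return self.fc2(x)                        # 10 class logits

logits = CNNModel()(torch.randn(1, 1, 28, 28))
print(logits.shape)  # torch.Size([1, 10])
```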

FC Model

Input (28x28) → Flatten (784) → FC1 (512) → ReLU → FC2 (256) → ReLU → FC3 (128) → FC4 (10)
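
The FC network maps to PyTorch just as directly. The activation placement below follows the diagram literally (ReLU after FC1 and FC2 only):

```python
import torch
import torch.nn as nn

class FCModel(nn.Module):
    """Sketch of the FC network diagrammed above."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),              # 28x28 image -> 784-dim vector
            nn.Linear(784, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 128),
            nn.Linear(128, 10),        # 10 class logits
        )

    def forward(self, x):
        return self.net(x)

fc_logits = FCModel()(torch.randn(1, 1, 28, 28))
print(fc_logits.shape)  # torch.Size([1, 10])
```

Note that the very first operation destroys the 2D layout; this is the "spatial information loss" discussed above.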

🎨 Visualizations

CNN Visualizations

  • Feature Maps: Show detected patterns at each convolution layer
  • Activation Maps: Display ReLU activations
  • Pooled Features: Reduced spatial dimensions after pooling
  • Neuron Activations: Bar charts for fully connected layers

FC Visualizations

  • Flattened Input: 1D representation of the image
  • Layer Activations: Bar charts showing neuron activations
  • ReLU Effects: Comparison of pre- and post-activation values
  • Final Predictions: Probability distribution across classes
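
A neuron-activation bar chart like those above can be produced with a few lines of Matplotlib. The function name and arguments here are illustrative, not the app's actual API:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, as a Flask server would use
import matplotlib.pyplot as plt
import numpy as np

def plot_activations(activations, title, out_path):
    """Render one layer's neuron activations as a bar chart and save it to disk."""
    fig, ax = plt.subplots(figsize=(8, 3))
    ax.bar(np.arange(len(activations)), activations)
    ax.set_xlabel("Neuron index")
    ax.set_ylabel("Activation")
    ax.set_title(title)
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
```

In the app, charts like this would be saved under `static/results/` so the templates can display them.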

🔍 Educational Insights

  1. Spatial Information:

    • CNN preserves spatial relationships through feature maps
    • FC loses spatial structure by flattening
  2. Feature Detection:

    • CNN detects local patterns (edges, textures)
    • FC combines all input information globally
  3. Parameter Efficiency:

    • CNN uses weight sharing (fewer parameters)
    • FC requires more parameters (no sharing)
  4. Translation Invariance:

    • CNN is robust to input translations
    • FC is sensitive to position changes
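
The parameter-efficiency point can be made concrete with a quick count from the architectures listed above (biases included). The CNN kernel size is not stated in this README, so 3x3 kernels with padding 1 are assumed:

```python
def linear_params(n_in, n_out):
    return n_in * n_out + n_out  # weights + biases

def conv_params(c_in, c_out, k=3):
    return c_in * c_out * k * k + c_out  # one k x k kernel per (in, out) channel pair

# CNN: Conv1 (8 filters) -> Conv2 (16 filters) -> FC1 (128) -> FC2 (10)
cnn_total = (conv_params(1, 8) + conv_params(8, 16)
             + linear_params(16 * 7 * 7, 128) + linear_params(128, 10))

# FC: 784 -> 512 -> 256 -> 128 -> 10
fc_total = (linear_params(784, 512) + linear_params(512, 256)
            + linear_params(256, 128) + linear_params(128, 10))

print(cnn_total, fc_total)  # 103018 567434
```

Under these assumptions the FC network needs roughly 5.5x more parameters, and almost all of them sit in the first 784-to-512 layer.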

📁 File Structure

ERAv4S2/
├── app.py                 # Main Flask application
├── models/               # Pre-trained model weights
│   ├── cnn_model.pth    # Trained CNN model
│   └── fc_model.pth     # Trained FC model
├── data/                # MNIST dataset (auto-downloaded)
├── templates/
│   ├── index.html       # Home page with navigation
│   ├── compare.html     # CNN vs FC comparison interface
│   └── digit.html       # CNN step-by-step analysis
├── static/
│   ├── uploads/         # User uploaded images
│   └── results/         # Generated visualizations
└── README.md           # This file

🎓 Learning Objectives

After using this tool, you should understand:

  1. How CNNs preserve spatial information through convolution operations
  2. How FC layers flatten and combine features globally
  3. The visual differences between feature maps and neuron activations
  4. Why CNNs are better suited for image processing tasks
  5. The trade-offs between parameter efficiency and spatial awareness

🔧 Technical Details

  • Framework: Flask (Python web framework)
  • Deep Learning: PyTorch
  • Image Processing: PIL (Python Imaging Library)
  • Visualization: Matplotlib
  • Dataset: MNIST (handwritten digits)
  • Training: Adam optimizer, CrossEntropyLoss
  • Frontend: HTML/CSS with responsive design

📝 Usage Tips

  1. Upload clear digit images for best results
  2. Compare different digits to see how patterns change
  3. Focus on the feature maps in CNN layers to understand pattern detection
  4. Observe the flattening process in FC layers
  5. Compare parameter counts between the two architectures
  6. Notice the meaningful predictions from pre-trained models

🚀 Performance Improvements

Pre-trained Models

  • Realistic Results: Models trained on MNIST dataset
  • Meaningful Feature Maps: Show actual edge and texture detection
  • Accurate Predictions: Can recognize handwritten digits
  • Faster Loading: Models cached after first training

Training Details

  • Dataset: MNIST (60,000 training images)
  • Epochs: 3 epochs for quick training
  • Batch Size: 64 images per batch
  • Optimizer: Adam with learning rate 0.001
  • Loss Function: CrossEntropyLoss
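
A training loop matching the details above (Adam at lr 0.001, CrossEntropyLoss) could look like this. This is a generic sketch, not the app's exact code; it takes any `DataLoader`-like iterable of `(images, labels)` batches:

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=3, lr=0.001):
    """Train a classifier with Adam and cross-entropy, per the details above."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)  # logits vs. integer labels
            loss.backward()
            optimizer.step()
    return model
```

For MNIST specifically, the loader would come from torchvision, e.g. `DataLoader(datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor()), batch_size=64, shuffle=True)`.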

🔄 Model Persistence

  • First Run: Models are trained and saved to models/ directory
  • Subsequent Runs: Pre-trained models are loaded instantly
  • Model Files:
    • cnn_model.pth (~50KB)
    • fc_model.pth (~2MB)
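
The train-once-then-cache behavior described above boils down to a small load-or-train helper. This is a sketch of the pattern (the function name and `train_fn` callback are illustrative, not the app's actual API):

```python
import os
import torch

def load_or_train(model, train_fn, path="models/cnn_model.pth"):
    """Load cached weights if they exist; otherwise train once and save them."""
    if os.path.exists(path):
        model.load_state_dict(torch.load(path))   # subsequent runs: instant load
    else:
        train_fn(model)                           # first run: train the model
        os.makedirs(os.path.dirname(path), exist_ok=True)
        torch.save(model.state_dict(), path)      # cache weights for next time
    return model
```

Saving only `state_dict()` (the weights, not the whole module) is what keeps the `.pth` files as small as the sizes listed above.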

Note: This tool uses models pre-trained on the MNIST dataset, so the visualizations demonstrate actual feature detection and digit recognition rather than random, untrained behavior.