This project implements state-of-the-art computer vision models to detect and classify fish species from fishing boat camera footage.
The project features two advanced architectures located in src/models/:
- CNN (ResNet50) (src/models/cnn.py):
  - Uses a pre-trained ResNet50 backbone.
  - Custom heads for classification (8 classes) and bounding box regression.
  - Robust and distinct feature extraction.
- Transformer (ViT) (src/models/transformer.py):
  - Uses a Vision Transformer (vit_base_patch16_224) backbone.
  - Leverages self-attention mechanisms for global context.
  - Fine-tuned for simultaneous classification and localization.
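Both architectures pair a classification head with a bounding box head, so training has to score the two outputs together. The README does not say which loss the project uses; the sketch below shows one common way to combine them, assuming softmax cross-entropy for the 8 classes, smooth L1 for a 4-value box, and a hypothetical `box_weight` factor. All function names here are illustrative and are not taken from src/models/.

```python
import math

NUM_CLASSES = 8  # from the README: the classification head predicts 8 classes


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one sample, computed with the log-sum-exp trick."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


def smooth_l1(predicted_box, target_box, beta=1.0):
    """Mean smooth L1 loss over box coordinates (quadratic below beta, linear above)."""
    total = 0.0
    for p, t in zip(predicted_box, target_box):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total / len(predicted_box)


def multitask_loss(logits, target_index, predicted_box, target_box, box_weight=1.0):
    """Assumed objective: classification loss plus a weighted localization loss."""
    class_loss = cross_entropy(logits, target_index)
    box_loss = smooth_l1(predicted_box, target_box)
    return class_loss + box_weight * box_loss


# Example: an uninformative classifier and a box that is off by 0.5 per coordinate.
example_loss = multitask_loss(
    logits=[0.0] * NUM_CLASSES,
    target_index=3,
    predicted_box=[0.0, 0.0, 0.0, 0.0],
    target_box=[0.5, 0.5, 0.5, 0.5],
)
```

A single weighted sum keeps the two heads training against one scalar; in practice `box_weight` would need tuning so that neither term dominates.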
├── data/ # Dataset (Train images and JSON annotations)
├── app.py # Gradio Web Application
├── src/
│ ├── train.py # Main Training Script
│ ├── models/ # Neural Architectures
│ │ ├── cnn.py # ResNet50 Implementation
│ │ └── transformer.py # Vision Transformer Implementation
│ └── ...
└── requirements.txt # Dependencies
conda create -n fisheries python=3.10
conda activate fisheries
pip install -r requirements.txt
Train your preferred model architecture.
Train CNN (ResNet):
python src/train.py --model_type cnn --epochs 10 --learning_rate 1e-4
Train Transformer (ViT):
python src/train.py --model_type vit --epochs 10 --learning_rate 5e-5
Launch the interactive web interface to test the model.
python app.py --model_type cnn --model_path src/models/best_model_cnn.pth
Note: Make sure to select the architecture that matches your trained model.
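The training commands above pass three flags: `--model_type`, `--epochs`, and `--learning_rate`. The actual parser in src/train.py is not shown, so the following is only a sketch of how such a command line could be parsed with `argparse`. The flag names and the `cnn`/`vit` values come from the commands above; the defaults, help strings, and the `build_parser` name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the README's training commands."""
    parser = argparse.ArgumentParser(
        description="Train a fish detection model (illustrative sketch)."
    )
    # Restricting choices makes a typo such as "--model_type vitt" fail early.
    parser.add_argument(
        "--model_type",
        choices=["cnn", "vit"],
        required=True,
        help="Architecture to train: ResNet50 CNN or Vision Transformer.",
    )
    # Defaults below are assumed; they mirror the CNN example command.
    parser.add_argument("--epochs", type=int, default=10, help="Number of training epochs.")
    parser.add_argument("--learning_rate", type=float, default=1e-4, help="Optimizer learning rate.")
    return parser


# Parse the ViT example from the README instead of reading sys.argv.
args = build_parser().parse_args(
    ["--model_type", "vit", "--epochs", "10", "--learning_rate", "5e-5"]
)
```

Passing an explicit list to `parse_args` keeps the example independent of the real command line; a script would normally call `parse_args()` with no arguments.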
This project uses Gradio for the web interface and can be automatically deployed to Hugging Face Spaces using the included GitHub Action.
- Create a new Space on Hugging Face (e.g., fisheries_monitoring).
- In your GitHub repository, go to Settings -> Secrets and variables -> Actions:
  - Create a new repository secret named HF_TOKEN with your Hugging Face access token (write permissions).
- Edit .github/workflows/sync_to_hub.yml to match your Hugging Face username and Space name if they differ from the default.
- Push to main, and the action will automatically sync your code to the Space.
Alternatively, to deploy manually:
- Create a Space on Hugging Face.
- Upload app.py, src/, and requirements.txt.