Launch Live App
A professional, interactive web dashboard for analyzing and visualizing Mediterranean seagrass distribution patterns using machine learning and spatial analysis.
This intelligence panel provides comprehensive analysis of seagrass presence detection in the Mediterranean Sea based on the methodology from:
Effrosynidis, D., Arampatzis, A., & Sylaios, G. (2018). Seagrass detection in the Mediterranean: A supervised learning approach. Ecological Informatics, 48, 158-175. DOI: 10.1016/j.ecoinf.2018.09.004
- General overview of the project and objectives
- Description of 5 Mediterranean seagrass species with images
- Dataset characteristics and key innovations
- Data distribution visualizations
- Comprehensive description of 217 environmental predictors
- Descriptive statistics (static and temporal variables)
- Correlation analysis with interactive heatmaps
- Geographic distribution maps
- Variable distribution analysis by presence/absence
- Interactive model comparison (7 algorithms)
- Performance metrics (Accuracy, Precision, Recall, F1, ROC-AUC)
- Stratified vs Spatial cross-validation comparison
- Best model: Random Forest (98.8% stratified, 88.0% spatial accuracy)
- Feature importance: Chlorophyll-α dominates (7 of top 10 features)
- Family-level detection (5 seagrass families)
- Class imbalance analysis with Macro F1 metric
- Multi-metric performance comparison
- Spatial autocorrelation impact assessment (19-64% performance drops)
- Feature importance: Water clarity (Secchi depth) dominates (7 of top 10 features)
- Major achievements summary
- Key scientific insights and environmental drivers
- Critical methodological findings on spatial CV
- Recommendations for:
- Model improvements (hyperparameter tuning, neural networks)
- Deployment strategies (API, GIS integration)
- Research extensions (temporal dynamics, climate scenarios, remote sensing)
- Conservation implications and management priorities
- Python 3.8 or higher
- pip package manager
- Clone the repository:
    git clone https://github.com/yourusername/med_seagrass.git
    cd med_seagrass/panel
- Create a virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
Run the Streamlit app:
    streamlit run app.py
The app will open in your default web browser at http://localhost:8501
panel/
├── app.py                            # Main application file
├── requirements.txt                  # Python dependencies
├── README.md                         # This file
└── page_modules/                     # Page modules
    ├── __init__.py
    ├── presentation.py               # Presentation page
    ├── variables.py                  # Variables & statistics page
    ├── binary_classification.py      # Binary classification analysis
    ├── multiclass_classification.py  # Multi-class classification analysis
    └── conclusions.py                # Conclusions & future steps
The panel uses the preprocessed dataset located at:
../data/pres_abs_merge_def.csv
Dataset includes:
- 3,055 observations (presence and pseudo-absence)
- 217 environmental predictors
- 5 seagrass families
- 8 geographic zones (Mediterranean-wide)
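Before launching the panel, it can help to confirm that the dataset is where the app expects it and that its size matches the figures above. The following is a minimal sketch using only the standard library. It assumes the file is comma-delimited, UTF-8 encoded, and has a single header row; none of this is stated in the README. The helper name `dataset_shape` is hypothetical.

```python
import csv
from pathlib import Path

# Path taken from this README; it is relative to the panel/ directory.
DATA_PATH = Path("../data/pres_abs_merge_def.csv")


def dataset_shape(path):
    """Return (n_rows, n_columns) for a CSV file with one header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])        # first row holds the column names
        n_rows = sum(1 for _ in reader)  # remaining rows are observations
    return n_rows, len(header)


if __name__ == "__main__":
    if DATA_PATH.exists():
        rows, cols = dataset_shape(DATA_PATH)
        print(f"{rows} observations, {cols} columns")
    else:
        print(f"Dataset not found at {DATA_PATH}")
```

If the file matches the description above, the row count should be 3,055. The column count may exceed 217 if the file also stores labels, coordinates, or zone identifiers alongside the predictors.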
- Streamlit ≥1.28.0: Interactive web application framework
- Plotly ≥5.17.0: Interactive plotting and visualizations
- Pandas ≥2.0.0: Data manipulation and analysis
- NumPy ≥1.24.0: Numerical computing
- SciPy ≥1.11.0: Statistical analysis
- Scikit-learn ≥1.3.0: Machine learning algorithms
- PyCaret ≥3.0.0: AutoML framework (for EDA notebook)
- Matplotlib & Seaborn: Additional visualization tools
- Geographic maps with presence/absence overlay
- Correlation heatmaps
- Model performance comparisons
- Radar charts for multi-metric analysis
- Feature importance bar charts
- Cross-validation strategy selection (Stratified vs Spatial)
- Performance metric selection
- Model selection for detailed comparison
- Variable selection for distribution analysis
- Download model results as CSV
- Download descriptive statistics
- All visualizations are interactive and can be saved
- Best Model: Random Forest
- Performance: 98.8% accuracy (Stratified CV), 88.0% (Spatial CV)
- ROC-AUC: 99.9% (Stratified), 95.6% (Spatial)
- Performance Drop: accuracy falls by 10.8 percentage points with spatial CV
- Top Predictor: Chlorophyll-α (7 of top 10 features are chlorophyll measures)
- Best Stratified Model: Random Forest (86.1% accuracy, 82.2% Macro F1)
- Best Spatial Model: K Neighbors (68.1% accuracy, 62.1% Macro F1)
- Performance Drop: accuracy falls by 18.0 percentage points and Macro F1 by 20.1 percentage points with spatial CV (best stratified model vs. best spatial model)
- Top Predictor: Water clarity - Secchi depth (7 of top 10 features)
- Challenge: Performance drops 19-64% across models with spatial CV
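The accuracy drops quoted above are differences between two percentages, so they can be read either as absolute drops (percentage points) or as relative drops (percent of the stratified score). The short sketch below works through both readings for the accuracies listed in this README. The function name `accuracy_drop` is illustrative, and the README does not state which convention the 19-64% range uses.

```python
def accuracy_drop(stratified, spatial):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = stratified - spatial
    relative = 100.0 * absolute / stratified
    return absolute, relative


# Accuracies (in %) quoted in the results above.
# The multi-class pair compares two different models: Random Forest
# (stratified CV) against K Neighbors (spatial CV).
results = {
    "Binary, Random Forest": (98.8, 88.0),
    "Multi-class, best model per CV scheme": (86.1, 68.1),
}

for label, (strat, spat) in results.items():
    absolute, relative = accuracy_drop(strat, spat)
    print(f"{label}: {absolute:.1f} points absolute, {relative:.1f}% relative")
```

For the binary task the two readings nearly coincide (10.8 points, about 10.9% relative) because the stratified accuracy is close to 100%. For the multi-class task they diverge (18.0 points, about 20.9% relative).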
Spatial cross-validation is essential for realistic performance estimates in species distribution models. Traditional stratified CV significantly overestimates model performance due to spatial autocorrelation.
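To illustrate the idea, here is a minimal leave-one-zone-out splitter in plain Python. This README does not specify how the spatial folds were built, so grouping by geographic zone is an assumption, and `leave_one_zone_out` is a hypothetical helper. The point is only that every test fold comes from a region the model never saw during training, which random stratified folds do not guarantee.

```python
from collections import defaultdict


def leave_one_zone_out(zones):
    """Yield (held_out_zone, train_indices, test_indices) for each zone.

    `zones` holds one zone label per observation. Each fold tests on a
    single zone and trains on all others, so no zone is on both sides.
    """
    by_zone = defaultdict(list)
    for index, zone in enumerate(zones):
        by_zone[zone].append(index)

    for held_out in sorted(by_zone):
        test_idx = by_zone[held_out]
        train_idx = sorted(
            i for zone, members in by_zone.items() if zone != held_out
            for i in members
        )
        yield held_out, train_idx, test_idx


# Toy example: six observations spread over three hypothetical zones.
toy_zones = ["A", "A", "B", "C", "B", "C"]
for zone, train_idx, test_idx in leave_one_zone_out(toy_zones):
    print(zone, "train:", train_idx, "test:", test_idx)
```

In a scikit-learn workflow the same grouping can be expressed with `LeaveOneGroupOut` or `GroupKFold` by passing the zone labels as the `groups` argument.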
Access the live application: https://medseagrassdet.streamlit.app/
The app is fully interactive and deployed on Streamlit Cloud for easy access worldwide. No installation required!
The application features:
- Professional dark green sidebar with enhanced navigation
- Interactive Plotly visualizations with larger, readable fonts
- Real-time model comparison and filtering
- Geographic distribution maps
- Export capabilities for all results
- Responsive design for desktop and tablet viewing
- Original Research: Effrosynidis, D., Arampatzis, A., & Sylaios, G. (2018)
- Data Sources:
- CMEMS (Copernicus Marine Environment Monitoring Service)
- Original dataset: Mendeley Data
- Developer: Guillem La Casta
- Program: CSIC Momentum Programme - "Develop your Digital Talent"
- Institution: ICMAN-CSIC (Instituto de Ciencias Marinas de Andalucía)
If you use this work, please cite:
Effrosynidis, D., Arampatzis, A., & Sylaios, G. (2018).
Seagrass detection in the Mediterranean: A supervised learning approach.
Ecological Informatics, 48, 158-175.
https://doi.org/10.1016/j.ecoinf.2018.09.004
This project is for educational and research purposes.
Contributions are welcome! Please feel free to submit a Pull Request.
For questions, suggestions, or collaborations:
- Open an issue on GitHub
- Visit the live application
Built with ❤️ for Mediterranean seagrass conservation
Protecting blue carbon ecosystems through data science and machine learning