
🚒 AI-Powered Ship Delivery Delay Prediction

Hugging Face Spaces

Project Overview

DeepPort Forecast is an end-to-end Machine Learning Operations (MLOps) project designed to predict whether a maritime container shipment will arrive On Time or Delayed (Binary Classification).

The core value of this system is to function as a proactive Early Warning System, dramatically increasing the detection rate of potential delays (Recall) to minimize associated logistics costs and improve customer trust.

🌐 Live Deployment: Test the DeepPort Forecast

The optimized model (V2) is live and deployed using Docker on Hugging Face Spaces. You can interact with the simulator instantly without any local setup.

How to Use:

  1. Click the link below to access the live web application.
  2. The form is pre-populated with a low-risk sample. Click Predict Delay Status to see the baseline output.
  3. To test the model's true capabilities, input the High-Risk Values (e.g., Vessel Age $\mathbf{25}$, Weather Risk $\mathbf{9.0}$) and watch the prediction switch to DELAYED.

👉 Access the Live Simulator Here

[Demo GIF]

Key Achievements

This repository contains the full source code for model development (V1 and V2) and the deployed Flask application (/app), demonstrating high proficiency in data science, MLOps, and full-stack integration.

| Metric | V1 (Initial Baseline) | V2 (Optimized Production) | Business Impact |
| --- | --- | --- | --- |
| Recall (Delayed) | $19.4\%$ | $91.1\%$ | $4.7\times$ improvement in detecting delays. |
| F1 Score | $0.285$ | $0.666$ | Highly reliable model for imbalanced data. |
| Final Model | Decision Tree | XGBoost | Best performance/speed tradeoff. |
| Deployment | | Flask API with optimized $\mathbf{0.316}$ threshold. | |
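
For reference, the headline metrics above (Recall on the Delayed class and F1) can be computed with scikit-learn as in the small example below. The labels are purely illustrative stand-ins, not the project's actual predictions.

    from sklearn.metrics import recall_score, f1_score

    # Illustrative labels only; 1 = Delayed, 0 = On Time.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

    print("Recall (Delayed):", recall_score(y_true, y_pred))  # share of real delays caught
    print("F1 score:", f1_score(y_true, y_pred))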

πŸ—οΈ Project Architecture

The repository is structured around the two core functions: Model Training/Versioning and Production Deployment.


.
├── AI_Model/                           # ML Training and Versioning History
│   ├── v1/                             # Initial Baseline Models (Low Recall)
│   │   └── ...
│   └── v2/                             # Optimized Production Models (High Recall)
│       └── ...
├── app/                                # Flask Web Application (Simulator)
│   └── ...
├── LICENSE
├── README.md                           # This file
├── ...
└── requirements.txt                    # Python dependencies

🧠 Model Development & Optimization (AI_Model Directory)

The AI_Model folder documents the evolution from a baseline academic project (V1) to a robust production system (V2).

V1: Baseline Failure Analysis

  • Goal: Establish MLOps framework (ML, DL, Ensemble comparison).
  • Key Insight: The low F1 score was traced to a technical flaw: applying the default $0.5$ prediction threshold to imbalanced data ($\approx 36\%$ delay rate).
  • Models: Decision Tree, Random Forest, Simple NN, ResNet.

V2: Recall Optimization Success

  • Methodology: Implemented techniques focused on minimizing False Negatives (missed delays) to save costs.
  • Techniques Implemented (a minimal code sketch follows this list):
    1. Threshold Calibration: Optimal threshold identified at $\mathbf{0.316}$.
    2. Oversampling: Used SMOTE during training to balance the class distribution.
    3. Model Upgrade: Switched the best-performing model to XGBoost for its superior handling of complex data patterns.
    4. Feature Expansion: Created $>15$ interaction and aggregated risk features.
  • Result: Recall jumped from $19.4\%$ to $91.1\%$.
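
The snippet below is a minimal, self-contained sketch of that recipe (SMOTE, XGBoost, threshold calibration) on a synthetic stand-in dataset. The project's actual training code lives under AI_Model/v2/ and may select the threshold with a different criterion than the F1-based search shown here.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import precision_recall_curve
    from imblearn.over_sampling import SMOTE
    from xgboost import XGBClassifier

    # Synthetic stand-in with roughly the same class imbalance (~36% Delayed).
    X, y = make_classification(n_samples=5000, n_features=20,
                               weights=[0.64, 0.36], random_state=42)
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)

    # 1. Oversample the minority (Delayed) class on the training split only.
    X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

    # 2. Train the gradient-boosted classifier.
    model = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
    model.fit(X_res, y_res)

    # 3. Calibrate the decision threshold on the validation split instead of
    #    keeping the default 0.5 cut-off (V2 landed on 0.316).
    probs = model.predict_proba(X_val)[:, 1]
    precision, recall, thresholds = precision_recall_curve(y_val, probs)
    f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-9)
    best_threshold = thresholds[np.argmax(f1)]
    print(f"Chosen threshold: {best_threshold:.3f}")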

🚀 Deployment & Running the Simulator

The core prediction service is exposed via the Flask web application in the /app directory.

Prerequisites

  1. Python 3.x

  2. Install all required libraries using the requirements.txt file.

    pip install -r requirements.txt

Running the Web Simulator Locally

  1. Navigate to the /app directory:

    cd app
  2. Start the Flask server using Gunicorn (recommended for stable local testing):

    gunicorn app:app
  3. Access the application at the address Gunicorn prints on startup (by default http://127.0.0.1:8000); see the note after this list to bind a different host or port.
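
If you need a specific host or port (for example, to mirror the Docker or Hugging Face Spaces setup), Gunicorn's --bind flag sets it; the address below is only an example:

    gunicorn --bind 127.0.0.1:5000 app:app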

Simulator Inference Workflow

The front end drives a multi-step inference pipeline behind the scenes (sketched in code after this list):

  1. User Input: User submits 25 primary features.
  2. Feature Engineering: utils.py calculates 8 engineered features (e.g., port_total_congestion).
  3. Preprocessing: preprocessor_optimized.joblib applies standardization and One-Hot Encoding.
  4. Prediction: The V2 XGBoost model generates the delay probability.
  5. Classification: The result is classified as "Delayed" only if the predicted probability exceeds the $\mathbf{0.316}$ threshold.
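
A condensed sketch of how these five steps could be wired together in the Flask app is shown below. Apart from preprocessor_optimized.joblib and utils.py, the file, function, and template names are assumptions for illustration; the real implementation lives in /app.

    import joblib
    import pandas as pd
    from flask import Flask, request, render_template
    from utils import add_engineered_features  # hypothetical helper name in utils.py

    app = Flask(__name__)
    preprocessor = joblib.load("preprocessor_optimized.joblib")  # standardization + one-hot encoding
    model = joblib.load("xgboost_v2.joblib")                     # hypothetical filename for the V2 model
    THRESHOLD = 0.316                                            # optimized decision threshold

    @app.route("/predict", methods=["POST"])
    def predict():
        raw = pd.DataFrame([request.form.to_dict()])        # step 1: 25 primary features (type casting omitted)
        engineered = add_engineered_features(raw)            # step 2: 8 engineered features, e.g. port_total_congestion
        features = preprocessor.transform(engineered)        # step 3: scaling + one-hot encoding
        prob = float(model.predict_proba(features)[:, 1][0]) # step 4: delay probability from the V2 XGBoost model
        status = "Delayed" if prob > THRESHOLD else "On Time" # step 5: threshold classification
        return render_template("result.html", status=status, probability=prob)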

💡 Future Work & Roadmap

  1. Real-Time Data Integration: Replace simulated data inputs with live data fetched via IMO number using external APIs (AIS, Port Authorities, Maritime Weather).
  2. Full-Stack Deployment: Integrate this prediction service as the backend for my planned AI-Powered Vessel Tracker and Ship Chandlers E-commerce application.
  3. Advanced Explainability: Implement SHAP or LIME to provide per-prediction justification for why a shipment is likely to be delayed (a toy SHAP example follows this list).
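
As a rough illustration of what the SHAP route could look like, here is a self-contained toy example on synthetic data; it is not part of the current codebase.

    import shap
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    # Toy stand-in for the trained V2 model and a single preprocessed shipment row.
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    model = XGBClassifier(n_estimators=50).fit(X, y)

    explainer = shap.TreeExplainer(model)
    explanation = explainer(X[:1])            # SHAP values for one prediction
    shap.plots.waterfall(explanation[0])      # per-feature contribution toward the prediction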

✍️ Author & Licensing

Ryan Tusi | AI/ML Engineer + Full-Stack Developer | Software Engineering with Machine Learning Specialization | Codecademy & Harvard Certified

LinkedIn | Github | Portfolio

Licensing: This project is licensed under a Creative Commons license. See the included LICENSE file for full details.
