ML Algorithm Comparison and Insight Tool

A Flask web app for automated Exploratory Data Analysis (EDA) and quick ML model comparison, with secure server-rendered previews and exports.

📋 Table of Contents

Features
Quick Start
Installation
Usage
Project Structure
API
Supported Algorithms
Configuration
Troubleshooting
Contributing
License
Changelog

🚀 Features

Automated EDA: dataset size, dtypes, missingness, duplicates, memory usage
Automatic target-column suggestion and problem-type inference (binary or multiclass classification vs. regression)
One-click training and comparison across multiple sklearn models
Feature importance visualization and downloadable model package (model + preprocessors)
CSV export of model comparison results
Secure, server-generated HTML for the data preview, which reduces the XSS attack surface
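
As an illustration of the server-generated preview idea, here is a minimal sketch that builds an HTML table with every header and cell escaped. The function name `render_preview_html` and its parameters are hypothetical and not taken from the project's code; the actual preview logic in `ml_utils/config.py` may work differently.

```python
from html import escape


def render_preview_html(columns, rows, max_rows=5):
    """Build an HTML table string, escaping every header and cell value."""
    # Hypothetical helper: escaping on the server means the browser never
    # receives raw dataset text that it could interpret as markup.
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body_rows = []
    for row in rows[:max_rows]:
        cells = "".join(f"<td>{escape(str(v))}</td>" for v in row)
        body_rows.append(f"<tr>{cells}</tr>")
    body = "".join(body_rows)
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
```

Because the escaping happens before the markup leaves the server, a cell containing `<script>` is shown as text rather than executed.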

🏃 Quick Start

macOS/Linux (zsh) — copy & paste

# 1) Clone the repository (replace with your repo URL)
git clone https://github.com/BoddapuLokesh/ML-Algorithm-Comparison.git
cd ML-Algorithm-Comparison

# 2) Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 3) Install dependencies
python -m pip install --upgrade pip
pip install -r requirements.txt

# 4) Run the application
python app.py
# App will start at http://127.0.0.1:5002/

If you already have the folder locally, start from step 2 inside the project directory.

Windows (PowerShell)

# 1) Clone the repository (replace with your repo URL)
git clone https://github.com/BoddapuLokesh/ML-Algorithm-Comparison.git
cd ML-Algorithm-Comparison

# 2) Create and activate a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1

# 3) Install dependencies
python -m pip install --upgrade pip
pip install -r requirements.txt

# 4) Run the application
python app.py
# App will start at http://127.0.0.1:5002/

📦 Installation

Prerequisites:

Python 3.8+
pip

Steps:

Clone the repository
Create and activate a virtual environment
Install dependencies: pip install -r requirements.txt
Run the app: python app.py

Optional verification:

python -c "import flask, pandas, sklearn; print('Deps OK')"
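
If the one-liner fails, it can help to see which package is missing. The following is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(["flask", "pandas", "sklearn"])
    print("Deps OK" if not missing else "Missing: " + ", ".join(missing))
```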

💡 Usage

Upload a CSV/XLSX/XLS (max 50MB) to see a safe, server-generated preview
Click Analyze to compute EDA (stats, missingness, correlations)
Choose the target and confirm problem type (auto-detected; you can override)
Start training to compare models; the best model is selected automatically
Review metrics and feature importance; export CSV or download the model package
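
The problem-type auto-detection in the steps above can be pictured with a simple heuristic. This is only a sketch under stated assumptions: the function name `infer_problem_type` and the `max_classes=20` threshold are hypothetical, and the project's real detection logic in `ml_utils/utils.py` may use different rules.

```python
def infer_problem_type(values, max_classes=20):
    """Guess 'binary', 'multiclass', or 'regression' from target values."""
    present = [v for v in values if v is not None]
    distinct = set(present)
    if len(distinct) == 2:
        return "binary"
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if all_numeric:
        # Fractional values, or many distinct values, suggest a continuous target.
        has_fraction = any(v % 1 != 0 for v in present)
        if has_fraction or len(distinct) > max_classes:
            return "regression"
    # Non-numeric targets and small sets of integers are treated as class labels.
    return "multiclass"
```

A heuristic like this can misjudge edge cases (for example, integer ratings from 1 to 5), which is why the UI lets you override the detected type.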

🏗️ Project Structure

ML-Algorithm-Comparison/
├── app.py                # Flask routes: upload, EDA, training, exports
├── app_helpers.py        # JSON envelope, guards, upload/EDA/train handlers
├── model_utils.py        # Back-compat facade to ml_utils/*
├── ml_utils/
│   ├── config.py         # MLConfig, typed results, preview HTML
│   ├── eda.py            # Minimal+enhanced EDA
│   ├── models.py         # AutoMLComparer (fit, score, select, importance)
│   ├── preprocessing.py  # ColumnTransformer pipelines + fallback
│   └── utils.py          # JSON safety, validations, detection
├── templates/            # Jinja templates (layout/index)
├── static/               # style.css, app.js
└── requirements.txt      # Python dependencies

📚 API

File upload and EDA

POST / — Upload dataset (AJAX)
POST /process_eda — Run EDA and return stats/auto target/type
GET /eda — EDA JSON (server-side cached)
GET /data_preview — Preview JSON
GET /get_data_preview_html — Secure HTML preview
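
Before a file reaches the EDA step, the upload handler has to check its type and size. A minimal sketch of such a guard follows; the names `validate_upload`, `ALLOWED_EXTENSIONS`, and `MAX_UPLOAD_BYTES` are hypothetical, and interpreting 50MB as 50 * 1024 * 1024 bytes is an assumption.

```python
import os

ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed meaning of the documented 50MB limit


def validate_upload(filename, size_bytes):
    """Return (ok, error_message) for a candidate upload."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, "Unsupported file type: " + (ext or "(none)")
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "File exceeds the 50MB limit"
    return True, None
```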

Training and results

POST /validate_training_config — Validate target/type/split
POST /train — Train, compare, and return metrics/results/importance
GET /metrics — Best model metrics
GET /best_model — Best model name + metrics
GET /model_comparison — All trained models and metrics
GET /feature_importance — Feature importance data
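
To make the relationship between /model_comparison and /best_model concrete, here is a sketch of picking a winner from a list of per-model results. The result shape (`model` and `metrics` keys), the metric key names, and the choice of F1 for classification and R² for regression as ranking criteria are all assumptions, not the documented behavior of `AutoMLComparer`.

```python
def select_best_model(results, problem_type):
    """Pick the best entry from a list of {'model': str, 'metrics': dict} results."""
    # Assumed ranking metric: R² for regression, F1 for classification.
    metric_key = "r2" if problem_type == "regression" else "f1"
    scored = [r for r in results if metric_key in r["metrics"]]
    if not scored:
        raise ValueError("No result has a '" + metric_key + "' metric")
    # Higher is better for both assumed metrics.
    return max(scored, key=lambda r: r["metrics"][metric_key])
```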

Utilities

POST /analyze_target — Inspect a chosen target column
POST /calculate_split — Convert split ratio to percentages
GET /download_model — ZIP: model.joblib + preprocessors.joblib + README
GET /export_results — CSV export of all models
GET /debug_session — Inspect session keys (debug)
POST /reset_session — Clear session/caches (debug)
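
The split conversion performed by /calculate_split is simple arithmetic. The sketch below assumes the input is a test-set ratio between 0 and 1; the function name `split_percentages` and that input convention are hypothetical, since the endpoint's request format is not documented here.

```python
def split_percentages(test_ratio):
    """Convert a test-set ratio in (0, 1) to (train %, test %)."""
    if not 0 < test_ratio < 1:
        raise ValueError("test_ratio must be between 0 and 1 (exclusive)")
    # Round to one decimal place so float noise does not leak into the UI.
    test_pct = round(test_ratio * 100, 1)
    train_pct = round(100 - test_pct, 1)
    return train_pct, test_pct
```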

Notes

JSON shapes vary by endpoint; on errors you’ll receive { "success": false, "error": "..." }.
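
A client can rely on the documented error shape to separate failures from normal payloads. The following sketch shows one way to do that; `error_envelope` and `unwrap` are hypothetical helper names, and nothing is assumed about success payloads beyond their being valid JSON.

```python
import json


def error_envelope(message):
    """Build the documented error shape as a JSON string."""
    return json.dumps({"success": False, "error": message})


def unwrap(response_text):
    """Parse a JSON response body; raise if it matches the documented error shape."""
    payload = json.loads(response_text)
    if isinstance(payload, dict) and payload.get("success") is False:
        raise RuntimeError(payload.get("error", "Unknown error"))
    return payload
```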

🤖 Supported Algorithms

Classification: Logistic Regression, Random Forest, Gradient Boosting, SVC, Decision Tree

Regression: Linear Regression, Random Forest, Gradient Boosting, SVR, Decision Tree

Metrics

Classification: Accuracy, Precision, Recall, F1, Training Time
Regression: R², MSE, Training Time
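
For reference, the listed metrics can be computed by hand as in the sketch below. It is a plain-Python illustration of the standard definitions, limited to the binary case for classification; the project itself presumably relies on scikit-learn, and its averaging strategy for multiclass problems is not documented here. The function names are hypothetical.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MSE and R²; assumes y_true is not constant (otherwise R² is undefined)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {"mse": ss_res / n, "r2": 1 - ss_res / ss_tot}
```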

⚙️ Configuration

Environment (optional)

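# On Flask 2.3+, FLASK_ENV is ignored (it was deprecated in 2.2); FLASK_DEBUG=1 is the supported switch
export FLASK_ENV=development
export FLASK_DEBUG=1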

Runtime settings

Max upload size: 50MB
Session lifetime: 1 hour
Model timeout (default): 300s per model (see MLConfig)
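
The upload and session limits above map naturally onto standard Flask configuration keys. The sketch below shows how such values might be expressed; the `RUNTIME_CONFIG` dict and `MODEL_TIMEOUT_SECONDS` constant are hypothetical names, and how app.py and MLConfig actually store these settings is not shown in this README.

```python
from datetime import timedelta

# Hypothetical settings dict; the keys are standard Flask config names.
RUNTIME_CONFIG = {
    # Flask rejects request bodies larger than this many bytes.
    "MAX_CONTENT_LENGTH": 50 * 1024 * 1024,
    # Lifetime of sessions that are marked as permanent.
    "PERMANENT_SESSION_LIFETIME": timedelta(hours=1),
}

# Per-model training budget in seconds; the real default lives in MLConfig.
MODEL_TIMEOUT_SECONDS = 300

# In a Flask app this could be applied with: app.config.update(RUNTIME_CONFIG)
```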

Customization

Adjust preprocessing or defaults in ml_utils/*.py

🔧 Troubleshooting

.xls files require xlrd==1.2.0 (installed via requirements.txt)
Very large/wide datasets: correlations are capped to reduce memory use
If a model exceeds its time budget, it is skipped; consider sampling the data or choosing simpler models
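
One way to picture the skip-on-timeout behavior is a wrapper that waits for a fit call only up to a budget. This is a sketch, not the project's implementation: `fit_with_budget` and `slow_fit` are hypothetical names, and the thread-based approach shown here cannot actually stop a running fit, it only stops waiting for it.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def fit_with_budget(fit_fn, budget_seconds):
    """Run fit_fn() and return (result, None), or (None, 'timeout') if it overruns."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fit_fn)
    try:
        return future.result(timeout=budget_seconds), None
    except FutureTimeout:
        # The worker thread is not killed; it keeps running in the background.
        return None, "timeout"
    finally:
        executor.shutdown(wait=False)


def slow_fit():
    """Stand-in for a model fit that takes longer than the budget."""
    time.sleep(0.3)
    return "model"
```

Calling `fit_with_budget(slow_fit, 0.05)` reports a timeout so the caller can skip that model and move on. A process-based approach would be needed to reclaim the CPU from a fit that has overrun.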

🤝 Contributing

Small PRs are welcome. Please open an issue first if the change is substantial.

📄 License

MIT

📝 Changelog

August 2025

Server-side preview HTML and consolidated Python validations
AutoMLComparer pipelines and improved EDA (memory usage, quality score)
Model export (model + preprocessors) and CSV results export
