🐞 Bug Prediction Engine for GitHub Repositories

🔍 Overview

This project predicts which files in a GitHub repository are most likely to contain bugs, using commit history and code metrics. It helps developers prioritize code reviews and testing efforts by identifying high-risk areas in large codebases.

🚀 Features

🔎 Fetches commit data and file history from any public GitHub repo
📊 Extracts features like commit frequency, churn rate, and contributor count
🧠 Trains a machine learning model to classify files as bug-prone or safe
📈 Visualizes risk scores with an interactive dashboard
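
As an illustration of the first feature, here is a minimal sketch of pulling per-file change data through the GitHub REST API. It assumes the documented single-commit endpoint (GET /repos/{owner}/{repo}/commits/{sha}), whose response carries a "files" array with "filename", "additions", and "deletions". The helper names (commit_url, fetch_json, flatten_commit) are hypothetical and are not taken from this repository's code.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def commit_url(owner, repo, sha):
    """Build the REST URL for one commit (the response includes per-file stats)."""
    return f"{API_ROOT}/repos/{owner}/{repo}/commits/{sha}"


def fetch_json(url, token=None):
    """GET a GitHub API URL and decode the JSON body. The token is optional."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def flatten_commit(payload):
    """Turn one commit payload into a list of per-file change records."""
    author = payload.get("commit", {}).get("author") or {}
    return [
        {
            "sha": payload["sha"],
            "message": payload["commit"]["message"],
            "author": author.get("name"),
            "date": author.get("date"),
            "path": changed["filename"],
            "added": changed.get("additions", 0),
            "deleted": changed.get("deletions", 0),
        }
        for changed in payload.get("files", [])
    ]
```

A call such as flatten_commit(fetch_json(commit_url(owner, repo, sha))) would yield one record per changed file. Unauthenticated requests are rate-limited by GitHub, so passing a token is advisable for larger repositories.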

🧰 Tech Stack

Python for data processing and ML
GitHub API for repository mining
scikit-learn for model training
Streamlit for the web dashboard
Radon (optional) for code complexity metrics

📂 How It Works

Data Collection: Pulls commit history and file-level changes from a GitHub repo
Feature Engineering: Calculates per-file metrics such as number of commits, lines added/deleted, number of unique contributors, and time since last modification
Labeling: Uses commit messages to label files (e.g., files changed in commits whose messages contain “fix”, “bug”, or “issue”)
Model Training: Trains a classifier to predict bug-prone files
Visualization: Displays per-file risk scores in a Streamlit dashboard
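
The feature engineering and labeling steps above can be sketched in plain Python. This is an illustrative sketch, not the project's actual implementation: the record layout (sha, message, author, date, path, added, deleted), the function names, and the rule "a file is bug-prone if any commit touching it mentions a keyword" are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Keywords come from the labeling step; plain substring matching is an
# assumption and is noisy (for example, "prefix" also contains "fix").
BUG_KEYWORDS = ("fix", "bug", "issue")


def is_bug_fix(message):
    """Return True if the commit message mentions any bug keyword."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in BUG_KEYWORDS)


def parse_date(value):
    """Parse an ISO 8601 timestamp; a GitHub-style 'Z' suffix is accepted."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_features(records, now):
    """Aggregate per-file change records into one feature row per path.

    `now` must be a timezone-aware datetime.
    """
    commits = defaultdict(set)
    authors = defaultdict(set)
    added = defaultdict(int)
    deleted = defaultdict(int)
    last_change = {}
    bug_prone = defaultdict(int)

    for record in records:
        path = record["path"]
        commits[path].add(record["sha"])
        authors[path].add(record["author"])
        added[path] += record["added"]
        deleted[path] += record["deleted"]
        when = parse_date(record["date"])
        if path not in last_change or when > last_change[path]:
            last_change[path] = when
        if is_bug_fix(record["message"]):
            bug_prone[path] = 1

    return {
        path: {
            "commits": len(commits[path]),
            "lines_added": added[path],
            "lines_deleted": deleted[path],
            "contributors": len(authors[path]),
            "days_since_last_change": (now - last_change[path]).days,
            "bug_prone": bug_prone[path],
        }
        for path in commits
    }
```

Calling build_features(records, datetime.now(timezone.utc)) returns a dictionary keyed by file path. Keyword labels are a rough proxy, so a file flagged this way has only been touched by a commit that mentions a fix; it is not confirmed to contain a bug.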

📌 Use Cases

Prioritize code reviews for risky files
Identify hotspots in legacy codebases
Improve software quality with data-driven insights
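
Prioritizing reviews comes down to ranking files by predicted risk. Here is a minimal sketch using scikit-learn, which the tech stack names for model training. The choice of RandomForestClassifier, the feature layout, and the function names are assumptions made for illustration; the README does not say which classifier the project uses.

```python
from sklearn.ensemble import RandomForestClassifier


def train_risk_model(feature_rows, labels, seed=0):
    """Fit a classifier on per-file feature rows (1 = bug-prone, 0 = safe)."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(feature_rows, labels)
    return model


def rank_by_risk(model, paths, feature_rows):
    """Return (path, risk score) pairs, highest risk first."""
    # Locate the probability column that corresponds to the bug-prone class.
    bug_column = list(model.classes_).index(1)
    scores = model.predict_proba(feature_rows)[:, bug_column]
    return sorted(zip(paths, scores), key=lambda pair: pair[1], reverse=True)
```

In practice the model would be evaluated on files or time periods held out from training, since scoring the same rows it was fitted on overstates accuracy.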

🛠️ Setup Instructions

git clone https://github.com/jiya-0805/bug_detection.git

cd bug_detection

python -m venv venv

source venv/Scripts/activate    # Git Bash on Windows; on macOS/Linux use: source venv/bin/activate

pip install -r requirements.txt

streamlit run app.py

👩‍💻 Author

Jiya, Final Year B.Tech Student @ TIET. Passionate about ML, software engineering, and building tools that solve real problems.

