AI_EDA_Assistant 🤖💻 is a web application that takes in your data 🗃️ and automates most basic exploratory checks ✅, such as duplicate values, null values, and correlations between columns, in very little time 🕛. Clone it to explore the Python concepts behind it and build on them 💻. 📈

sjapanjots/AI_EDA_Assistant

🤖 AI EDA Assistant

A smart Exploratory Data Analysis (EDA) assistant that automates data profiling, visualization, and summarization using Python and AI-powered tools.


🚀 Project Overview

This project is designed to assist data scientists and analysts by automating the most common EDA tasks. With just a few lines of code, the assistant can generate data summaries, detect null values, display feature distributions and correlations, and offer insights in natural language.


💻 Tech Stack

  • Python
  • Pandas / NumPy – For data manipulation and statistical calculations
  • Matplotlib / Seaborn – For plotting and visualizations
  • Streamlit – To build an interactive web interface
  • YData Profiling – For automatic EDA reports
  • scikit-learn – For preprocessing and basic ML utilities

📄 What's Inside

  • app.py
    The Streamlit application that powers the assistant.

  • eda_tools.py
    Custom Python script for handling EDA logic and feature generation.

  • requirements.txt
    Python dependencies required to run the application:

    streamlit
    pandas
    numpy
    matplotlib
    seaborn
    ydata-profiling
    scikit-learn
    
  • sample_datasets/
    Includes example datasets you can use to test the tool.


🧠 Features

  • Upload CSV files and get:
    • Dataset overview (shape, dtypes, missing values)
    • Descriptive statistics
    • Correlation matrix and heatmap
    • Class balance checks (for classification tasks)
    • Automated report using YData Profiling
  • Simple, interactive controls using Streamlit sidebar
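
The overview, statistics, and correlation items above can be sketched with plain pandas. This is a minimal illustration, not the code in `eda_tools.py`: the function name `dataset_overview`, the inline `CSV_TEXT` sample, and its column names are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for an uploaded CSV file (made-up values).
CSV_TEXT = """age,income,label
25,50000,yes
32,64000,no
47,,yes
25,50000,yes
51,98000,no
"""


def dataset_overview(df: pd.DataFrame) -> dict:
    """Collect the basic facts an EDA overview usually reports."""
    # Correlation is only defined for numeric columns, so select them first.
    numeric = df.select_dtypes(include="number")
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        "describe": df.describe(),
        "correlation": numeric.corr(),
    }


df = pd.read_csv(io.StringIO(CSV_TEXT))
overview = dataset_overview(df)

print(overview["shape"])
print(overview["missing"])
print(overview["duplicate_rows"])
print(overview["correlation"].round(2))
```

In a Streamlit app, the same function could presumably be applied to the DataFrame read from the uploaded file, with the `correlation` table passed to a Seaborn heatmap.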

📦 How to Run Locally

  1. Clone the repository:

    git clone https://github.com/sjapanjots/AI_EDA_Assistant.git
    cd AI_EDA_Assistant
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the Streamlit app:

    streamlit run app.py
  4. Go to http://localhost:8501 in your browser.


🖼️ Sample Use Case

  • Upload: iris.csv
  • Output:
    • Data summary
    • Target distribution
    • Feature correlation heatmap
    • Automated profiling report (HTML)
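
As a rough sketch of what the target distribution step might compute, the snippet below measures class shares with pandas. The helper name `class_balance`, the column name `species`, the hand-made rows, and the 0.2 cutoff are assumptions for illustration; they are not taken from the repository or from the real iris dataset.

```python
import pandas as pd

# Tiny hand-made sample in the shape of an iris-style table; the column
# names and values are assumptions for illustration only.
df = pd.DataFrame(
    {
        "petal_length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
        "species": [
            "setosa", "setosa",
            "versicolor", "versicolor",
            "virginica", "virginica",
        ],
    }
)


def class_balance(frame: pd.DataFrame, target: str) -> pd.Series:
    """Return the share of rows belonging to each class of `target`."""
    return frame[target].value_counts(normalize=True).sort_index()


balance = class_balance(df, "species")
print(balance)

# Flag a class as under-represented when its share falls below a chosen
# threshold (0.2 here is an arbitrary illustrative cutoff).
minority = balance[balance < 0.2].index.tolist()
print(minority)
```

With three equally sized classes, every share is one third and no class is flagged.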

🙋‍♂️ Author

Japanjot Singh
Data Scientist & ML Enthusiast
📬 sjapanjots@gmail.com
