GitHub - Ansikka/EDA-chronic-data-analysis: This project performs EDA on a chronic kidney disease dataset to understand data distribution, identify missing values, detect patterns, and analyze key clinical features related to kidney health. Various statistical methods and visualizations are used to gain insights that can support early diagnosis and further predictive modeling.

Exploratory Data Analysis (EDA) – Chronic Kidney Disease Dataset

📌 Project Overview

This project focuses on performing Exploratory Data Analysis (EDA) on a Chronic Kidney Disease (CKD) dataset to understand the underlying structure of the data, identify important patterns, handle missing values, and analyze key medical attributes related to kidney health. The insights derived from this analysis can support early diagnosis and serve as a foundation for machine learning models.

🎯 Objectives

Understand the distribution of clinical features

Identify missing and inconsistent values

Analyze relationships between medical parameters

Detect trends and patterns associated with chronic kidney disease

Prepare data for further predictive modeling

🧬 Dataset Description

The dataset contains patient medical records with attributes such as:

Age

Blood Pressure

Specific Gravity

Albumin

Sugar

Blood Glucose Random

Blood Urea

Serum Creatinine

Hemoglobin

Packed Cell Volume

White Blood Cell Count

Red Blood Cell Count

Hypertension, Diabetes Mellitus, Anemia, etc.

Target variable: Chronic Kidney Disease (CKD / Not CKD)
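
As a rough illustration of how the attributes above might map onto the CSV, the sketch below renames abbreviated column headers to readable ones. The abbreviations (`bp`, `sg`, `sc`, `classification`, and so on) are assumptions about how `kidney_disease.csv` labels its columns, and `COLUMN_NAMES` and `rename_columns` are hypothetical helpers, not part of the notebook. Check the mapping against `df.columns` before relying on it.

```python
import pandas as pd

# Assumed abbreviation -> readable name (hypothetical mapping; verify it
# against the actual headers in kidney_disease.csv).
COLUMN_NAMES = {
    "bp": "blood_pressure",
    "sg": "specific_gravity",
    "al": "albumin",
    "su": "sugar",
    "bgr": "blood_glucose_random",
    "bu": "blood_urea",
    "sc": "serum_creatinine",
    "hemo": "hemoglobin",
    "pcv": "packed_cell_volume",
    "wc": "white_blood_cell_count",
    "rc": "red_blood_cell_count",
    "htn": "hypertension",
    "dm": "diabetes_mellitus",
    "ane": "anemia",
    "classification": "target",
}


def rename_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with known abbreviations replaced by readable names."""
    # Columns that are not in the mapping (e.g. "age") are left unchanged.
    return df.rename(columns=COLUMN_NAMES)


# Tiny synthetic frame for illustration only (not real patient data).
sample = pd.DataFrame(
    {
        "age": [48, 62],
        "bp": [80, 70],
        "sc": [1.2, 3.9],
        "classification": ["ckd", "notckd"],
    }
)
renamed = rename_columns(sample)
print(list(renamed.columns))
```

With the real file, the same helper would be applied to the result of `pd.read_csv("kidney_disease.csv")`, assuming the headers match the mapping.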

🛠️ Technologies Used

Python

Pandas – data manipulation

NumPy – numerical operations

Matplotlib & Seaborn – data visualization

Jupyter Notebook

📊 EDA Steps Performed

Data loading and inspection

Handling missing values

Data type corrections

Statistical summary of features

Univariate analysis (histograms, count plots)

Bivariate analysis (correlation heatmaps, comparisons)

Class distribution analysis
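
The cleaning and inspection steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes that some numeric columns arrive as text with stray whitespace or `?` placeholders, and that the target column is named `classification`. The helpers `clean_frame` and `missing_summary` are hypothetical, and the data is synthetic.

```python
import numpy as np
import pandas as pd


def _normalize(value):
    """Strip whitespace from strings; treat '' and '?' as missing."""
    if isinstance(value, str):
        value = value.strip()
        if value in {"", "?"}:
            return np.nan
    return value


def clean_frame(df: pd.DataFrame, numeric_cols: list) -> pd.DataFrame:
    """Normalize text cells, then coerce the listed columns to numbers."""
    out = df.copy()
    for col in out.columns:
        if not pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].map(_normalize)
    for col in numeric_cols:
        # Values that cannot be parsed become NaN instead of raising.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, largest first."""
    counts = df.isna().sum()
    percent = (counts / len(df) * 100).round(1)
    summary = pd.DataFrame({"missing": counts, "percent": percent})
    return summary.sort_values("missing", ascending=False)


# Synthetic rows for illustration only (not real patient data).
raw = pd.DataFrame(
    {
        "age": [48, 62, 51, 60],
        "pcv": ["44", "\t?", "32", "35"],
        "sc": [1.2, 3.9, np.nan, 0.8],
        "classification": ["ckd", "ckd\t", "ckd", "notckd"],
    }
)

clean = clean_frame(raw, numeric_cols=["pcv", "sc"])
missing = missing_summary(clean)
class_counts = clean["classification"].value_counts()

print(clean.dtypes)       # data type corrections
print(missing)            # missing values per column
print(clean.describe())   # statistical summary of numeric features
print(class_counts)       # class distribution
```

For the plotting steps, `clean.hist()` and `clean.corr(numeric_only=True)` are possible starting points for the histograms and the correlation heatmap; both need Matplotlib or Seaborn to render.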

🔍 Key Insights

Several medical attributes contain missing values that require preprocessing

Features such as serum creatinine, hemoglobin, and blood urea correlate strongly with CKD status

CKD patients exhibit noticeable differences in blood-related parameters
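
One way to check relationships like those listed above is to encode the target as 0/1 and correlate each numeric feature with it, then compare per-class means. The sketch below assumes a target column named `classification` with the positive label `ckd`; `correlation_with_target` is a hypothetical helper. The numbers are invented for illustration and do not reproduce the notebook's results.

```python
import pandas as pd


def correlation_with_target(
    df: pd.DataFrame,
    target_col: str = "classification",
    positive_label: str = "ckd",
) -> pd.Series:
    """Pearson correlation of each numeric column with a 0/1 target."""
    numeric = df.select_dtypes(include="number")
    is_positive = (df[target_col] == positive_label).astype(int)
    # Sort by absolute strength so the strongest relationships come first.
    return numeric.corrwith(is_positive).sort_values(key=abs, ascending=False)


# Invented values for illustration only (not real patient data).
demo = pd.DataFrame(
    {
        "sc": [4.0, 3.0, 5.0, 1.0, 0.8, 1.1],          # serum creatinine
        "hemo": [9.0, 10.0, 8.5, 15.0, 14.0, 16.0],    # hemoglobin
        "classification": ["ckd"] * 3 + ["notckd"] * 3,
    }
)

corr = correlation_with_target(demo)
class_means = demo.groupby("classification").mean(numeric_only=True)

print(corr)
print(class_means)
```

Pearson correlation against a binary label only captures linear association, so a low value does not rule out a relationship. Comparing distributions per class (for example with box plots) is a useful complement.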

📁 Project Structure

├── EDA_chronic_data.ipynb
├── README.md
└── kidney_disease.csv

🚀 Future Scope

Feature engineering and selection

Building machine learning models for CKD prediction

Model evaluation and optimization

Deployment as a web-based health screening tool

🤝 Contribution

Contributions are welcome! Feel free to fork the repository, raise issues, or submit pull requests.

This project is for educational and research purposes.
