Ansikka/EDA-chronic-data-analysis

Exploratory Data Analysis (EDA) – Chronic Kidney Disease Dataset

📌 Project Overview

This project focuses on performing Exploratory Data Analysis (EDA) on a Chronic Kidney Disease (CKD) dataset to understand the underlying structure of the data, identify important patterns, handle missing values, and analyze key medical attributes related to kidney health. The insights derived from this analysis can support early diagnosis and serve as a foundation for machine learning models.

🎯 Objectives

Understand the distribution of clinical features

Identify missing and inconsistent values

Analyze relationships between medical parameters

Detect trends and patterns associated with chronic kidney disease

Prepare data for further predictive modeling

🧬 Dataset Description

The dataset contains patient medical records with attributes such as:

Age

Blood Pressure

Specific Gravity

Albumin

Sugar

Blood Glucose Random

Blood Urea

Serum Creatinine

Hemoglobin

Packed Cell Volume

White Blood Cell Count

Red Blood Cell Count

Hypertension, Diabetes Mellitus, Anemia, etc.

Target variable: Chronic Kidney Disease (CKD / Not CKD)
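A first look at records like these could be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names, the sample values, and the use of "?" as a missing-value marker are all assumptions made for the example.

```python
import io

import pandas as pd

# Tiny made-up sample. Column names are hypothetical stand-ins for the
# attributes listed above, and "?" is assumed to mark a missing entry.
RAW_CSV = """age,blood_pressure,serum_creatinine,hemoglobin,hypertension,target
48,80,1.2,15.4,yes,ckd
62,?,3.9,9.6,yes,ckd
35,70,0.9,?,no,notckd
51,80,?,14.1,no,notckd
"""


def load_records(source):
    """Read CSV text or a file path, treating '?' as a missing value."""
    return pd.read_csv(source, na_values=["?"])


df = load_records(io.StringIO(RAW_CSV))

print(df.shape)         # number of rows and columns
print(df.dtypes)        # inferred type of each column
print(df.isna().sum())  # missing values per column
```

Passing a file path instead of `io.StringIO(RAW_CSV)` would read a real CSV file; the dataset's file name is not stated in this README.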

🛠️ Technologies Used

Python

Pandas – data manipulation

NumPy – numerical operations

Matplotlib & Seaborn – data visualization

Jupyter Notebook

📊 EDA Steps Performed

Data loading and inspection

Handling missing values

Data type corrections

Statistical summary of features

Univariate analysis (histograms, count plots)

Bivariate analysis (correlation heatmaps, comparisons)

Class distribution analysis
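The steps above can be sketched on a small made-up table. The column names, the stray "\t?" token, and the median/mode imputation are assumptions chosen for illustration; the notebook may handle these differently. Bin counts stand in for the histogram so the example runs without a plotting backend.

```python
import numpy as np
import pandas as pd

# Made-up records; columns are hypothetical stand-ins for the real dataset.
df = pd.DataFrame({
    "age": [48, 62, 35, 51, 70, 44],
    "serum_creatinine": ["1.2", "3.9", "0.9", "\t?", "5.2", "1.0"],
    "hemoglobin": [15.4, 9.6, np.nan, 14.1, 8.7, 13.9],
    "hypertension": ["yes", "yes", "no", None, "yes", "no"],
    "target": ["ckd", "ckd", "notckd", "notckd", "ckd", "notckd"],
})

# Data type correction: strip whitespace, then coerce non-numeric tokens to NaN.
df["serum_creatinine"] = pd.to_numeric(
    df["serum_creatinine"].str.strip(), errors="coerce"
)

# Missing values per column, counted before any imputation.
missing_before = df.isna().sum()

# Simple imputation (one option among many): median for numeric columns,
# most frequent value for the categorical column.
numeric_cols = df.select_dtypes(include="number").columns
for col in numeric_cols:
    df[col] = df[col].fillna(df[col].median())
df["hypertension"] = df["hypertension"].fillna(df["hypertension"].mode()[0])

# Statistical summary of the numeric features.
summary = df[numeric_cols].describe()

# Univariate view: the bin counts a histogram of hemoglobin would display.
counts, edges = np.histogram(df["hemoglobin"], bins=3)

# Class distribution as proportions.
class_share = df["target"].value_counts(normalize=True)

print(missing_before)
print(summary)
print(counts, edges)
print(class_share)
```

Coercing before counting missing values matters here: a token such as "?" only shows up as missing once the column has been converted to a numeric type.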

🔍 Key Insights

Several medical attributes contain missing values that require preprocessing

Features such as serum creatinine, hemoglobin, and blood urea correlate strongly with CKD status

CKD patients exhibit noticeable differences in blood-related parameters
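One way to check relationships like these is to encode the target as 0/1 and correlate each numeric feature with it, then compare per-class means. The sketch below uses made-up numbers and hypothetical column names, so it shows the computation only and does not reproduce the dataset's results.

```python
import pandas as pd

# Made-up values for illustration; not taken from the CKD dataset.
df = pd.DataFrame({
    "serum_creatinine": [1.2, 3.9, 0.9, 1.0, 5.2, 0.8],
    "hemoglobin": [12.5, 9.6, 15.0, 14.1, 8.7, 13.9],
    "blood_urea": [40, 110, 30, 35, 150, 28],
    "target": ["ckd", "ckd", "notckd", "notckd", "ckd", "notckd"],
})

# Encode the class label as 1 (ckd) / 0 (notckd).
df["is_ckd"] = (df["target"] == "ckd").astype(int)

features = ["serum_creatinine", "hemoglobin", "blood_urea"]

# Pearson correlation of each feature with the binary target.
corr_with_target = df[features].corrwith(df["is_ckd"]).sort_values()

# Per-class means make the direction of each difference easy to read.
group_means = df.groupby("target")[features].mean()

print(corr_with_target)
print(group_means)
```

A positive coefficient means the feature tends to be higher in the CKD group, a negative one that it tends to be lower. With real data, the missing values would need handling first.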

πŸ“ Project Structure β”œβ”€β”€ EDA_chronic_data.ipynb β”œβ”€β”€ README.md

🚀 Future Scope

Feature engineering and selection

Building machine learning models for CKD prediction

Model evaluation and optimization

Deployment as a web-based health screening tool

🀝 Contribution

Contributions are welcome! Feel free to fork the repository, raise issues, or submit pull requests.

This project is for educational and research purposes.
