
This project applies Probability and Statistics (PnS) concepts to analyze heart attack risks. It involves data cleaning, hypothesis testing, and model training using a classified dataset. Various statistical tests and visualizations are used to extract insights, enhancing predictive accuracy.


DarainHyder/Heart_Attack_risk-analysis-and-Trainig-Model


Heart_Attack_risk-analysis-and-Trainig-Model

This project focuses on analyzing heart attack risks using Probability and Statistics (PnS) concepts. It includes data cleaning, statistical hypothesis testing, and predictive model training to classify individuals at risk. The dataset is preprocessed to remove inconsistencies, and various statistical tests are conducted to determine key factors affecting heart attack risks. The project also visualizes data using graphs for better insights.

Key Features

✅ Data Cleaning – Handles missing values, duplicates, and inconsistencies in the dataset.
✅ Hypothesis Testing – Applies statistical tests to validate key assumptions.
✅ Model Training – Uses classification techniques to predict heart attack risks.
✅ Data Visualization – Generates informative graphs using Matplotlib and Seaborn.
✅ Statistical Analysis – Includes point-biserial correlation, proportion Z-tests, and normal distribution analysis.
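The statistical analysis items above can be illustrated with a minimal sketch. It is not the notebook's actual code: the data is synthetic, the variable names (`cholesterol`, `high_risk`) and the smoker counts are hypothetical, and the helper `two_proportion_ztest` is written out by hand for illustration (Statsmodels offers `proportions_ztest` for the same purpose).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT taken from Heart Attack.csv).
n = 400
high_risk = rng.integers(0, 2, size=n)               # binary target: 1 = high risk
cholesterol = rng.normal(200 + 25 * high_risk, 30)   # continuous feature

# Point-biserial correlation between a binary and a continuous variable.
r_pb, p_corr = stats.pointbiserialr(high_risk, cholesterol)


def two_proportion_ztest(count_a, n_a, count_b, n_b):
    """Two-sided z-test for H0: p_a == p_b, using the pooled proportion."""
    p_a, p_b = count_a / n_a, count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z_stat = (p_a - p_b) / se
    p_value = 2 * stats.norm.sf(abs(z_stat))
    return z_stat, p_value


# Hypothetical counts: high-risk patients among smokers vs. non-smokers.
z, p_prop = two_proportion_ztest(60, 150, 70, 250)

# Shapiro-Wilk test as one possible check of normality for a feature.
w_stat, p_norm = stats.shapiro(cholesterol)

print(f"point-biserial r = {r_pb:.3f} (p = {p_corr:.3g})")
print(f"proportion z = {z:.3f} (p = {p_prop:.3g})")
print(f"Shapiro-Wilk W = {w_stat:.3f} (p = {p_norm:.3g})")
```

In each test, a small p-value is evidence against the null hypothesis (no correlation, equal proportions, or normality, respectively); the significance threshold is a choice left to the analyst.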

Dataset

The dataset (Heart Attack.csv) consists of multiple features related to patient health. It includes age, cholesterol levels, blood pressure, and other medical indicators. The target variable is whether an individual is at high risk of a heart attack.

Technologies Used

🔹 Python – Primary language for data analysis
🔹 Pandas & NumPy – Data manipulation and numerical operations
🔹 Matplotlib & Seaborn – Data visualization
🔹 SciPy & Statsmodels – Statistical testing and hypothesis validation
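As a rough illustration of the data cleaning step applied to a dataset of this shape, the sketch below uses Pandas on a tiny in-memory table. The column names (`age`, `cholesterol`, `blood_pressure`, `high_risk`), the `-1` placeholder for an invalid reading, and the median-fill strategy are all assumptions made for the example; the real columns in Heart Attack.csv and the notebook's actual cleaning rules may differ.

```python
import numpy as np
import pandas as pd

# Tiny in-memory stand-in for the CSV; every column name here is hypothetical.
raw = pd.DataFrame({
    "age": [54, 61, 61, np.nan, 47, 39],
    "cholesterol": [230.0, 180.0, 180.0, 210.0, np.nan, 195.0],
    "blood_pressure": [130, 145, 145, 120, 138, -1],  # -1 stands for an invalid reading
    "high_risk": [1, 1, 1, 0, 0, 0],
})


def clean(df):
    """Drop duplicate rows, blank out impossible values, fill gaps with medians."""
    out = df.drop_duplicates().copy()
    # Treat non-positive blood pressure as missing rather than as a real value.
    out["blood_pressure"] = out["blood_pressure"].where(out["blood_pressure"] > 0)
    # Fill missing numeric features with each column's median.
    feature_cols = ["age", "cholesterol", "blood_pressure"]
    out[feature_cols] = out[feature_cols].fillna(out[feature_cols].median())
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

With a real file, `raw` would instead come from something like `pd.read_csv("Heart Attack.csv")`. Median filling is only one option; dropping incomplete rows is an equally valid choice when few values are missing.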

Project Structure

📁 Heart Attack.csv – The dataset used for analysis
📁 main.ipynb – Jupyter Notebook containing all data processing, statistical analysis, and model training steps

How to Run the Project

Clone the repository

    git clone https://github.com/DarainHyder/HeartAttackRiskAnalysis.git
    cd HeartAttackRiskAnalysis

Install dependencies

    pip install pandas numpy matplotlib seaborn scipy statsmodels

Open the Jupyter Notebook

    jupyter notebook main.ipynb

Run the cells step by step to perform data cleaning, hypothesis testing, and model training.

Results & Insights

📊 The project identifies key risk factors contributing to heart attacks using statistical analysis.
📉 Visual graphs help understand patterns and trends in the dataset.
📈 The trained model provides accurate predictions for heart attack risks based on given inputs.

Future Enhancements

✅ Integrating Machine Learning models to improve predictive accuracy.
✅ Expanding dataset for better generalization.
✅ Developing a web-based dashboard for real-time risk assessment.
