This project analyzes a health and sleep quality dataset to perform descriptive statistics and visualize key health metrics. The analysis focuses on understanding distributions, identifying variable types, and calculating measures of center and spread.
The repository contains:
- health-data-description.ipynb: The main Jupyter Notebook containing the code for data loading, analysis, and visualization.
- data/: Directory containing the dataset (sleep_health_and_lifestyle_dataset.csv).
- images/ (or root): Generated plots from the analysis, including:
  - physical_activity_distribution.png
  - daily_steps_distribution.png
  - heart_rate_distribution.png
The project covers the following areas based on the dataset:
- Data Description:
  - Identification of different variable types: Continuous, Integer, Ordinal Categorical, and Nominal Categorical.
- Physical Activity Analysis:
  - Calculation of Mean, Median, and Mode for physical activity statistics.
  - Investigation of distribution skewness.
- Daily Steps Analysis:
  - Computation of Standard Deviation, Maximum, Minimum, and Range.
  - Evaluation of the spread of daily steps data.
- Heart Rate Distribution:
  - Visualization of heart rate distribution.
  - Identification of the distribution shape and potential outliers.
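
The measures listed above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (Physical Activity Level, Daily Steps, Heart Rate) are assumed, the sample values are made up, and the 1.5 × IQR rule is just one common way to flag potential outliers.

```python
import pandas as pd


def center_stats(values: pd.Series) -> dict:
    """Measures of center plus skewness for a numeric column."""
    return {
        "mean": values.mean(),
        "median": values.median(),
        "modes": values.mode().tolist(),  # a column can have more than one mode
        "skew": values.skew(),            # > 0 suggests a longer right tail
    }


def spread_stats(values: pd.Series) -> dict:
    """Measures of spread for a numeric column."""
    low, high = values.min(), values.max()
    return {
        "std": values.std(),  # sample standard deviation (ddof=1)
        "min": low,
        "max": high,
        "range": high - low,
    }


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return values[mask]


# Made-up sample; the real data would come from
# data/sleep_health_and_lifestyle_dataset.csv via pd.read_csv.
# Column names here are assumptions.
sample = pd.DataFrame({
    "Physical Activity Level": [30, 45, 60, 60, 75, 90],
    "Daily Steps": [3000, 5000, 7000, 8000, 8000, 10000],
    "Heart Rate": [65, 68, 70, 70, 72, 86],
})

print(center_stats(sample["Physical Activity Level"]))
print(spread_stats(sample["Daily Steps"]))
print(iqr_outliers(sample["Heart Rate"]).tolist())
```

Comparing the mean with the median gives a quick read on skewness: a mean noticeably above the median points to a right-skewed distribution, and a mean below it to a left-skewed one.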
The analysis uses the following libraries:
- Pandas: For data manipulation and analysis.
- Matplotlib & Seaborn: For creating the distribution plots.
To reproduce the analysis:
- Ensure you have the required libraries installed (pandas, matplotlib, seaborn).
- Open health-data-description.ipynb in Jupyter Notebook or JupyterLab.
- Run the cells to reproduce the analysis and generate the plots.
The presentation files are in the presentation directory. The presentation was created in Google Slides; the public links file contains links to both the Google Slides and Microsoft PowerPoint versions. The same directory also holds PDF and PowerPoint copies of the presentation.