World Happiness – Data Analysis & Visualization Project
This project analyzes global happiness indicators using the World Happiness Report and complementary Gallup survey data to uncover social, economic, and behavioral factors influencing well-being across countries.
The workflow covers the full data lifecycle:
- Data exploration and transformation using Python and Jupyter Notebooks
- Integration of multiple datasets beyond the original report
- Analytical modeling and visualization using Power BI dashboards
Special focus was placed on methodological rigor, preserving original data structures, and contextualizing subjective metrics such as life satisfaction, generosity, and social support.
This project simulates a real-world business intelligence workflow combining ETL, analytics, and visualization.
Objectives:
- Explore and understand global happiness metrics
- Clean and transform multi-source datasets using Python
- Apply EDA techniques to uncover trends and patterns
- Build interactive dashboards for data-driven storytelling
- Communicate insights clearly and visually
Tools:
- Python (Pandas, NumPy)
- Jupyter Notebooks
- Power BI
- CSV & external datasets
- Git & GitHub
Repository structure:
├── docs/ # Technical documentation
├── files/ # Datasets (raw and processed)
├── scr/ # Python helper scripts
├── EDA.ipynb # Exploratory data analysis
├── transformación-*.ipynb # Data cleaning and ETL
├── whr-power-bi-dashboard.pbix
├── whr-power-bi-dashboard.pdf
├── README.md
└── .gitignore
- Initial inspection of World Happiness data
- Distribution analysis and missing value assessment
- Preliminary insights
Notebook: EDA.ipynb
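The missing value assessment listed above can be sketched roughly as follows. This is a minimal illustration, not the code from EDA.ipynb: the toy DataFrame, its values, and the choice of columns are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the World Happiness data. Column names and values are
# illustrative assumptions, not taken from the project's files.
df = pd.DataFrame({
    "Country name": ["A", "A", "B", "B"],
    "year": [2020, 2021, 2020, 2021],
    "Life Ladder": [7.1, 7.3, 4.2, np.nan],
    "Generosity": [0.10, np.nan, np.nan, np.nan],
})

# Percentage of missing values per column.
missing_pct = df.isna().mean().mul(100).round(1)

# Percentage of missing indicator cells per country
# (averaged over the indicator columns).
indicators = ["Life Ladder", "Generosity"]
missing_by_country = (
    df[indicators]
    .isna()
    .groupby(df["Country name"])
    .mean()          # share missing per country and column
    .mean(axis=1)    # average across indicator columns
    .mul(100)
)

print(missing_pct)
print(missing_by_country)
```

Looking at completeness per country as well as per column helps show where gaps cluster before any cleaning decision is made.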
- Cleaning and normalization
- Handling missing values (with and without imputation)
- Merging complementary datasets
Notebooks:
- transformación-gallup-info-2005-2025.ipynb
- transformacion-v2-sin-imputar.ipynb
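As a rough sketch of the merge step without imputation, the example below joins a complementary table onto the main one with a left merge so that no original rows or gaps are lost. The two toy tables, the `sample_size` column, and the join keys are hypothetical and only stand in for the real datasets.

```python
import numpy as np
import pandas as pd

# Hypothetical main table (stand-in for the World Happiness data).
whr = pd.DataFrame({
    "Country name": ["A", "B", "C"],
    "year": [2021, 2021, 2021],
    "Life Ladder": [7.1, np.nan, 5.0],
})

# Hypothetical complementary table (stand-in for survey metadata).
gallup = pd.DataFrame({
    "Country name": ["A", "B"],
    "year": [2021, 2021],
    "sample_size": [1000, 500],
})

# Left merge keeps every row of the main table; values that are missing
# in either source simply stay NaN (no imputation).
merged = whr.merge(
    gallup,
    on=["Country name", "year"],
    how="left",
    validate="one_to_one",  # fail loudly if keys are duplicated
    indicator=True,         # adds a _merge column showing match status
)

print(merged)
```

The `_merge` column makes it easy to list rows that found no match in the complementary source, which is useful when country names differ between datasets.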
- Interactive visualizations in Power BI
- Country comparisons
- Trend analysis over time
- Key happiness indicators
Files:
- whr-power-bi-dashboard.pbix
- whr-power-bi-dashboard.pdf
- Original missing values were intentionally preserved (no imputation) to maintain data integrity and reflect the true structure of the World Happiness Report.
- Raw numerical indicators were transformed into more interpretable formats (percentages and categorical variables) to improve visualization clarity.
- External documentation and Gallup survey sources were incorporated to better understand how happiness metrics were collected and defined.
- Special attention was given to key subjective variables such as:
  - Life Ladder (self-reported life satisfaction)
  - Generosity
  - Social support
- Survey methodology differences across countries were analyzed to contextualize potential bias in results.
- Project coordination followed an agile sprint-based workflow, with Claudia Cervantes serving as Scrum Master.
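The conversion of raw indicators into percentages and yes/no categories mentioned above could look roughly like this. It is a sketch under assumptions: the indicator is taken to be stored as a 0 to 1 share, the 0.5 cutoff and the new column names are hypothetical, and the project's notebooks may use different rules.

```python
import numpy as np
import pandas as pd

# Toy data; values are illustrative only.
df = pd.DataFrame({
    "Country name": ["A", "B", "C"],
    "Social support": [0.95, 0.42, np.nan],
})

# 0-1 share -> percentage. NaN stays NaN, so nothing is imputed.
df["Social support (%)"] = (df["Social support"] * 100).round(1)

# Binary yes/no category with a hypothetical cutoff of 0.5.
# Rows with a missing indicator keep a missing category.
cutoff = 0.5
df["High social support"] = (
    df["Social support"]
    .ge(cutoff)
    .map({True: "yes", False: "no"})
    .where(df["Social support"].notna())
)

print(df)
```

Masking with `where(...notna())` matters here: without it, a missing value would silently be labeled "no", which would contradict the decision to preserve original gaps.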
- Happiness outcomes are influenced not only by economic factors but also, and strongly, by social support, perceived freedom, generosity, and life satisfaction (Life Ladder score).
- Quantitative happiness indicators attempt to measure highly subjective concepts, requiring careful interpretation beyond raw numerical values.
- Converting complex numerical indicators into percentages and binary categories (yes/no) improved clarity and interpretability in visual analysis.
- Data completeness varies significantly by country, reinforcing the importance of working with original (non-imputed) values to reflect real-world reporting limitations.
- Additional sources (Gallup surveys and methodological reports) revealed differences in polling methods and sample sizes across countries, introducing potential measurement bias.
- Some countries surveyed over 4,000 individuals while others included fewer than 500, and collection methods (in-person vs. phone surveys) varied by region.
- These methodological differences highlight challenges in cross-country happiness comparisons and the need for contextual analysis.
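To give a sense of why the sample size gap noted above matters, the sketch below compares the approximate 95% margin of error for a survey proportion at n = 500 and n = 4,000. It assumes simple random sampling and the worst case p = 0.5; real polls use more complex designs, so actual uncertainty is likely larger. It covers sampling precision only, not bias from the collection method, and it is not part of the project's notebooks.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion.

    Assumes simple random sampling; z = 1.96 corresponds to a 95%
    confidence level and p = 0.5 is the most conservative case.
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (500, 4000):
    moe_points = margin_of_error(n) * 100
    print(f"n = {n}: about +/- {moe_points:.1f} percentage points")
```

Under these assumptions the smaller sample gives roughly 4.4 percentage points of uncertainty against roughly 1.5 for the larger one, so small differences between countries with small samples should be read with caution.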
This project was originally developed as a collaborative course project by:
- Claudia Cervantes — https://github.com/cloud9international
- Mayka Durán — https://github.com/Maykaduran
- Patricia Merchán — https://github.com/patrimerchan
- Andrea R. Virués — https://github.com/andreavirejos
- Ona Zaragoza — https://github.com/onaimusest
Original team repository:
https://github.com/Maykaduran/BI-WHR-beyond-the-data.git
This repository is a curated portfolio version maintained by Claudia Cervantes.
- The project follows a real-world analytics pipeline (EDA → ETL → Visualization)
- Multiple datasets were integrated for richer insights
- The Power BI dashboard provides an interactive exploration of results
- This work simulates a business intelligence workflow for global indicators analysis