This project analyzes the relationship between GDP growth and education spending/investment using data from the World Bank. It uses Python with various libraries to demonstrate comprehensive economic analysis techniques.
| Library/Package | Purpose |
|---|---|
| pandas | Data cleaning/preprocessing |
| statsmodels | Panel regression & ARIMA forecasting |
| scikit-learn | Machine learning & modeling |
| matplotlib/seaborn | Advanced data visualization |
| scipy | Statistical analysis & optimization |
- Source: World Bank Open Data
- Time Period: 2010โ2020
- File Format: CSV
-
Clone the repository:
git clone https://github.com/yourusername/Economic_Growth_Analysis.git cd Economic_Growth_Analysis -
Create and activate a virtual environment:
python3 -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
-
Install dependencies:
pip install -r requirements.txt
-
Download datasets: Place
gdp_data.csvandeducation_data.csvin thedata/raw/folder.
Execute the main script to run all analyses in sequence:
source venv/bin/activate # Activate virtual environment
python code/python/main.py-
Data Cleaning
python code/python/data_cleaning.py
Output:
data/processed/merged_data.csv -
Regression Analysis
python code/python/regression_analysis.py
Output: Various regression plots and statistics in
outputs/ -
Time Series Forecasting
python code/python/forecasting.py
Output: GDP forecasts and visualization in
outputs/ -
Advanced Visualization
python code/python/visualization.py
Output: Multiple visualization dashboards in
outputs/ -
Policy Simulation
python code/python/policy_simulation.py
Output: Policy simulation results and visualizations in
outputs/
- Positive Correlation: Countries with higher education spending generally show more stable GDP growth over time
- Regional Differences: Education investment patterns vary significantly by region and correlate with economic development
- Strengthening Relationship: The correlation between education spending and GDP has strengthened during the study period (2010-2020)
| File | Description |
|---|---|
regression_results.txt |
Panel regression statistical results |
gdp_vs_education.png |
Scatter plot of GDP vs Education with trend line |
regional_education_spending.png |
Education spending comparison by region |
gdp_trends.png |
GDP trends for top countries by region |
heatmap_analysis.png |
Heatmap analysis of top 10 countries |
interactive_dashboard.html |
Interactive dashboard for exploring data |
Economic_Growth_Analysis/
โโโ data/
โ โโโ raw/ # Raw datasets
โ โโโ processed/ # Cleaned data
โโโ code/
โ โโโ python/ # Python analysis scripts
โ โโโ data_cleaning.py # Data preprocessing
โ โโโ regression_analysis.py # Statistical analysis
โ โโโ forecasting.py # Time series forecasting
โ โโโ visualization.py # Data visualization
โ โโโ policy_simulation.py # Policy impact modeling
โ โโโ main.py # Main execution script
โโโ outputs/ # Analysis outputs
โโโ requirements.txt # Python dependencies
โโโ README.md # Project documentation
- Merges GDP and education spending data from World Bank
- Handles missing values and formats data for analysis
- Creates a clean, merged dataset for further analysis
- Performs panel regression with fixed effects
- Analyzes the relationship between GDP and education spending
- Generates statistical summaries and visualizations
- Uses ARIMA models to forecast future GDP trends
- Evaluates model performance with metrics (RMSE, MAE)
- Provides visualizations of forecasts with confidence intervals
- Creates comprehensive data dashboards
- Generates heatmaps, scatter plots, and trend analyses
- Performs regional comparisons and correlation analyses
- Models the impact of education spending on GDP growth
- Compares linear, polynomial, and logarithmic models
- Provides scenario analysis for different policy options
- Calculates optimal education spending levels
- Strong Positive Correlation: Countries investing more in education (as % of GDP) typically show higher and more stable economic growth
- Regional Patterns: Education investment varies significantly by region, with Europe and North America showing the highest average spending
- Optimal Investment Range: Analysis suggests 4-6% of GDP as an optimal education spending range for maximizing economic returns
- Time Lag Effect: Education spending impacts are not immediate but show stronger correlation with GDP growth after 2-3 years
- Regional analyses highlight differences in education spending and economic outcomes
| Technology | Role |
|---|---|
| Python | Core programming language for all analysis components |
| pandas | Data manipulation, cleaning, and transformation |
| numpy | Numerical computations and array operations |
| matplotlib/seaborn | Static data visualization and plotting |
| plotly/dash | Interactive visualizations and web dashboard |
| statsmodels | Statistical modeling and time series analysis |
| linearmodels | Panel data regression analysis with fixed effects |
| scikit-learn | Machine learning algorithms and model evaluation |
| rich | Enhanced terminal output and progress tracking |
Economic policymakers and researchers needed to understand the relationship between education spending and GDP growth across different countries and regions. The original analysis was fragmented across multiple programming languages (STATA, EViews, R, MATLAB), making it difficult to maintain, reproduce, and extend the research.
Develop a unified, Python-based analytical pipeline to investigate the impact of education spending on economic growth using World Bank data from 2010-2020, while providing robust statistical analysis, forecasting, and policy simulations.
- Consolidated multiple language components into a single Python codebase
- Implemented panel data regression analysis with fixed effects to account for country and time variations
- Developed time series forecasting models using ARIMA to project future GDP growth
- Created policy simulation tools to evaluate different education spending scenarios
- Built interactive visualizations for exploring relationships in the data
- Designed a comprehensive dashboard for presenting results to stakeholders
- Enhanced user experience with command-line options and progress tracking
- Statistical Findings: Identified a significant positive relationship between education spending and GDP growth (p < 0.05)
- Regional Insights: Discovered varying effects across different regions, with developing economies showing stronger correlations
- Forecasting Accuracy: Achieved RMSE of 0.05 in GDP growth predictions using optimized ARIMA models
- Policy Implications: Simulations suggest a 1% increase in education spending could yield 0.3-0.5% GDP growth in developing economies
- Technical Achievement: Successfully unified a multi-language workflow into a single, maintainable Python codebase
- User Experience: Reduced analysis execution time by 60% and provided interactive exploration capabilities
- Incorporate additional economic indicators (inflation, unemployment, etc.)
- Implement advanced machine learning models for more sophisticated predictions
- Extend the analysis to include more recent data (post-2020)
- Deploy the dashboard as a web application for wider accessibility
- Add automated data refreshing from World Bank APIs
- Policy Planning: Government agencies can use the analysis to optimize education budget allocation
- Academic Research: Economists can explore the education-growth relationship across different contexts
- Investment Strategy: Financial analysts can incorporate education spending trends into country growth forecasts
- Development Programs: International organizations can target education initiatives for maximum economic impact
- Educational Planning: Education ministries can justify budget requests with quantified economic benefits
- World Bank Open Data: https://data.worldbank.org/
- Panel Data Analysis: https://en.wikipedia.org/wiki/Panel_data
- ARIMA Time Series Forecasting: https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average
- Economic Growth Theory: https://en.wikipedia.org/wiki/Economic_growth
- Fixed Effects Models: https://en.wikipedia.org/wiki/Fixed_effects_model