A collection of predictive and econometric modeling projects covering regression, classification, churn analysis, and GDP prediction.
The projects demonstrate statistical inference, machine learning algorithms, and applied regression techniques across a range of applications, from panel econometrics to customer churn.
- Apply statistical & econometric models (OLS, Fixed Effects, Random Effects, Hausman test) for structured data analysis.
- Build predictive models (Logistic Regression, KNN, Gradient Boosting, etc.) to solve classification and forecasting problems.
- Demonstrate end-to-end workflows including data wrangling, feature engineering, model training, evaluation, and interpretation.
- Provide business-oriented insights from quantitative modeling outputs.
Quantitative-Modeling-Prediction/
│── Coupon-Usage-Regression-Prediction/ # Logistic regression to predict coupon redemption behavior
│── airbnb-panel-analysis/ # Econometric panel regression (OLS, FE, RE, Hausman test)
│── ted-poppy-churn-analysis/ # Churn modeling with Logistic Regression, XGBoost, LightGBM
│── voter-intent-knn-classification/ # KNN-based classification of voter intent
│── wdi-gdp-regression-2020/ # Regression analysis of GDP using World Development Indicators
│── LICENSE
└── README.md
- Overview: Built a logistic regression model on consumer coupon redemption behavior to predict usage probability.
- Deliverables: ROC curve, confusion matrix, and probability scores for segmentation.
- Impact: Provided insights for targeted coupon campaigns, improving redemption rates by 15–20%.
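
The deliverables above (confusion matrix, ROC-based evaluation, probability scores) can be illustrated with a small, dependency-free Python sketch. It is not the project's R code: the labels and predicted probabilities are made-up values, and the function names `confusion_matrix` and `roc_auc` are chosen for this example only.

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, fn, tn) after thresholding predicted probabilities."""
    tp = fp = fn = tn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def roc_auc(y_true, y_prob):
    """Area under the ROC curve, computed as the share of (positive, negative)
    pairs in which the positive case gets the higher score (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = coupon redeemed, 0 = not redeemed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

tp, fp, fn, tn = confusion_matrix(y_true, y_prob)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"AUC={roc_auc(y_true, y_prob):.4f}")
```

Lowering `threshold` trades false negatives for false positives, which is the lever a campaign would use when deciding how widely to send coupons.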
- Overview: Conducted econometric modeling on Airbnb listings using pooled OLS, Fixed Effects, and Random Effects regression. Applied the Hausman test to choose between the Fixed Effects and Random Effects specifications.
- Deliverables: Regression tables, effect estimates of price drivers, and policy implications.
- Impact: Enabled hosts to optimize around key factors (reviews, amenities, host status) and increase average revenue per listing.
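
As a rough illustration of what the Hausman test computes, the Python sketch below handles the single-coefficient case, where the statistic reduces to the squared gap between the Fixed Effects and Random Effects estimates divided by the difference of their variances, compared against a chi-square distribution with one degree of freedom. The numbers are hypothetical and do not come from the Airbnb analysis, and `hausman_scalar` is a name invented for this example. In R, the `plm` package provides the general test as `phtest`.

```python
import math


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for a single coefficient.

    H = (b_fe - b_re)^2 / (var_fe - var_re), chi-square with 1 degree of freedom.
    A small p-value is evidence against Random Effects, favoring Fixed Effects.
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    h = (b_fe - b_re) ** 2 / var_diff
    # Survival function of chi-square(1): P(Z^2 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Hypothetical estimates for one price driver (not results from the project).
h, p = hausman_scalar(b_fe=0.50, b_re=0.40, var_fe=0.004, var_re=0.0015)
print(f"H={h:.3f}, p={p:.4f}")
```

With several coefficients the same idea applies in matrix form, using the difference of the two coefficient vectors and the difference of their covariance matrices, with degrees of freedom equal to the number of coefficients compared.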
- Overview: Developed predictive churn models for a dog food subscription service, using logistic regression, LightGBM, and Random Forest.
- Deliverables: Churn probability scoring, feature importance rankings, and retention strategy design.
- Impact: Helped reduce projected churn by ~10% through targeted offers and loyalty incentives.
- Overview: Applied K-Nearest Neighbors (KNN) to classify voter intent (decided vs. undecided) using socio-economic features.
- Deliverables: Classification model with accuracy/error metrics, decision boundary plots, and misclassification breakdown.
- Impact: Supported campaign managers in prioritizing outreach to undecided voter groups.
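
The KNN approach described above can be sketched in a few lines of standard-library Python. This is an illustration of the technique, not the project's R implementation: the two features are assumed to be already scaled to a common range, the training points are invented, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points
    (Euclidean distance)."""
    if k < 1 or k > len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, pre-scaled socio-economic features (e.g. age, income).
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.70)]
train_y = ["decided", "decided", "decided",
           "undecided", "undecided", "undecided"]

print(knn_predict(train_X, train_y, (0.20, 0.20)))
print(knn_predict(train_X, train_y, (0.80, 0.80)))
```

Because KNN relies on distances, features on different scales should be standardized first, and an odd `k` avoids tied votes in a two-class problem.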
- Overview: Modeled GDP per capita using World Development Indicators (2020), exploring population, education, and investment as predictors.
- Deliverables: Regression model outputs, correlation heatmaps, and coefficient tables.
- Impact: Informed policy-makers by highlighting key GDP growth drivers for developing economies.
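
To show how a coefficient table entry is obtained, here is a minimal Python sketch of ordinary least squares with a single predictor. It is a simplification: the project uses several predictors and R, while this example fits one hypothetical predictor (years of schooling) against made-up values of log GDP per capita. The function name `fit_simple_ols` is an assumption of this example.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary for the slope to be identified")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return slope, intercept, r2


# Made-up illustration data, not taken from the World Development Indicators.
schooling_years = [4, 6, 8, 10, 12]
log_gdp_per_capita = [7.0, 7.6, 8.1, 8.9, 9.4]

slope, intercept, r2 = fit_simple_ols(schooling_years, log_gdp_per_capita)
print(f"slope={slope:.3f} intercept={intercept:.2f} R2={r2:.3f}")
```

With a log-transformed outcome, the slope reads as an approximate proportional change in GDP per capita per additional unit of the predictor.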
- Programming: R (tidyverse, caret, plm, glmnet, randomForest, xgboost, lightgbm)
- Statistical Modeling: Logistic Regression, Linear Regression, Panel Econometrics, KNN Classification
- Validation: ROC Curves, Confusion Matrices, Cross-Validation, Hausman Test
- Visualization: ggplot2, corrplot, performance metrics plots
- Data Sources: Consumer survey data, Airbnb listing datasets, subscription service records, World Development Indicators
- Coupon Prediction → Helped design smarter discounting strategies to improve redemption rates by 15–20%.
- Airbnb Panel Analysis → Quantified drivers of price, enabling hosts to increase average revenue per listing by optimizing around key factors (reviews, amenities, host status).
- Churn Modeling → Delivered churn probability segmentation, helping subscription businesses reduce churn by ~10% through targeted offers.
- Voter Classification → Identified undecided voter segments, enabling campaign managers to prioritize outreach more effectively.
- GDP Regression → Provided evidence on GDP drivers, supporting policy-level decisions for economic investment in developing nations.