Recession Prediction with Yield Curve PCA and Macroeconomic Variables

This repository contains a fully reproducible empirical machine learning pipeline for predicting U.S. recessions using the Treasury yield curve, principal component analysis (PCA), macroeconomic indicators, and modern classification models. The project emphasizes time-respecting validation, robust threshold selection, and scenario-based evaluation (Global Financial Crisis vs. COVID-19).

The structure and workflow follow common practice in applied ML and empirical macro-finance research: modular library code, a reproducible Conda environment, and one-command replication.

1️⃣ Research Objective

The goal of this project is to evaluate whether information in the U.S. yield curve, summarized via PCA, together with macroeconomic variables can predict NBER recessions at a fixed forecast horizon.

Key questions:

- How much predictive power is contained in yield curve principal components beyond simple spreads?
- Does combining yield curve information with macroeconomic variables improve performance?
- How stable are results across different validation schemes?
- Why do models that perform well during the GFC struggle during COVID?

2️⃣ Methodological Overview

Target Variable

- Binary indicator of an NBER recession at horizon $t + h$
- Default horizon: $h = 12$ months ahead
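A minimal sketch of how such a target could be built from a monthly 0/1 recession flag. It assumes the label for month $t$ is the recession flag of exactly month $t + h$ (not "any recession within the next $h$ months"). The function name `make_target` and the toy data are hypothetical and not taken from the repository.

```python
from typing import List, Optional


def make_target(recession: List[int], horizon: int = 12) -> List[Optional[int]]:
    """Align each month t with the recession flag observed at t + horizon.

    The last `horizon` entries are None because their outcome lies beyond
    the end of the sample and is therefore unknown.
    """
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    n = len(recession)
    return [recession[t + horizon] if t + horizon < n else None for t in range(n)]


# Toy example (made-up flags): 6 months of data and a 2-month horizon.
flags = [0, 0, 0, 1, 1, 0]
target = make_target(flags, horizon=2)
print(target)  # [0, 1, 1, 0, None, None]
```

Rows whose target is None would be dropped before training, since no label exists for them yet.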

Feature Blocks

- Yield curve levels (3m to 30y)
- Yield spreads (10y–3m, 10y–2y)
- Yield curve PCA (level, slope, curvature)
- Macroeconomic variables
  - Unemployment rate
  - Inflation (YoY)
  - Industrial production (YoY)
  - Consumer sentiment
  - Payroll employment (YoY)
- Policy / credit indicators
- Regime dummies (GFC, COVID, ZLB/QE)
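The yield curve PCA and spread blocks can be illustrated with a short NumPy sketch. It runs on synthetic data. The maturity grid, the simulated factor paths, and the function name `yield_curve_pca` are all assumptions made for illustration, and the code in `src/features/` may be organized differently.

```python
import numpy as np


def yield_curve_pca(yields: np.ndarray, n_components: int = 3):
    """PCA of a (T x maturities) yield matrix via SVD of the demeaned data.

    Returns (scores, loadings, explained_variance_ratio).
    """
    centered = yields - yields.mean(axis=0)
    # Rows of vt are the principal directions (loadings across maturities),
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components]
    scores = centered @ loadings.T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic curve: hypothetical maturities (in years) and made-up factor paths.
rng = np.random.default_rng(0)
maturities = np.array([0.25, 1, 2, 5, 10, 30])
T = 240
level = 3.0 + np.cumsum(rng.normal(0, 0.10, T))
slope = 1.0 + np.cumsum(rng.normal(0, 0.05, T))
yields = (
    level[:, None]
    + slope[:, None] * (np.log(maturities) / np.log(30.0))[None, :]
    + rng.normal(0, 0.02, (T, maturities.size))
)

scores, loadings, explained = yield_curve_pca(yields)
spread_10y_3m = yields[:, 4] - yields[:, 0]  # 10y column minus 3m column
print(explained.round(3))
```

The sign of each component is arbitrary, so reading the first three as level, slope, and curvature requires inspecting the loadings. In a time-respecting setup, the column means and loadings would presumably be estimated on the training window only and then applied to later observations.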

Models

- Ridge and Elastic Net
- Logistic regression (elastic net)
- Random forest
- Gradient boosting
- XGBoost (optional)
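For reference, one common way to write the elastic-net penalized logistic objective is shown below. The symbols ($\beta_0$, $\beta$, $\lambda$, $\alpha$, $x_t$, $y_{t+h}$) are notation chosen here, and the exact parameterization used in the repository may differ.

$$
\min_{\beta_0, \beta} \; -\frac{1}{T} \sum_{t=1}^{T} \left[ y_{t+h} \log p_t + (1 - y_{t+h}) \log (1 - p_t) \right] + \lambda \left[ \alpha \lVert \beta \rVert_1 + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right], \qquad p_t = \frac{1}{1 + e^{-(\beta_0 + x_t^\top \beta)}}
$$

Here $x_t$ is the feature vector at month $t$ and $y_{t+h}$ the recession indicator $h$ months later. Setting $\alpha = 0$ gives a pure ridge penalty and $\alpha = 1$ a pure lasso penalty.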

Validation Strategy

- Expanding-window cross-validation (for threshold tuning)
- Multiple holdout splits (sensitivity analysis)
- Scenario-based testing:
  - GFC: 2007–2009
  - COVID: 2019–2021
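The expanding-window scheme and threshold tuning can be sketched in plain Python as follows. Two choices here are assumptions rather than details confirmed by the repository: thresholds are scored by F1, and a `gap` of $h$ months is left between training and test windows so that training labels do not rely on outcomes that would not yet have been observed. All function names are hypothetical.

```python
from typing import Iterator, Sequence, Tuple


def expanding_window_splits(
    n_obs: int, min_train: int, test_size: int, gap: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; train starts at 0 and grows each fold."""
    train_end = min_train
    while train_end + gap + test_size <= n_obs:
        test_start = train_end + gap
        yield range(0, train_end), range(test_start, test_start + test_size)
        train_end += test_size


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class; returns 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int], probs: Sequence[float], grid: Sequence[float]
) -> float:
    """Return the grid threshold with the highest F1 (first one wins on ties)."""
    best_t, best_f1 = grid[0], -1.0
    for thr in grid:
        preds = [1 if p >= thr else 0 for p in probs]
        score = f1_score(y_true, preds)
        if score > best_f1:
            best_t, best_f1 = thr, score
    return best_t


# Toy usage: 60 monthly observations, 12-month test folds, 12-month gap.
splits = list(expanding_window_splits(n_obs=60, min_train=24, test_size=12, gap=12))
for train_idx, test_idx in splits:
    print(len(train_idx), test_idx.start, test_idx.stop)

# Made-up validation labels and predicted probabilities.
thr = best_threshold([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], grid=[0.2, 0.3, 0.5])
print(thr)  # 0.2
```

In a full run, the threshold would be tuned on the validation folds only and then held fixed when scoring the holdout and scenario windows.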

3️⃣ Repository Structure

.
├── data/                   # Raw and processed datasets
├── experiments/            # Experiment entry points
├── models/                 # Trained models
├── reports/                # LaTeX paper, tables, and figures
│   ├── figures/
│   ├── tables/
│   └── main.tex
├── src/                    # Core library code
│   ├── data/               # Data ingestion and construction
│   ├── features/           # PCA and feature engineering
│   ├── models/             # Model definitions
│   ├── evaluations/        # Metrics, thresholds, validation
│   ├── visualizations/     # Publication-quality figures
│   └── utils/              # Helpers (LaTeX export, misc)
├── environment.yml         # Conda environment (reproducible)
├── Makefile                # One-command replication
└── README.md

4️⃣ Reproducibility and Environment Setup

All experiments are fully reproducible using Conda.

Step 1: Create the environment

conda env create -f environment.yml
conda activate ec48e-recession

Step 2: Set FRED API key

export FREDAPI="YOUR_FRED_API_KEY"
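On the Python side, a script might read this variable as sketched below. The variable name `FREDAPI` comes from the command above. The helper `load_fred_api_key` is hypothetical, and how the pipeline itself reads the key is an assumption.

```python
import os


def load_fred_api_key(env_var: str = "FREDAPI") -> str:
    """Return the FRED API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    # Reject a missing value and the unedited placeholder from the instructions.
    if not key or key == "YOUR_FRED_API_KEY":
        raise RuntimeError(
            f"Set {env_var} to a valid FRED API key before running the pipeline."
        )
    return key
```

Failing early with an explicit message avoids a less obvious authentication error in the middle of the data download.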

5️⃣ Running the Full Pipeline

Run all experiments

make run

This will:

- Download and process data from FRED
- Construct features and PCA representations
- Train all models
- Run holdout sensitivity and scenario tests
- Save results to outputs/

6️⃣ Building the LaTeX Report

The final paper is built automatically from the pipeline outputs.

make report

This compiles reports/main.tex, which pulls in:

- Tables generated from model outputs
- Figures produced by the pipeline
- Modular section files

7️⃣ License

See LICENSE for details.

alexschied/yieldcurve_48e
