LDA

An implementation of latent Dirichlet allocation (LDA) with variational inference, hyperparameter optimization, and selection of the number of topics K through cross-validation. The algorithm was used to predict US recessions from monetary policy texts. My research shows that it is predictive up to one quarter ahead, and that accuracy can reach 90% or more when topic features are augmented with macroeconomic indices.
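As a rough illustration of choosing K by cross-validation, the sketch below splits the documents into folds and keeps the K with the best average held-out score. It is not taken from this repository: `k_fold_indices`, `select_num_topics`, and the `fit_and_score` callable are hypothetical names, and the scoring criterion (for example a held-out likelihood bound or a validation AUC, higher is better) is an assumption.

```python
import random
from statistics import mean


def k_fold_indices(n_docs, n_folds, seed=0):
    """Shuffle document indices and deal them into n_folds disjoint folds."""
    idx = list(range(n_docs))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def select_num_topics(docs, candidate_ks, fit_and_score, n_folds=5, seed=0):
    """Return the K with the highest mean held-out score, plus all mean scores.

    fit_and_score(train_docs, valid_docs, k) is a hypothetical callable that
    fits a topic model with k topics on train_docs and returns a score on
    valid_docs where higher is better.
    """
    folds = k_fold_indices(len(docs), n_folds, seed)
    cv_scores = {}
    for k in candidate_ks:
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [d for i, d in enumerate(docs) if i not in held]
            valid = [docs[i] for i in held_out]
            fold_scores.append(fit_and_score(train, valid, k))
        cv_scores[k] = mean(fold_scores)
    best_k = max(cv_scores, key=cv_scores.get)
    return best_k, cv_scores
```

In practice `fit_and_score` would wrap the model-fitting code; any candidate grid such as `[5, 10, 20]` is only an example and not a setting reported here.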

Thesis can be found here.

- preprocess.py: a Parser class to pre-process text
  - minutes_ngram_map.py: a dictionary of mappings from n-grams to unigrams for common economic phrases
- topicmodel.py: an LDA class for mean-field variational inference
- classifier.py: classes for a discriminative classifier (via sklearn logistic regression) and a generative classifier (based on the LDA topic model from topicmodel.py)
- evaluation.py: utility functions, plus functions to compute and evaluate models based on the area under the curve (AUC) and the associated asymptotically normal hypothesis tests
- plotter.py: functions to visualize the classification performance of a model (ROC curve, confusion matrix, etc.)
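To make the AUC-based testing concrete, here is a minimal, self-contained sketch of one common approach: compute the AUC as a Mann-Whitney statistic, estimate its standard error with the Hanley-McNeil formula, and form an asymptotically normal z-test against a null AUC of 0.5. The function names are illustrative and are not the ones in evaluation.py, and the choice of the Hanley-McNeil variance is an assumption; the repository may use a different estimator.

```python
import math


def auc_mann_whitney(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg)), len(pos), len(neg)


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) approximation to the standard error of the AUC."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def auc_z_test(labels, scores, null_auc=0.5):
    """One-sided z-test of H0: AUC = null_auc against H1: AUC > null_auc."""
    auc, n_pos, n_neg = auc_mann_whitney(labels, scores)
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    if se == 0.0:
        raise ValueError("standard error is zero (perfectly separated scores)")
    z = (auc - null_auc) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return auc, se, z, p_value
```

For example, `auc_z_test([0, 0, 1, 0, 1, 1, 0, 1], [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.65, 0.9])` gives an AUC of 0.8125 on this toy input. With so few observations the normal approximation is rough; the data is only meant to show the call.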

The data used to produce visualizations and classification output can be found in train_df.csv and test_df.csv.

The stored latent variables can be found in the "Stored Latents" folder; these were used to obtain the exact results reported in the Jupyter notebooks.

