This project explores the use of Natural Language Processing (NLP) and time series forecasting to predict short-term volatility in the S&P 500 index using tweets from former U.S. President Donald J. Trump. The hypothesis is that high-impact political communication, especially around fiscal and monetary policy, can significantly move financial markets.
The pipeline combines rule-based NLP feature engineering, financial market data, and an LSTM neural network to generate trading signals, which are then checked against intraday price ranges to confirm they were achievable.
🔗 Colab Notebook: Open in Google Colab
- Problem Statement: Political tweets, particularly from Trump during his presidency, often had measurable effects on equity indices. This project builds an NLP framework to quantify and forecast those effects.
- Objective: Predict 5-day forward returns of the S&P 500 using a combination of:
  - Tweet-derived features (policy keywords, named entities, POS-based verb counts)
  - Market context (daily Open prices)
- Impact:
  - Provides interpretable, tweet-driven trading signals
  - Enhances short-term volatility forecasting
  - Supports event-driven trading, risk analysis, and policy monitoring
- Keyword Flags: Detects policy-related terms (e.g., stimulus, tariffs, interest rates, COVID-19).
- POS & NER Features (via spaCy):
  - Policy verbs (e.g., “cut”, “ban”, “sign”)
  - Named entities (dates, money amounts, geopolitical entities)
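
The keyword flags described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual term list: the `POLICY_KEYWORDS` groups, the patterns, and the `keyword_flags` helper are assumptions built from the example terms named above, and the spaCy-based POS and NER features are not covered here.

```python
import re

# Hypothetical keyword groups built from the example terms in this README;
# the project's real term list may differ.
POLICY_KEYWORDS = {
    "stimulus": [r"stimulus"],
    "tariffs": [r"tariffs?"],
    "interest_rates": [r"interest rates?"],
    "covid": [r"covid(?:-19)?"],
}

# One case-insensitive pattern per flag, matching whole words only.
_PATTERNS = {
    flag: re.compile(r"\b(?:" + "|".join(terms) + r")\b", re.IGNORECASE)
    for flag, terms in POLICY_KEYWORDS.items()
}


def keyword_flags(tweet):
    """Return a 0/1 flag for each policy topic mentioned in the tweet."""
    return {flag: int(bool(p.search(tweet))) for flag, p in _PATTERNS.items()}


flags = keyword_flags("We will raise TARIFFS unless the Fed lowers interest rates!")
print(flags)
```

Binary flags keep each feature traceable to a specific term in the tweet, which fits the interpretability goal stated above.
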
- Source: Yahoo Finance (S&P 500 OHLC daily prices)
- Label: 5-day forward return from each day’s Open price
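
As a rough sketch of the label, the 5-day forward return can be computed from the Open series as shown here. It assumes simple (not log) returns measured from one day's Open to the Open five trading days later; the README does not state which convention the notebook uses, and `forward_returns` is a hypothetical helper name.

```python
def forward_returns(opens, horizon=5):
    """Simple return from each day's Open to the Open `horizon` trading days later.

    Assumption: simple returns, Open-to-Open. The last `horizon` days have no
    future price yet, so their label is None.
    """
    labels = []
    for t, price in enumerate(opens):
        if t + horizon < len(opens):
            labels.append(opens[t + horizon] / price - 1.0)
        else:
            labels.append(None)
    return labels


# Toy Open prices for seven consecutive trading days.
opens = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.05]
labels = forward_returns(opens)
print(labels)
```

Rows whose label is `None` would need to be dropped before training, since their outcome is not yet known.
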
- LSTM-based sequence model with 5-day sliding windows
- Input: Tweet-derived features + Open prices
- Output: Predicted return (regression)
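
The 5-day sliding windows that feed the LSTM can be illustrated without any deep learning library. This sketch covers only the windowing step, not the model itself. It assumes each window of five consecutive days is paired with the label of its last day; `make_windows` and the toy feature layout are hypothetical.

```python
def make_windows(features, labels, window=5):
    """Pair each run of `window` consecutive daily feature rows with one label.

    Assumption: the window covering days t-window+1 .. t is paired with the
    label of day t. Days whose label is None are skipped.
    """
    X, y = [], []
    for end in range(window - 1, len(features)):
        if labels[end] is None:
            continue
        X.append(features[end - window + 1 : end + 1])
        y.append(labels[end])
    return X, y


# Toy data: each row is [tariff_flag, open_price]; the last two labels are unknown.
features = [
    [0, 100.0], [1, 101.0], [0, 99.5], [0, 100.5],
    [1, 102.0], [0, 101.0], [0, 103.0],
]
labels = [0.01, -0.02, 0.03, 0.00, 0.015, None, None]

X, y = make_windows(features, labels)
print(len(X), len(X[0]), len(X[0][0]))  # samples, timesteps, features
```

The resulting nested list has the (samples, timesteps, features) layout that sequence models generally expect. In practice the Open prices would likely be scaled before training.
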
- Converts predicted returns into Buy/Sell signals
- Evaluates predictions against intraday High/Low prices to assess if trades were realistically achievable
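
One way to picture the signal conversion and the achievability check is the sketch below. It makes several assumptions the README does not confirm: a positive predicted return maps to Buy and anything else to Sell, the target price is the day's Open scaled by the predicted return, and a trade counts as achievable when that target falls inside the Low/High range of the target day. All function names are hypothetical.

```python
def to_signal(predicted_return):
    """Assumed rule: positive predicted return means Buy, otherwise Sell."""
    return "Buy" if predicted_return > 0 else "Sell"


def target_price(open_price, predicted_return):
    """Price implied by the predicted return, measured from the day's Open."""
    return open_price * (1.0 + predicted_return)


def evaluate_trade(open_price, predicted_return, future_low, future_high):
    """Check whether the implied target lies inside the target day's range."""
    target = target_price(open_price, predicted_return)
    return {
        "signal": to_signal(predicted_return),
        "target": target,
        "reachable": future_low <= target <= future_high,
    }


# Toy example: Open of 4000 with a +1% prediction, then a -2% prediction.
hit = evaluate_trade(4000.0, 0.01, future_low=4025.0, future_high=4050.0)
miss = evaluate_trade(4000.0, -0.02, future_low=3950.0, future_high=4010.0)
print(hit)
print(miss)
```

Checking the target against the traded range, rather than only the closing price, separates predictions that could have been filled from those that never traded at the forecast level.
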
- Dataset: Trump tweets (2016–2021) + S&P 500 daily prices (~1,100 days)
- Model Performance:
  - MAE ≈ 1.5%
  - RMSE ≈ 1.9%
  - R² ≈ 0.41
- Trade Simulation:
  - Hit Rate: ~63% (predicted price reached intraday)
  - Direction Match: ~68% (predicted vs actual direction)
- Interpretability: Predictions can be linked back to specific tweet content (e.g., stimulus announcements, tariff threats).
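
For reference, the regression metrics and the direction match reported above follow standard definitions, sketched here in plain Python. The toy arrays are made up for illustration and are not project results. Treating a return of exactly zero as "not up" in `direction_match` is an assumption.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def direction_match(y_true, y_pred):
    """Share of days where predicted and actual returns agree on up vs not up."""
    same = sum((t > 0) == (p > 0) for t, p in zip(y_true, y_pred))
    return same / len(y_true)


# Made-up returns for illustration only.
y_true = [0.02, -0.01, 0.03, -0.02]
y_pred = [0.01, -0.02, 0.02, 0.01]

print(mae(y_true, y_pred), rmse(y_true, y_pred))
print(r2(y_true, y_pred), direction_match(y_true, y_pred))
```
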
- Clone this repo:
    git clone https://github.com/epurevsuren/Forecasting-Equity-Index-Volatility-Using-NLP-on-Tweets.git
    cd Forecasting-Equity-Index-Volatility-Using-NLP-on-Tweets