Sonicof/market_predictions_LSTM

S&P 500 Stock Price Prediction Using LSTM

Overview

This project implements an improved Long Short-Term Memory (LSTM) model to predict the next day's closing price of the S&P 500 index. The notebook LSTM_Stock_Prediction2.ipynb fetches historical S&P 500 data, preprocesses it, trains an LSTM model, evaluates its performance, and makes a prediction for the next trading day. Key improvements include reduced model complexity, adjusted sequence length, custom learning rate, increased regularization, and early stopping to address high validation loss.

Features

  • Data Source: Historical S&P 500 data fetched using the yfinance library.
  • Preprocessing: Data scaling with MinMaxScaler and sequence creation for LSTM input (sequence length: 30 days).
  • Model: Simplified LSTM architecture with two layers (32 units each), increased dropout (0.3), and a custom learning rate (0.0005) for the Adam optimizer.
  • Training: Includes early stopping to prevent overfitting, with a validation split of 0.1.
  • Evaluation: Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics for model performance.
  • Prediction: Predicts the next trading day's closing price and visualizes actual vs. predicted prices.
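The preprocessing step above (scaling to [0, 1], then cutting 30-day input windows) can be sketched as follows. This is a minimal illustration on synthetic prices, not the notebook's code: the helper names `min_max_scale` and `make_sequences` are hypothetical, and the scaling is written by hand so the block needs only NumPy. scikit-learn's `MinMaxScaler` with its default feature range computes the same transform.

```python
import numpy as np

SEQ_LEN = 30  # sequence length stated in the README


def min_max_scale(values):
    """Scale a 1-D array to [0, 1]; also return (min, max) for inverting later."""
    lo, hi = float(values.min()), float(values.max())
    return (values - lo) / (hi - lo), (lo, hi)


def make_sequences(scaled, seq_len=SEQ_LEN):
    """Return inputs of shape (samples, seq_len, 1) and the value following each window."""
    windows = np.lib.stride_tricks.sliding_window_view(scaled, seq_len)
    X = windows[:-1]          # the final window has no next-day target
    y = scaled[seq_len:]      # target = the value right after each window
    return X[..., np.newaxis], y


# Synthetic stand-in for the Close column (assumption: real data comes from yfinance).
closes = np.linspace(100.0, 199.0, 100)
scaled, (lo, hi) = min_max_scale(closes)
X, y = make_sequences(scaled)
print(X.shape, y.shape)
```

With 100 prices and a 30-day window this yields 70 samples, each paired with the scaled close of the following day.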

Prerequisites

  • Python 3.10 or higher
  • Jupyter Notebook or Google Colab environment
  • GPU support (optional, but recommended for faster training)

Required Libraries

Install the necessary Python libraries using pip:

pip install yfinance pandas numpy matplotlib scikit-learn tensorflow

Usage

  1. Clone or Download the Notebook:

    • Download LSTM_Stock_Prediction2.ipynb to your local machine or open it in Google Colab.
  2. Install Dependencies:

    • Ensure all required libraries are installed (see Required Libraries).
  3. Run the Notebook:

    • Open the notebook in Jupyter Notebook or Google Colab.
    • Execute each cell sequentially to:
      • Fetch S&P 500 historical data.
      • Preprocess the data.
      • Train the LSTM model.
      • Evaluate the model with MAE and RMSE.
      • Visualize actual vs. predicted prices.
      • Predict the next trading day's closing price.
  4. Check Outputs:

    • The notebook will display:
      • Training and validation loss plot.
      • Actual vs. predicted prices plot.
      • MAE and RMSE values.
      • The predicted closing price for the next trading day (e.g., 5699.39 for the day after June 6, 2025).
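The final step, predicting the next trading day's close, amounts to feeding the most recent 30 scaled closes to the model and inverting the scaling on the result. The sketch below shows that flow under stated assumptions: `predict_next_close` and `naive_predict` are hypothetical names, `lo` and `hi` are the min and max used for scaling, and a trivial stand-in replaces the trained network so the block runs without TensorFlow.

```python
import numpy as np


def predict_next_close(closes, predict_fn, lo, hi, seq_len=30):
    """Scale the last seq_len closes, run the predictor, and map back to price units."""
    window = (np.asarray(closes[-seq_len:], dtype=float) - lo) / (hi - lo)
    batch = window.reshape(1, seq_len, 1)           # (batch, timesteps, features)
    scaled_pred = float(np.ravel(predict_fn(batch))[0])
    return scaled_pred * (hi - lo) + lo             # invert the min-max scaling


def naive_predict(batch):
    """Stand-in for a trained model: repeats the last value of each window."""
    return batch[:, -1, 0]


closes = np.arange(1.0, 61.0)                       # synthetic prices 1..60
print(predict_next_close(closes, naive_predict, lo=1.0, hi=60.0))
```

With a trained Keras model, `predict_fn` would typically be the model's `predict` method, which accepts the same `(1, 30, 1)` batch.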

Data

  • Source: Yahoo Finance (via yfinance library).
  • Time Range: From December 30, 1927, to June 6, 2025.
  • Features Used: Closing price (Close) for prediction, with Open, High, Low, and Volume available in the dataset.
  • Note: The notebook uses the closing price for prediction, scaled to the range [0, 1].

Model Details

  • Architecture:
    • Two LSTM layers with 32 units each.
    • Dropout layers (0.3) after each LSTM layer for regularization.
    • Two dense layers (16 units and 1 unit) for output.
  • Training:
    • Sequence length: 30 days.
    • Batch size: 64.
    • Epochs: 50 (with early stopping, patience=10).
    • Optimizer: Adam with a learning rate of 0.0005.
    • Loss function: Mean Squared Error (MSE).
  • Performance:
    • MAE: 53.37 (on the test set).
    • RMSE: 86.70 (on the test set).
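For orientation, the architecture and training settings listed above could be expressed in Keras roughly as follows. This is a sketch reconstructed from the README's description, not the notebook's actual code: the function name `build_model` is hypothetical, the ReLU activation on the 16-unit dense layer is an assumption (the README does not state one), and the training call is left commented out because it needs prepared data.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(seq_len=30):
    """Two 32-unit LSTM layers with 0.3 dropout, then Dense(16) and Dense(1)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(seq_len, 1)),        # 30 days, one feature (Close)
        layers.LSTM(32, return_sequences=True),    # pass the full sequence to the next LSTM
        layers.Dropout(0.3),
        layers.LSTM(32),
        layers.Dropout(0.3),
        layers.Dense(16, activation="relu"),       # activation is an assumption
        layers.Dense(1),                           # scaled next-day close
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
        loss="mse",
    )
    return model


# Stop when validation loss has not improved for 10 epochs.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10)
model = build_model()

# Training call matching the listed settings (X_train / y_train are placeholders):
# model.fit(X_train, y_train, epochs=50, batch_size=64,
#           validation_split=0.1, callbacks=[early_stop])
```

Note that `validation_split=0.1` in Keras takes the last 10% of the training arrays without shuffling them first, which suits time-ordered data.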

Results

  • Evaluation:
    • The model achieves an MAE of 53.37 and an RMSE of 86.70 on the test set, both in S&P 500 index points. MAE is the average absolute prediction error; RMSE penalizes large errors more heavily.
    • Predictions for recent dates (e.g., June 2–6, 2025) underestimate the actual closing prices by about 250–300 points, well above the test-set MAE, so the error is not uniform across the test period.
  • Future Prediction:
    • Predicted closing price for the next trading day (after June 6, 2025): 5699.39.
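To make the two metrics concrete, here is a small sketch of how MAE and RMSE are computed on prices in index points (that is, after inverting the scaling). The numbers are made up for illustration. The `mean_error` helper is an addition not mentioned in the README: a signed average like this is one way to expose a systematic under- or overestimate, which MAE and RMSE alone do not show.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(actual))))


def rmse(actual, predicted):
    """Root mean squared error; large misses weigh more than in MAE."""
    return float(np.sqrt(np.mean((np.asarray(predicted) - np.asarray(actual)) ** 2)))


def mean_error(actual, predicted):
    """Signed average error; negative means the model underestimates on average."""
    return float(np.mean(np.asarray(predicted) - np.asarray(actual)))


actual = [100.0, 102.0, 104.0]      # illustrative values, not notebook output
predicted = [98.0, 103.0, 108.0]
print(mae(actual, predicted), rmse(actual, predicted), mean_error(actual, predicted))
```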

Limitations

  • Rate Limiting: Yahoo Finance may impose rate limits on API requests, which can cause YFRateLimitError. Consider caching data locally or using an alternative data source like Alpha Vantage.
  • Model Performance: The model's predictions lag behind actual prices, suggesting potential improvements in feature engineering (e.g., adding technical indicators) or model architecture (e.g., using attention mechanisms).
  • Data Noise: Stock price data is noisy, and the model may struggle with sudden market shifts.
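One way to implement the local caching suggested for rate limits is to wrap the download in a helper that reads a CSV if it exists and only calls the network otherwise. The sketch below is an assumption about how this could be done, not the notebook's approach: `load_prices` is a hypothetical name, and `fetch` is any zero-argument callable that returns a DataFrame (for example, a small wrapper around a yfinance call).

```python
from pathlib import Path

import pandas as pd


def load_prices(cache_path, fetch):
    """Return cached prices if the CSV exists; otherwise fetch once and save them."""
    path = Path(cache_path)
    if path.exists():
        # Reuse the local copy and skip the network request entirely.
        return pd.read_csv(path, index_col=0, parse_dates=True)
    frame = fetch()          # only reached on the first run
    frame.to_csv(path)
    return frame
```

Deleting the CSV file forces a fresh download, which is a simple way to pick up new trading days.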

Future Improvements

  • Alternative Data Sources: Use Alpha Vantage or Financial Modeling Prep to avoid Yahoo Finance rate limits.
  • Feature Engineering: Incorporate additional features like Open, High, Low, Volume, or technical indicators (e.g., RSI, MACD).
  • Model Enhancements: Add attention mechanisms, use a GRU instead of LSTM, or experiment with different sequence lengths.
  • Hyperparameter Tuning: Perform grid search to optimize learning rate, batch size, and model architecture.
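As an illustration of the technical-indicator idea, MACD can be derived from the closing price alone with pandas. The sketch below uses the conventional 12/26/9 spans, which are an assumption here since the README names the indicator but not its parameters. The function name `macd` and the output column names are hypothetical.

```python
import pandas as pd


def macd(close, fast=12, slow=26, signal=9):
    """MACD line, signal line, and histogram from a Series of closing prices."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    line = ema_fast - ema_slow                           # fast EMA minus slow EMA
    signal_line = line.ewm(span=signal, adjust=False).mean()
    return pd.DataFrame(
        {"macd": line, "signal": signal_line, "hist": line - signal_line}
    )


closes = pd.Series(range(1, 61), dtype=float)            # synthetic rising prices
features = macd(closes)
print(features.tail(3))
```

The resulting columns would need the same scaling and windowing as the closing price before being added as extra input features.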

Troubleshooting

  • Rate Limit Errors:
    • Add caching to save data locally after the first fetch.
    • Switch to Alpha Vantage (requires an API key).
  • High Loss:
    • Increase regularization (e.g., dropout or L2).
    • Reduce model complexity (e.g., fewer LSTM units).
    • Add data smoothing (e.g., moving averages).
  • TensorFlow GPU Issues:
    • Ensure your GPU drivers and CUDA toolkit are correctly installed.
    • Fall back to CPU if GPU setup fails (handled automatically in the notebook).
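The data-smoothing suggestion under High Loss could look like the following sketch: a trailing moving average over the closing price, so each smoothed value uses only that day and earlier days. The window of 5 days and the name `smooth_close` are assumptions for illustration; the README does not specify either.

```python
import pandas as pd


def smooth_close(close, window=5):
    """Trailing moving average; drops the first window-1 rows that lack a full window."""
    return close.rolling(window=window).mean().dropna()


closes = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # synthetic prices
print(smooth_close(closes, window=3))
```

A trailing window is used rather than a centered one because a centered average would leak future prices into the training inputs.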

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute the code as needed.

Contact

For questions or suggestions, please open an issue in the repository.
