This repository contains the code for a stock sentiment analysis project. It uses Natural Language Processing (NLP) techniques and Machine Learning (ML) models to predict the sentiment of stock market news headlines. The prediction can be either positive (stock price will increase) or negative (stock price will decrease).
The goal of this project is to predict whether the stock market will increase or decrease based on news headlines using machine learning algorithms.
The dataset used in this project is the Daily News for Stock Market Prediction dataset from Kaggle.
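As an illustration of the kind of text preprocessing an NLP pipeline on news headlines usually starts with, here is a minimal sketch. The function name `clean_headline` and the exact cleaning rules are assumptions made for this example, not taken from the project code:

```python
import re

# Matches any run of characters that are not ASCII letters.
_NON_LETTERS = re.compile(r"[^a-zA-Z]+")


def clean_headline(text: str) -> str:
    """Replace runs of non-letter characters with one space, then lowercase.

    Hypothetical helper: the project's real cleaning rules may differ.
    """
    return _NON_LETTERS.sub(" ", text).strip().lower()
```

Digits and punctuation are dropped here on the assumption that the model works on word tokens only. If numbers such as percentages carry signal for your data, a gentler rule would be more appropriate.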
- Clone this repository:
git clone https://github.com/theamiteshtripathi/Stock-Sentiment-Analysis.git
- Change into the project directory:
cd Stock-Sentiment-Analysis
- Install the required dependencies:
conda env create -f environment.yml
The main script is main.py and it takes several command line arguments:
- --data: The path to the data file.
- --headline: A headline to predict the sentiment of.
- --visualize: Whether to visualize the results.
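For readers who want to see how such a command line could be wired up, the following is a minimal sketch using `argparse` from the standard library. It is not the project's actual parser: the function names `build_parser` and `parse_args` are hypothetical, and treating --visualize as a boolean switch is an assumption. The check at the end reflects the documented rule that --visualize needs either --data or --headline.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three documented flags (layout is an assumption)."""
    parser = argparse.ArgumentParser(
        description="Stock sentiment analysis from news headlines."
    )
    parser.add_argument("--data", help="Path to the data file.")
    parser.add_argument("--headline", help="A headline to predict the sentiment of.")
    # Assumed to be a boolean switch; the README does not say it takes a value.
    parser.add_argument(
        "--visualize",
        action="store_true",
        help="Whether to visualize the results.",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and enforce the documented --visualize requirement."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # --visualize is documented as requiring --data or --headline.
    if args.visualize and not (args.data or args.headline):
        parser.error("--visualize requires either --data or --headline")
    return args
```

Calling `parse_args(["--headline", "Your headline here", "--visualize"])` returns a namespace with `headline` set and `visualize` true, while `parse_args(["--visualize"])` exits with a usage error.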
To train a new model with your data, use the --data argument:
python3 main.py --data "/path/to/your/data.csv"
To make a prediction on a specific headline, use the --headline argument. The model will be loaded from models/model.joblib. If this file doesn't exist, the script will try to train a new model, which requires the --data argument:
python3 main.py --headline "Your headline here"
To visualize the results of the training or prediction, use the --visualize argument. This argument requires either the --data or --headline argument:
python3 main.py --data "/path/to/your/data.csv" --visualize
python3 main.py --headline "Your headline here" --visualize
The project is organized as follows:
stockSentimentAnalysis/
|--- environment.yml
|--- README.md
|--- run.py
|--- src/
| |--- __init__.py
| |--- models/
| | |--- __init__.py
| | |--- train.py
| | |--- evaluate.py
| | |--- predict.py
| |--- preprocessing/
| | |--- __init__.py
| | |--- clean_text.py
| |--- visualization/
| | |--- __init__.py
| | |--- visualize.py
|--- data/
|--- models/
|--- config/
| |--- __init__.py
| |--- configuration.py
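The usage notes say that a prediction loads the model from models/model.joblib and falls back to training when that file is missing, which in turn needs --data. A minimal sketch of that fallback is shown here. The function name `load_or_train` and its callable parameters are hypothetical, and saving the freshly trained model back to disk is an assumption the README does not confirm:

```python
from pathlib import Path
from typing import Any, Callable, Optional


def load_or_train(
    model_path: Path,
    data_path: Optional[str],
    load: Callable[[Path], Any],
    train: Callable[[str], Any],
    save: Callable[[Any, Path], None],
) -> Any:
    """Return the saved model if it exists, otherwise train a new one.

    The load, train and save callables are injected so this sketch does not
    depend on any particular ML library.
    """
    if model_path.exists():
        return load(model_path)
    if data_path is None:
        # No saved model and nothing to train on.
        raise ValueError(
            f"{model_path} not found; pass --data to train a new model"
        )
    model = train(data_path)
    # Assumption: the new model is persisted so later runs can reuse it.
    model_path.parent.mkdir(parents=True, exist_ok=True)
    save(model, model_path)
    return model
```

With joblib installed, `joblib.load` and `joblib.dump` fit the `load` and `save` parameters directly, and `train` would be whatever function fits the classifier on the data file.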
The final model achieved an accuracy of 82% on the test data.
This project is licensed under the terms of the MIT license.
Contributions are welcome! Please make a pull request in this repository.
For any queries, please feel free to reach out to me at [email protected].
