imTejasRajput/Yes_Bank_Closing_Price_Prediction

Introduction

This is Almabetter's capstone project on an ML regression problem. The project works with Yes Bank's stock price dataset. Yes Bank is a well-known bank in the Indian financial sector. Since 2018, it has been in the news because of the fraud case involving Rana Kapoor. The dataset contains the bank's monthly stock prices since its inception, including the opening, closing, highest and lowest price for each month. The main objective is to predict the stock's monthly closing price.

Objective

The objective of this project is to predict the stock's closing price for each month.

To Do List

To achieve the objective of the project, we need to perform exploratory data analysis, hypothesis testing, data manipulation and feature engineering, data preprocessing, model implementation and the other steps listed below.

1. Import Libraries

Main libraries to be used:

  • Pandas for data manipulation and aggregation.
  • Matplotlib and Seaborn for visualization.
  • NumPy for computationally efficient operations.
  • scikit-learn for model training, model optimization and metrics calculation.

2. Know Your Data

  • Load Dataset
  • Dataset first look
  • Rows & Columns count
  • Check for duplicate and null values
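The first-look checks listed above can be sketched as follows. This is a minimal illustration, not the project's notebook code: the column names (`Date`, `Open`, `High`, `Low`, `Close`) and all values are assumptions made up for the example, and with the real file the inline frame would be replaced by a `pd.read_csv(...)` call.

```python
import pandas as pd

# Hypothetical sample standing in for the real dataset; column names and
# values are assumptions. With the actual file, use pd.read_csv(...) instead.
df = pd.DataFrame(
    {
        "Date": ["Jul-05", "Aug-05", "Aug-05", "Sep-05"],
        "Open": [10.0, 12.0, 12.0, 13.0],
        "High": [14.0, 15.0, 15.0, 16.0],
        "Low": [9.0, 11.0, 11.0, 12.0],
        "Close": [12.0, 13.0, 13.0, None],
    }
)

print(df.head())                           # dataset first look
n_rows, n_cols = df.shape                  # rows and columns count
n_duplicates = int(df.duplicated().sum())  # fully duplicated rows
null_counts = df.isnull().sum()            # missing values per column

print(f"{n_rows} rows, {n_cols} columns, {n_duplicates} duplicate row(s)")
print(null_counts)
```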

3. Understanding Your Variables

  • Variables description
  • Check Unique Values for each variable

4. Data Cleaning

Data cleaning means fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. A dataset can contain missing data, numbers stored as strings, and other irregularities. Cleaning these up makes the analysis easier.
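A minimal sketch of such a cleaning pass is shown below. The sample frame, its column names, and the month-year date format (`%b-%y`) are assumptions for illustration; the real dataset may need different steps.

```python
import pandas as pd

# Hypothetical raw sample; column names and the "Mon-YY" date format are assumed.
raw = pd.DataFrame(
    {
        "Date": ["Jan-20", "Feb-20", "Feb-20", "Mar-20"],
        "Open": ["10.0", "11.5", "11.5", "n/a"],
        "Close": ["11.0", "12.0", "12.0", "9.5"],
    }
)

# Remove exact duplicate rows.
clean = raw.drop_duplicates().copy()

# Parse month-year strings into timestamps (first day of each month).
clean["Date"] = pd.to_datetime(clean["Date"], format="%b-%y")

# Convert numeric strings to floats; unparseable entries become NaN.
for col in ["Open", "Close"]:
    clean[col] = pd.to_numeric(clean[col], errors="coerce")

# Drop rows that are still incomplete and restore chronological order.
clean = clean.dropna().sort_values("Date").reset_index(drop=True)
print(clean)
```

Dropping incomplete rows is only one option; imputing them is another, and which is appropriate depends on how much data is missing.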

5. Exploratory Data Analysis

  • Data wrangling
  • Visualization
  • Storytelling and experimenting with charts
  • Understand the relationships between variables

6. Hypothesis Testing

Based on our EDA, we will define three hypothetical statements about the dataset and perform hypothesis testing to reach a final conclusion about each statement.
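As an illustration of one such test, the sketch below runs a paired t-test with `scipy.stats.ttest_rel`. The statement being tested (that the mean monthly closing price does not differ from the mean opening price), the synthetic data, and the 0.05 significance level are all assumptions for the example, not the project's actual hypotheses.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic monthly prices, made up for illustration only.
open_price = rng.uniform(50, 150, size=60)
close_price = open_price + rng.normal(loc=0.0, scale=5.0, size=60)

# H0: the mean difference between closing and opening price is zero.
# H1: the mean difference is not zero (two-sided test).
t_stat, p_value = stats.ttest_rel(close_price, open_price)

alpha = 0.05  # assumed significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.3f} -> {decision}")
```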

7. Feature Engineering & Data Pre-processing

  • Handling missing values
  • Handling Outliers
  • Categorical encoding
  • Textual data preprocessing
  • Feature manipulation and selection
  • Data transformation
  • Data scaling
  • Dimensionality reduction
  • Data splitting
  • Handling imbalanced data
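Three of the steps above (data transformation, data splitting and data scaling) can be sketched together. The data here is synthetic, and the specific choices (a `log1p` transform, a chronological 80/20 split, `StandardScaler`) are assumptions rather than the project's confirmed settings; a random split via `train_test_split` is the other common option.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic, strictly positive prices in chronological order (illustration only).
n_months = 100
X = rng.lognormal(mean=4.0, sigma=0.5, size=(n_months, 3))  # stand-ins for Open, High, Low
y = X.mean(axis=1) * rng.lognormal(mean=0.0, sigma=0.05, size=n_months)  # stand-in for Close

# Data transformation: log1p compresses large values and reduces right skew.
X_log, y_log = np.log1p(X), np.log1p(y)

# Data splitting: keep the time order, so the last 20% of months form the test set.
split = int(n_months * 0.8)
X_train, X_test = X_log[:split], X_log[split:]
y_train, y_test = y_log[:split], y_log[split:]

# Data scaling: fit on the training part only to avoid leaking test statistics.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Predictions made on the log scale can be mapped back to prices with `np.expm1`.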

8. ML Model Implementation

  • Implementation
  • Explain the ML model used and its performance using an evaluation metric score chart.
  • Cross-Validation & Hyperparameter Tuning

ML models we are going to use:

  • Linear regression
  • Ridge, Lasso and ElasticNet for Regularization
  • Random Forest Regressor
  • XGBoost Regressor
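A minimal sketch of fitting and comparing these models with scikit-learn follows. The data is synthetic and every hyperparameter value is an assumption chosen for illustration. XGBoost is left out to keep the example limited to scikit-learn; `xgboost.XGBRegressor` follows the same `fit`/`predict` pattern and could be added to the dictionary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, 1.5, -1.0]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameter values below are assumptions, not tuned settings.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.01),
    "elasticnet": ElasticNet(alpha=0.01, l1_ratio=0.5),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and collect test-set metrics for a score chart.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    scores[name] = {"rmse": rmse, "r2": float(r2_score(y_test, pred))}

# Cross-validated hyperparameter tuning, shown for Ridge only.
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)
best_alpha = search.best_params_["alpha"]

for name, metrics in scores.items():
    print(f"{name}: RMSE={metrics['rmse']:.3f}, R2={metrics['r2']:.3f}")
print("best Ridge alpha:", best_alpha)
```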

9. Explain the model using the SHAP model explainability tool.

SHAP (SHapley Additive exPlanations) is a model explainability tool for explaining the predictions of machine learning models. It is based on concepts from game theory and can be applied to any machine learning model by calculating the contribution of each feature to a prediction.
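To make the idea of a per-feature contribution concrete, the sketch below computes exact Shapley values by brute force for a small linear model. It does not use the `shap` library; it is a hand-written illustration of the underlying quantity. It assumes that "removing" a feature means replacing it with values from a background sample and averaging the predictions. The helper names `coalition_value` and `shapley_values` and the synthetic data are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data and a simple model to explain (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)
model = LinearRegression().fit(X, y)


def coalition_value(model, x, background, subset):
    """Mean prediction when only the features in `subset` are fixed to x."""
    data = background.copy()
    cols = list(subset)
    if cols:
        data[:, cols] = x[cols]
    return float(model.predict(data).mean())


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating all coalitions (small n only)."""
    n = x.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = coalition_value(model, x, background, subset + (i,))
                without_i = coalition_value(model, x, background, subset)
                phi[i] += weight * (with_i - without_i)
    return phi


x = X[0]
phi = shapley_values(model, x, X)
baseline = float(model.predict(X).mean())
prediction = float(model.predict(x.reshape(1, -1))[0])

# The contributions add up to the gap between this prediction and the average.
print("contributions:", phi)
print("sum:", phi.sum(), "prediction - baseline:", prediction - baseline)
```

The enumeration grows exponentially with the number of features, which is why practical tools rely on approximations or model-specific shortcuts for larger models.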

About

This is Almabetter's capstone project on an ML regression problem. Year 2024
