Commit e0ebea9

Merge pull request #2834 from kanishkasah20/issue-2306
added movie recommender system #2306
2 parents e1430cd + dd5ca16 commit e0ebea9

6 files changed, +12164 -0 lines changed

Movie recommendation system/Dataset/tmdb_5000_credits.csv

Lines changed: 4804 additions & 0 deletions
Large diffs are not rendered by default.

Movie recommendation system/Dataset/tmdb_5000_movies.csv

Lines changed: 4804 additions & 0 deletions
Large diffs are not rendered by default.

Movie recommendation system/README.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Movie Recommender System Project
A content-based movie recommender system that uses cosine similarity on the TMDB dataset from Kaggle.

Kaggle link: https://www.kaggle.com/datasets/tmdb/tmdb-movie-metadata?select=tmdb_5000_movies.csv

This project recommends movies similar to a given input movie using machine learning techniques, specifically cosine_similarity. The model is built on the TMDB movie dataset, which contains features such as 'movie_id', 'title', 'overview', 'genres', 'keywords', 'cast', and 'crew', among others.

## Dataset

The dataset used for this project is a collection of about 5,000 movies from TMDB, each with associated features and the corresponding cast and crew. The dataset is preprocessed to handle missing values, categorical variables, and feature scaling, so that the data is suitable for building the recommendation model.

Dataset location: the `Dataset` folder (`tmdb_5000_movies.csv` and `tmdb_5000_credits.csv`)
## Algorithm

Content-based filtering: It uses item features to recommend other items similar to what the user likes, based on their previous actions or explicit feedback.

Cosine Similarity: It measures the similarity between two non-zero vectors as the cosine of the angle between them. Here a vector is a one-dimensional NumPy array. Cosine similarity is often used to measure document similarity in text analysis.

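As a minimal sketch of the cosine similarity described above, the following uses only the Python standard library. The function name `cosine_sim` and the word-count vectors are illustrative assumptions, not code from this project; scikit-learn's `cosine_similarity` computes the same quantity for whole matrices.

```python
import math


def cosine_sim(u, v):
    """Return the cosine of the angle between two non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Hypothetical word-count vectors over the vocabulary
# ["action", "space", "romance"].
movie_a = [2, 1, 0]
movie_b = [4, 2, 0]  # same direction as movie_a, twice the counts
movie_c = [0, 0, 3]  # shares no words with movie_a

sim_ab = cosine_sim(movie_a, movie_b)  # close to 1: same direction
sim_ac = cosine_sim(movie_a, movie_c)  # 0: no overlap
```

Because only the direction of the vectors matters, a long and a short description that use the same words in the same proportions are treated as near-identical.
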
ast: Python's Abstract Syntax Trees module. It provides the ast.literal_eval() function, which safely evaluates a string containing a Python literal (such as a list of dictionaries) without executing arbitrary code. The abstract syntax itself may change with each Python release. Custom parsers are one use case of ast.literal_eval().

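The sketch below shows how `ast.literal_eval()` can turn a stringified list of dictionaries into a list of names. The helper name `extract_names` and the sample string are assumptions for illustration, modelled on the style of the TMDB `genres` column; the project's actual preprocessing code is not shown in this README.

```python
import ast


def extract_names(raw):
    """Parse a stringified list of dicts and return the 'name' values."""
    return [item["name"] for item in ast.literal_eval(raw)]


# Illustrative sample in the style of the TMDB 'genres' column.
sample_genres = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'
genre_names = extract_names(sample_genres)
print(genre_names)  # ['Action', 'Adventure']
```

Unlike `eval()`, `ast.literal_eval()` only accepts literals, so a malformed or malicious cell raises an exception instead of running code.
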
PorterStemmer: A stemmer that reduces words to their stems. It is used, among other things, to determine domain vocabularies in domain analysis. Stemming is desirable because it reduces redundancy: most of the time a word stem and its inflected or derived forms mean the same thing.

## Dependencies

The following dependencies are required to run the project:

- streamlit==1.24.1
- scikit-learn==1.2.1
- pandas==1.5.3
- numpy==1.25.1
- requests==2.31.0


To install the required dependencies, you can use the following command:

```shell
pip install streamlit scikit-learn pandas numpy requests
```

## Usage
Clone the repository:
```shell
git clone https://github.com/your-username/Movie-Recommender-System.git
```
Navigate to the project directory:
```shell
cd Movie-Recommender-System
```
Install the dependencies:
```shell
pip install -r requirements.txt
```
Run the Streamlit app:
```shell
streamlit run app.py
```

Open your browser and go to http://localhost:8501/ to access the movie recommendation system app.

Alternatively, you can use the deployed project at: https://movie-recommender-system-ml-ca6n1lthfcd-kanishka.streamlit.app/

## Disclaimer
The recommendations provided by this project are based on an ML model and may not always accurately reflect real-world movie scenarios. They should be used for reference purposes only, and the TMDB dataset from Kaggle can vary due to various factors.

## Author

https://github.com/kanishkasah20

Movie recommendation system/app.py

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
import pickle
import streamlit as st
import requests
import pandas as pd


def fetch_poster(movie_id):
    # Look up the movie on the TMDB API and build the full poster URL.
    url = "https://api.themoviedb.org/3/movie/{}?api_key=8265bd1679663a7ea12ac168da84d2e8&language=en-US".format(movie_id)
    data = requests.get(url)
    data = data.json()
    poster_path = data['poster_path']
    full_path = "https://image.tmdb.org/t/p/w500/" + poster_path
    return full_path


def recommend(movie):
    # Row of the selected movie in the precomputed similarity matrix.
    index = movies[movies['title'] == movie].index[0]
    # Sort all movies by similarity to the selected one, most similar first.
    distances = sorted(list(enumerate(similarity[index])), reverse=True, key=lambda x: x[1])
    recommended_movie_names = []
    recommended_movie_posters = []
    # Skip position 0 (expected to be the selected movie itself) and take the next five.
    for i in distances[1:6]:
        # fetch the movie poster
        movie_id = movies.iloc[i[0]].movie_id
        recommended_movie_posters.append(fetch_poster(movie_id))
        recommended_movie_names.append(movies.iloc[i[0]].title)

    return recommended_movie_names, recommended_movie_posters


st.header('TMDB Movie Recommender System')
# Load the precomputed movie list and similarity matrix.
with open('movie_list.pkl', 'rb') as f:
    movies = pd.DataFrame(pickle.load(f))
with open('similarity.pkl', 'rb') as f:
    similarity = pickle.load(f)

movie_list = movies['title'].values
selected_movie = st.selectbox(
    "Type or select a movie from the dropdown",
    movie_list
)

if st.button('Show Recommendation'):
    recommended_movie_names, recommended_movie_posters = recommend(selected_movie)
    # Show the five recommendations side by side, one per column.
    columns = st.columns(5)
    for col, name, poster in zip(columns, recommended_movie_names, recommended_movie_posters):
        with col:
            st.text(name)
            st.image(poster)
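
The ranking step in `recommend` sorts an entire similarity row and then skips the first entry, which assumes the selected movie itself sorts first. A possible alternative, sketched here with a hypothetical helper `top_similar` and made-up scores, excludes the query index explicitly and keeps only the top `k` entries using `heapq.nlargest`. This is an illustration under those assumptions, not the code used in `app.py`.

```python
import heapq


def top_similar(similarity_row, query_index, k=5):
    """Return indices of the k most similar items, excluding the query."""
    candidates = (
        (score, idx)
        for idx, score in enumerate(similarity_row)
        if idx != query_index
    )
    # nlargest compares the (score, idx) tuples, so the score decides first.
    best = heapq.nlargest(k, candidates)
    return [idx for _score, idx in best]


# Made-up similarity scores of item 0 against items 0..4.
row = [1.0, 0.2, 0.9, 0.4, 0.7]
top = top_similar(row, query_index=0, k=3)
print(top)  # [2, 4, 3]
```

Excluding the query by index avoids returning the selected movie when another movie has an identical similarity score.
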
