avinashkranjan
diff --git a/Movie recommendation system/Dataset/tmdb_5000_credits.csv b/Movie recommendation system/Dataset/tmdb_5000_credits.csv
Lines changed: 4804 additions & 0 deletions
diff --git a/Movie recommendation system/Dataset/tmdb_5000_movies.csv b/Movie recommendation system/Dataset/tmdb_5000_movies.csv
Lines changed: 4804 additions & 0 deletions
diff --git a/Movie recommendation system/README.md b/Movie recommendation system/README.md
Lines changed: 67 additions & 0 deletions
diff --git a/Movie recommendation system/app.py b/Movie recommendation system/app.py
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,67 @@
+# Movie Recommender System Project
+A content-based movie recommender system that uses cosine similarity on the TMDB dataset from Kaggle.
+
+Kaggle link: https://www.kaggle.com/datasets/tmdb/tmdb-movie-metadata?select=tmdb_5000_movies.csv
+
+This project recommends movies similar to a given input movie using cosine similarity (`cosine_similarity`). The similarity scores are computed from features of the TMDB movie dataset such as `movie_id`, `title`, `overview`, `genres`, `keywords`, `cast`, and `crew`.
+
+## Dataset
+
+The dataset used for this project is the TMDB 5000 movie dataset: a collection of movies, each with its associated features and the corresponding cast and crew. The dataset is preprocessed to handle missing values, categorical variables, and feature scaling, so that the data is suitable for building the recommendation model.
+
+Dataset files: in the `Dataset` folder (`tmdb_5000_movies.csv` and `tmdb_5000_credits.csv`)
+## Algorithm
+
+This project is based on content-based filtering, which uses item features to recommend other items similar to what the user likes, based on their previous actions or explicit feedback.
+
+Cosine Similarity: It measures the similarity between two non-zero vectors as the cosine of the angle between them. Here each vector is a one-dimensional NumPy array. Cosine similarity is often used to measure document similarity in text analysis.
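A minimal sketch of the cosine similarity computation described above, written in plain Python for illustration. The helper name `cosine_sim` and the sample vectors are assumptions made for this example; the project itself refers to scikit-learn's `cosine_similarity`, which is not used here.

```python
import math


def cosine_sim(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical word-count vectors for three movies over the same vocabulary.
movie_a = [1, 1, 0, 2]
movie_b = [1, 1, 0, 1]
movie_c = [0, 0, 3, 0]

print(cosine_sim(movie_a, movie_b))  # close to 1: the movies share most words
print(cosine_sim(movie_a, movie_c))  # 0.0: the movies share no words
```

Because the score depends only on the angle between the vectors, two descriptions with the same word proportions get a score of 1 regardless of their length.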
+
+ast: The Abstract Syntax Trees module from the Python standard library. Its `ast.literal_eval()` function safely evaluates a string containing a Python literal (such as a list or a dict) without executing arbitrary code. It can be used to parse dataset columns such as `genres`, which are stored as strings of list-like data.
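A short sketch of how `ast.literal_eval()` can turn such a string into Python objects. The sample value and the helper name `extract_names` are assumptions for illustration, shaped like a `genres` cell; they are not taken from the project code.

```python
import ast

# Hypothetical cell value shaped like the `genres` column of the TMDB CSV.
raw_genres = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'


def extract_names(cell):
    """Parse a list-of-dicts string and return the 'name' of each entry."""
    return [entry["name"] for entry in ast.literal_eval(cell)]


print(extract_names(raw_genres))  # ['Action', 'Adventure']
```

Unlike `eval()`, `ast.literal_eval()` accepts only literals, so a malformed or malicious cell raises an exception instead of running code.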
+
+PorterStemmer: A stemmer (available in NLTK) that reduces words to their stem, for example "runs" and "running" to "run". Stemming is desirable because it reduces redundancy: most of the time a word stem and its inflected or derived forms mean the same.
+
+## Dependencies
+
+The following dependencies are required to run the project:
+
+- streamlit==1.24.1
+- scikit-learn==1.2.1
+- pandas==1.5.3
+- numpy==1.25.1
+- requests==2.31.0
+
+
+To install the required dependencies, you can use the following command:
+
+```shell
+pip install streamlit scikit-learn pandas numpy requests
+```
+
+## Usage
+Clone the repository:
+```shell
+git clone https://github.com/your-username/Movie-Recommender-System.git
+```
+Navigate to the project directory:
+```shell
+cd Movie-Recommender-System
+```
+Install the dependencies:
+```shell
+pip install -r requirements.txt
+```
+Run the Streamlit app:
+```shell
+streamlit run app.py
+```
+
+Open your browser and go to http://localhost:8501/ to access the movie recommendation system app.
+
+Or you can use the deployed project using the link: https://movie-recommender-system-ml-ca6n1lthfcd-kanishka.streamlit.app/
+
+## Disclaimer
+The recommendations provided by this project are based on an ML model and may not always reflect real-world preferences. They should be used for reference purposes only, and the TMDB dataset from Kaggle may change over time.
+
+## Author
+
+https://github.com/kanishkasah20
@@ -0,0 +1,63 @@
+import pickle
+import streamlit as st
+import requests
+import pandas as pd
+
+
+
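+def fetch_poster(movie_id):
+    """Return the TMDB poster URL for `movie_id`, or None if the movie has no poster."""
+    # NOTE: the API key is hard-coded; loading it from configuration would be safer.
+    url = "https://api.themoviedb.org/3/movie/{}?api_key=8265bd1679663a7ea12ac168da84d2e8&language=en-US".format(movie_id)
+    response = requests.get(url, timeout=10)
+    response.raise_for_status()  # fail loudly on HTTP errors
+    poster_path = response.json().get('poster_path')
+    if not poster_path:
+        return None
+    return "https://image.tmdb.org/t/p/w500/" + poster_path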
+
+def recommend(movie):
+    """Return the titles and poster URLs of the five movies most similar to `movie`."""
+    index = movies[movies['title'] == movie].index[0]
+    # Rank every movie by its similarity to the selected one, highest first.
+    distances = sorted(enumerate(similarity[index]), reverse=True, key=lambda x: x[1])
+    recommended_movie_names = []
+    recommended_movie_posters = []
+    # Skip position 0, which is normally the selected movie itself.
+    for i, _score in distances[1:6]:
+        movie_id = movies.iloc[i].movie_id
+        recommended_movie_posters.append(fetch_poster(movie_id))
+        recommended_movie_names.append(movies.iloc[i].title)
+
+    return recommended_movie_names, recommended_movie_posters
+
+
+st.header('TMDB Movie Recommender System')
+# Load the precomputed movie table and similarity matrix.
+with open('movie_list.pkl', 'rb') as f:
+    movies = pd.DataFrame(pickle.load(f))
+with open('similarity.pkl', 'rb') as f:
+    similarity = pickle.load(f)
+
+movie_list = movies['title'].values
+selected_movie = st.selectbox(
+    "Type or select a movie from the dropdown",
+    movie_list
+)
+
+if st.button('Show Recommendation'):
+    recommended_movie_names, recommended_movie_posters = recommend(selected_movie)
+    # Show the five recommendations side by side, one per column.
+    columns = st.columns(5)
+    for col, name, poster in zip(columns, recommended_movie_names, recommended_movie_posters):
+        with col:
+            st.text(name)
+            if poster:  # skip the image when no poster URL is available
+                st.image(poster)
+
+
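The ranking step in `recommend` can be tried without Streamlit or the pickle files. The sketch below uses plain lists; the helper name `top_matches`, the titles, and the similarity values are all made up for illustration. One deliberate difference from `app.py`: the selected movie is excluded by index rather than by assuming it sorts into position 0.

```python
def top_matches(similarity_row, own_index, n=5):
    """Return the indices of the n highest scores, excluding own_index."""
    ranked = sorted(enumerate(similarity_row), key=lambda pair: pair[1], reverse=True)
    return [i for i, _score in ranked if i != own_index][:n]


# Hypothetical titles and a symmetric similarity matrix (1.0 on the diagonal).
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
similarity = [
    [1.0, 0.2, 0.9, 0.5],
    [0.2, 1.0, 0.1, 0.7],
    [0.9, 0.1, 1.0, 0.3],
    [0.5, 0.7, 0.3, 1.0],
]

selected = titles.index("Movie A")
recommended = [titles[i] for i in top_matches(similarity[selected], selected, n=2)]
print(recommended)  # ['Movie C', 'Movie D']
```

Excluding by index keeps the result correct even when another movie ties with the selected one at a score of 1.0.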