This project implements a Movie Recommender System using Python. It processes datasets containing movie details and metadata to generate recommendations based on genres, keywords, cast, and directors.
- Movie_Recommender_System.ipynb: The main Jupyter notebook containing the code for the Movie Recommender System.
- data/tmdb_5000_movies.csv: A dataset containing information about movies, including genres, keywords, revenue, and release dates.
- data/tmdb_5000_credits.csv: A dataset containing information about the cast and crew for each movie.
Make sure you have the following Python libraries installed:
pip install numpy pandas matplotlib
- Data Loading: Loads movie and credit datasets.
- Data Cleaning: Removes null values and drops unnecessary columns.
- Data Merging: Merges movie and credit data based on the movie title.
- Feature Extraction: Extracts relevant features like genres, keywords, cast, and directors.
- Data Transformation: Transforms JSON-like columns into lists for further use.
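The feature list mentions cast, so here is a minimal sketch of how a JSON-like `cast` cell could be turned into a short list of names. The helper name `top_cast`, the `limit` of three, and the toy input string are assumptions made for illustration; they are not taken from the notebook.

```python
import ast

# Toy stand-in for one cell of the `cast` column (illustrative values only):
# a list of dicts serialized as a string.
raw_cast = (
    "[{'name': 'Sam Worthington', 'order': 0}, "
    "{'name': 'Zoe Saldana', 'order': 1}, "
    "{'name': 'Sigourney Weaver', 'order': 2}, "
    "{'name': 'Stephen Lang', 'order': 3}]"
)


def top_cast(text, limit=3):
    """Return the first `limit` names from a JSON-like list of dicts.

    Hypothetical helper; the cut-off of three is an assumption.
    """
    entries = ast.literal_eval(text)  # parse the string into Python objects
    return [entry["name"] for entry in entries[:limit]]


print(top_cast(raw_cast))
# ['Sam Worthington', 'Zoe Saldana', 'Sigourney Weaver']
```

Applied to a DataFrame, the same function could be passed to `apply` on the `cast` column, in the same way the genre and director helpers are used in the snippets further down.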
- Clone the repository and navigate to the project directory:
git clone https://github.com/azadsingh3/Movie-Recommender-System.git
cd Movie_Recommender_System
- Place the datasets (tmdb_5000_movies.csv and tmdb_5000_credits.csv) inside the data folder.
- Run the Jupyter notebook:
jupyter notebook Movie_Recommender_System.ipynb
- Run the App:
- Open a Command Prompt window and activate the virtual environment:
env\Scripts\activate
- Make sure you are in the Movie_Recommander_System directory:
cd Movie_Recommander_System
- Run the Streamlit app:
streamlit run app.py
The application will be available at http://localhost:8501
- Data Importing
import ast
import pandas as pd
movies = pd.read_csv('data/tmdb_5000_movies.csv')
credits = pd.read_csv('data/tmdb_5000_credits.csv')
- Merging Datasets
# Join the credits onto the movie rows using the shared title column
movies = movies.merge(credits, on='title')
- Data Cleaning
# Drop every row that has a null value in any column
movies.dropna(inplace=True)
- Feature Extraction
def convert(text):
    # Parse the JSON-like string and keep only each entry's name
    return [i['name'] for i in ast.literal_eval(text)]
movies['genres'] = movies['genres'].apply(convert)
- Director Extraction
def fetch_director(text):
    # Keep only the crew members whose job is Director
    return [i['name'] for i in ast.literal_eval(text) if i['job'] == 'Director']
movies['crew'] = movies['crew'].apply(fetch_director)
Here is an example of the processed movie data:
| movie_id | title | genres | cast | crew |
|---|---|---|---|---|
| 19995 | Avatar | [Action, Adventure, ...] | [Sam Worthington, ...] | [James Cameron] |
| 285 | Pirates of the Caribbean | [Adventure, Fantasy, ...] | [Johnny Depp, ...] | [Gore Verbinski] |
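This README does not state how the processed rows are compared, so the following is only a sketch of one possible approach: treat each movie's genres, cast, and director as a set of tags and rank other movies by Jaccard overlap. The `jaccard` and `recommend` helpers, the `top_n` parameter, and the toy catalog (including the extra Titanic row) are assumptions for illustration, not the notebook's method.

```python
# Toy rows shaped like the processed table; values are illustrative only.
movies = {
    "Avatar": {"Action", "Adventure", "James Cameron", "Sam Worthington"},
    "Pirates of the Caribbean": {"Adventure", "Fantasy", "Gore Verbinski", "Johnny Depp"},
    "Titanic": {"Drama", "Romance", "James Cameron"},  # hypothetical extra row
}


def jaccard(a, b):
    """Fraction of tags shared by two tag sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(title, catalog, top_n=2):
    """Rank every other title by tag overlap with `title` (highest first)."""
    target = catalog[title]
    scores = [
        (other, jaccard(target, tags))
        for other, tags in catalog.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:top_n]]


print(recommend("Avatar", movies))
# ['Titanic', 'Pirates of the Caribbean']
```

In this toy catalog Titanic ranks first because it shares a director with Avatar across a smaller combined tag set (1/6), while Pirates of the Caribbean shares only one genre across a larger one (1/7).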
Feel free to fork the project, submit issues, and send pull requests!
This project is licensed under the MIT License.
- TMDb for providing the movie datasets.
- Pandas and Matplotlib for data manipulation and visualization.