🎬 Netflix Content Type Classifier (EDA + ML) — by Umeir

This project explores and analyzes Netflix's titles dataset to uncover content trends through data cleaning, visual insights, and machine learning.
In the final stage, a model predicts whether a title is a Movie or TV Show based on specific features.


📂 Table of Contents

  1. Introduction
  2. Data Cleaning
  3. Exploratory Data Analysis (EDA)
  4. Insights and Visualizations
  5. Machine Learning - Predicting Movie vs TV Show
  6. Dataset
  7. Tools & Libraries
  8. How to Run
  9. Author

1. Introduction

Netflix offers a massive collection of titles ranging from TV Shows to Movies.
The goal of this project is to:

  • Understand the structure and trends in Netflix's content library
  • Build a predictive model that classifies a title as a Movie or TV Show based on its duration, rating, and release year.

2. Data Cleaning

  • Removed rows with null values and duplicate rows
  • Converted the duration column to a numeric feature (duration_num)
  • Encoded the categorical columns (type, rating) as numbers for modeling
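
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, the column names follow the Kaggle file (where duration is text such as "90 min" or "2 Seasons"), and type_encoded is a hypothetical name chosen here to match duration_num and rating_encoded.

```python
import pandas as pd

# Tiny synthetic sample standing in for the Kaggle CSV (values are made up).
raw = pd.DataFrame({
    "type": ["Movie", "TV Show", "Movie", "Movie", "TV Show"],
    "rating": ["PG-13", "TV-MA", "PG-13", None, "TV-14"],
    "release_year": [2019, 2020, 2019, 2015, 2018],
    "duration": ["90 min", "2 Seasons", "90 min", "120 min", "1 Season"],
})


def clean(df):
    """Drop nulls and duplicates, then derive the numeric model features."""
    out = df.dropna().drop_duplicates().copy()
    # Keep the leading number of "90 min" / "2 Seasons" as an integer.
    out["duration_num"] = (
        out["duration"].str.extract(r"(\d+)", expand=False).astype(int)
    )
    # Hypothetical target encoding: 0 = Movie, 1 = TV Show.
    out["type_encoded"] = (out["type"] == "TV Show").astype(int)
    # Integer codes assigned in sorted order of the rating labels.
    out["rating_encoded"] = out["rating"].astype("category").cat.codes
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned[["type", "duration_num", "rating_encoded", "type_encoded"]])
```

Note that duration_num mixes units (minutes for Movies, seasons for TV Shows), so the two classes occupy very different ranges of that feature.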

3. Exploratory Data Analysis (EDA)

  • Analyzed distribution of Movies vs TV Shows
  • Examined content release trends over time
  • Investigated duration patterns and rating types
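
As a rough sketch of the first two checks, the counts behind such plots can be computed with pandas alone. The data below is synthetic and the variable names are illustrative assumptions; the notebook may compute these differently.

```python
import pandas as pd

# Synthetic stand-in for the cleaned titles table (values are made up).
titles = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show"],
    "release_year": [2018, 2019, 2019, 2020, 2020],
})

# Distribution of Movies vs TV Shows.
type_counts = titles["type"].value_counts()

# Number of titles per release year, split by type.
per_year = pd.crosstab(titles["release_year"], titles["type"])

print(type_counts)
print(per_year)
```

Either table can be passed to a Matplotlib or Seaborn bar plot to get the corresponding chart.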

4. Insights and Visualizations

  • Plots that reveal patterns in ratings, content duration, and content type
  • Trends across regions and release years
  • Correlations between release year, content duration, and type
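
One way to inspect such correlations is a pairwise correlation matrix over the numeric columns. The sketch below uses made-up values and assumes a 0/1 column named type_encoded (0 = Movie, 1 = TV Show), which is a hypothetical name rather than one taken from the notebook.

```python
import pandas as pd

# Synthetic numeric features (values are made up for illustration).
sample = pd.DataFrame({
    "release_year": [2015, 2017, 2019, 2020, 2016, 2021],
    "duration_num": [90, 100, 2, 1, 120, 3],
    "type_encoded": [0, 0, 1, 1, 0, 1],
})

# Pearson correlation between every pair of columns.
corr = sample.corr()
print(corr.round(2))
```

The resulting matrix can be drawn as a heatmap, for example with seaborn.heatmap(corr, annot=True).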

5. Machine Learning - Predicting Movie vs TV Show

A logistic regression model was trained on the following features:

  • duration_num
  • rating_encoded
  • release_year

Steps:

  • Applied train/test split (80% training / 20% testing)
  • Trained a Logistic Regression classifier
  • Evaluated using Accuracy, Classification Report, and Confusion Matrix

Model Accuracy: 99.8%
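
The steps above could look roughly like the sketch below. It runs on synthetic data, so its score says nothing about the figure reported here. The StandardScaler step, the random seeds, and the type_encoded target name are assumptions of this sketch, not details taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: movies run tens of minutes, shows last a handful of seasons.
rng = np.random.default_rng(0)
n = 200
is_show = rng.integers(0, 2, size=n)  # 0 = Movie, 1 = TV Show
data = pd.DataFrame({
    "duration_num": np.where(
        is_show == 1,
        rng.integers(1, 10, size=n),    # seasons
        rng.integers(60, 180, size=n),  # minutes
    ),
    "rating_encoded": rng.integers(0, 6, size=n),
    "release_year": rng.integers(2000, 2022, size=n),
    "type_encoded": is_show,
})

X = data[["duration_num", "rating_encoded", "release_year"]]
y = data["type_encoded"]

# 80% training / 20% testing split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling is an addition of this sketch to keep the solver well conditioned.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])

print(f"Accuracy: {accuracy:.3f}")
print(classification_report(
    y_test, y_pred, labels=[0, 1], target_names=["Movie", "TV Show"]
))
print(matrix)
```

In the confusion matrix, rows are the true classes and columns the predicted ones, in the order Movie, TV Show.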


Dataset


Tools & Libraries

  • Python 3
  • Pandas, NumPy
  • Matplotlib, Seaborn
  • Scikit-learn
  • Jupyter Notebook

🛠️ How to Run

  1. Clone the repository:

     git clone https://github.com/mumairrr/Netflix_Content_Type_Classifier_Umeir.git
     cd Netflix_Content_Type_Classifier_Umeir

  2. Install required libraries:

     pip install -r requirements.txt

  3. Launch Jupyter Notebook:

     jupyter notebook

Then open: 📄 Netflix_Content_Type_Classifier_Umeir.ipynb

📊 Dataset

Netflix dataset from Kaggle - Netflix Shows

📄 Requirements

All dependencies are listed in requirements.txt.

--

🙋‍♂️ Author

Umeir Mohamed
Master’s Student in Data Science – Milano Bicocca University
LinkedIn | GitHub
