Online News Popularity

Implementation of two research papers to find the best algorithm that predicts the popularity of a news article. Done for the course of "Data Mining"

Dataset - "Online News Popularity Data Set"

This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the popularity in social networks.

Research Paper 1 - "A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News"

try.ipynb

In this file an effort to replicate the paper "A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News" has been made.

Research Paper 2 - "Predicting the Popularity of Online News from Content Metadata"

preprocessing.ipynb

Data preprocessing is done in this file.

Generates two data files -

Logarithmic tranformation on the continuous attributes - data.csv
Normalization on the continuous attributes - datan.csv
Target attribute - target.csv

gbm.ipynb

Gradient Boosting Machine and Random Forest algorithms have been applied on the preprocessed file (data.csv and datan.csv). The resultant accuracies have been compared and in accordance with the research paper, GBM turned out to be the better model. Parameter tuning has been performed to get the highest accuracy of 0.7151118.

xgboostCheck.ipynb

xgboost algorithm has been used on the preprocessed file (data.csv). Parameter tuning has been done using GridSearchCV in sklearn and cv function in xgboost.

Software Requirements

python Version - 3.6
Libraries required -
- pandas
- numpy
- scikit-learn
- xgboost
- matplotlib

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
Paper2		Paper2
paper1		paper1
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Online News Popularity

Dataset - "Online News Popularity Data Set"

Research Paper 1 - "A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News"

try.ipynb

Research Paper 2 - "Predicting the Popularity of Online News from Content Metadata"

preprocessing.ipynb

gbm.ipynb

xgboostCheck.ipynb

Software Requirements

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

krishnakartik1/onlineNewsPopularity

Folders and files

Latest commit

History

Repository files navigation

Online News Popularity

Dataset - "Online News Popularity Data Set"

Research Paper 1 - "A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News"

try.ipynb

Research Paper 2 - "Predicting the Popularity of Online News from Content Metadata"

preprocessing.ipynb

gbm.ipynb

xgboostCheck.ipynb

Software Requirements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages