Vishwajeetdeulkar/Help-Boost-Our-Online-Reach

Help-Boost-Our-Online-Reach

The goal of this project is to predict which web pages will attract high, long-lasting traffic, using a dataset with 25 features.

The dataset is split into a train set (5916 rows x 27 columns) and a test set (1479 rows x 26 columns).

The problem has been solved using classification models (logistic regression and a Multi-layer Perceptron classifier).

Workflow:
1) Data preprocessing on the merged train and test data (dealing with missing values, one-hot encoding, outlier detection)
2) NLP preprocessing on the text data columns (lowercasing, punctuation removal, hyperlink removal, stripping, stop-word removal, stemming, tokenization)
3) Using BOW and TF-IDF to vectorize the text data in one solution
4) Using GloVe vectors to vectorize the text data in the other solution
5) Train-test split
6) Applying the models and predicting output probabilities
7) Stacking of models
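
Steps 2 and 3 of the workflow can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the repository's actual code: the stop-word list, the `naive_stem` suffix stripper (a stand-in for a real stemmer), and the function names `preprocess` and `tfidf` are all assumptions made for the example. The TF-IDF weighting uses the common smoothed form idf = ln((1 + n) / (1 + df)) + 1, without row normalization.

```python
import math
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "for", "on", "at"}

# Matches hyperlinks that start with http://, https:// or www.
URL_RE = re.compile(r"(https?://|www\.)\S+")
# Maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def naive_stem(token):
    """Crude suffix stripper (assumption: stands in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, drop hyperlinks and punctuation, strip, tokenize,
    remove stop words, and stem the remaining tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    tokens = text.strip().split()
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return (vocabulary, rows), where each row holds the TF-IDF weights
    of one document in vocabulary order."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(df)
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)  # raw term counts, i.e. the BOW vector
        rows.append([counts[t] * idf[t] for t in vocab])
    return vocab, rows


if __name__ == "__main__":
    print(preprocess("Visit https://example.com for the BEST Cooking tips!"))
    vocab, rows = tfidf(["cats and dogs", "dogs"])
    print(vocab, rows)
```

The raw `counts` inside `tfidf` are the bag-of-words representation; multiplying by `idf` down-weights terms that appear in most documents. In practice a library vectorizer and stemmer would likely replace these hand-written helpers.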
