ejeej/NLP_Toxic_Comments

NLP - Predicting toxicity for Twitter comments

A study project on predicting the toxicity of Russian-language Twitter comments. It covers tokenization, lemmatization, word clouds, bag of words, TF-IDF, and fastText. Python libraries used: re, pymorphy2, transliterate, wordcloud, nltk, razdel, fastText.
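To illustrate the tokenization, bag-of-words, and TF-IDF steps listed above, here is a minimal sketch using only the Python standard library. It is not the repository's code: the function names (`tokenize`, `bag_of_words`, `tf_idf`), the regular expression, and the sample comments are assumptions made for illustration, and a plain regex stands in for razdel tokenization and pymorphy2 lemmatization.

```python
import math
import re
from collections import Counter

# Assumed token pattern: runs of Cyrillic or Latin letters (illustrative only).
TOKEN_RE = re.compile(r"[а-яёa-z]+")


def tokenize(text):
    """Lowercase the text and return its letter-only tokens."""
    return TOKEN_RE.findall(text.lower())


def bag_of_words(docs):
    """Return one token-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Weight each token by term frequency times log(N / document frequency)."""
    bows = bag_of_words(docs)
    n_docs = len(bows)

    # Document frequency: in how many documents each token appears.
    df = Counter()
    for bow in bows:
        df.update(bow.keys())

    weights = []
    for bow in bows:
        total = sum(bow.values())
        weights.append(
            {tok: (cnt / total) * math.log(n_docs / df[tok]) for tok, cnt in bow.items()}
        )
    return weights


# Hypothetical sample comments, not taken from the competition data.
comments = ["Ты молодец, спасибо!", "Ты дурак", "Спасибо за помощь"]
weights = tf_idf(comments)
```

With this unsmoothed IDF, a token that occurs in every document gets weight zero, so only tokens that distinguish comments from one another contribute to the features. Library implementations often apply smoothing and normalization, so their numbers would differ from this sketch.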

The comments in the test data are unlabeled, so predictions were submitted to a closed Kaggle competition for scoring. The highest score achieved (accuracy) was 0.89286, which placed 4th out of 20 on the leaderboard.
