This repository contains a Flask web application that uses machine learning to detect whether a comment is spam.

## File Structure

📦Spam-comment-detector-with-flask
┣ 📂static
┃ ┗ 📂js
┃ ┣ 📜app.js
┃ ┣ 📜form.js
┃ ┗ 📜particles.js
┣ 📂templates
┃ ┗ 📜form.html
┣ 📜SPAM.csv
┣ 📜app.py
┗ 📜requirements.txt

## Flask

A microframework based on Werkzeug. It's extensively documented and follows best-practice patterns.

## Model
#### CountVectorizer

Converts a collection of text documents to a matrix of token counts. This implementation produces a sparse representation of the counts using scipy.sparse.csr_matrix.

If you do not provide an a-priori dictionary and do not use an analyzer that performs some kind of feature selection, then the number of features will be equal to the vocabulary size found by analyzing the data.

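
As a rough illustration of the token-count matrix described above, the sketch below fits a `CountVectorizer` with default settings on three made-up comments. The comments are hypothetical and are not taken from `SPAM.csv`; this is not necessarily how `app.py` configures the vectorizer.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical example comments (assumption: not rows from SPAM.csv).
comments = [
    "free prize click now",
    "great video thanks",
    "click here for free free prize",
]

# No a-priori dictionary is given, so the vocabulary is learned from the data.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(comments)  # sparse matrix, one row per comment

# vocabulary_ maps each learned token to its column index in the matrix.
print(vectorizer.vocabulary_)
print(counts.shape)      # (number of comments, vocabulary size)
print(counts.toarray())  # dense view, only sensible for tiny examples
```

Each column corresponds to one token in the learned vocabulary, so the number of columns equals the vocabulary size, and each cell holds how often that token occurs in the comment.
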
#### Train Test Split

Splits arrays or matrices into random train and test subsets. It is a quick utility that wraps input validation, ``next(ShuffleSplit().split(X, y))``, and application to the input data into a single call, so data can be split (and optionally subsampled) in one line.

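
A minimal sketch of such a split, assuming a small hypothetical list of comments and labels. The `test_size` and `random_state` values are illustrative choices, not values taken from this repository.

```python
from sklearn.model_selection import train_test_split

# Hypothetical data: 8 comments with labels (1 = spam, 0 = not spam).
comments = [
    "free prize click now",
    "win money click here",
    "subscribe for free gifts",
    "cheap followers buy now",
    "great video thanks",
    "i love this song",
    "nice tutorial very helpful",
    "what a performance",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out 25% of the samples for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.25, random_state=42
)

print(len(X_train), len(X_test))
```

With 8 samples and `test_size=0.25`, 6 comments go to the training set and 2 to the test set, and every comment ends up in exactly one of the two.
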
#### Naive Bayes classifier for multinomial models

The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.

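
The sketch below shows how word counts can feed a `MultinomialNB` classifier. The training comments, the label encoding (1 = spam, 0 = not spam), and the default hyperparameters are assumptions for illustration; the actual training code in `app.py` may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training data (assumption: not rows from SPAM.csv).
train_comments = [
    "free prize click now",
    "win money click here",
    "subscribe to my channel for free gifts",
    "great video thanks",
    "i love this song",
    "nice tutorial very helpful",
]
train_labels = [1, 1, 1, 0, 0, 0]  # assumed encoding: 1 = spam, 0 = not spam

# Turn the text into integer token counts, the feature type MultinomialNB expects.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_comments)

model = MultinomialNB()
model.fit(X_train, train_labels)

# New comments must go through the same fitted vectorizer (transform, not fit_transform).
new_comments = ["click here to win a free prize", "thanks for the helpful video"]
predictions = model.predict(vectorizer.transform(new_comments))
print(predictions)
```

Reusing the fitted vectorizer at prediction time keeps the column layout identical to the one the model was trained on; tokens that were never seen during training are simply ignored.
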