Conversation
| @@ -0,0 +1,2717 @@ | |||
| { | |||
Line #46. content_list.append(str(content)) # example -> content: hello world!
Each feature could have its own extraction function, which would make the code easier to read.
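A minimal sketch of what that split could look like, using only the standard library. The function names (extract_title, extract_content, extract_features) and the choice of the title and p tags are assumptions for illustration, not taken from the notebook.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text found inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # how many matching tags are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data.strip())


def _text_of(html, tag):
    """Shared helper: joined text of all <tag> elements in the page."""
    collector = _TextCollector(tag)
    collector.feed(html)
    return " ".join(chunk for chunk in collector.chunks if chunk)


def extract_title(html):
    # Hypothetical feature: the page title.
    return _text_of(html, "title")


def extract_content(html):
    # Hypothetical feature: the article body, taken from <p> tags.
    return _text_of(html, "p")


def extract_features(html):
    # One small function per feature, combined in a single place.
    return {"title": extract_title(html), "content": extract_content(html)}
```

With this layout the main loop only calls extract_features(html), and adding a feature means adding one function.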
It may be better to sample a limited number of rows from each news site, rather than keeping whole agencies with all of their rows.
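One possible way to do this with pandas is to shuffle the frame and then keep the first few rows of each group. The column name "site" and the helper sample_rows_per_site are assumptions; the notebook's actual column names may differ.

```python
import pandas as pd


def sample_rows_per_site(df, site_column, max_rows, seed=0):
    """Return at most max_rows randomly chosen rows for each site."""
    # Shuffle all rows once, reproducibly.
    shuffled = df.sample(frac=1, random_state=seed)
    # head(n) keeps the first n rows of every group, so sites with fewer
    # than max_rows rows are kept whole instead of raising an error.
    return shuffled.groupby(site_column).head(max_rows)


# Toy data: site "a" has 5 rows, site "b" has only 2.
example_df = pd.DataFrame(
    {"site": ["a"] * 5 + ["b"] * 2, "html": [f"<p>{i}</p>" for i in range(7)]}
)
balanced_df = sample_rows_per_site(example_df, "site", max_rows=3)
```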
Plain neural networks are highly vulnerable to class domination, where the majority class dominates training, unless balancing techniques or class-aware losses are used.
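As a rough illustration of a class-aware loss, the sketch below computes inverse-frequency class weights and a weighted cross-entropy in NumPy. The weighting formula n_samples / (n_classes * count) is one common heuristic, not something the notebook is known to use, and both function names are hypothetical.

```python
import numpy as np


def balanced_class_weights(labels, n_classes):
    """Inverse-frequency weights: rare classes get larger weights."""
    counts = np.bincount(labels, minlength=n_classes)
    # Assumes every class appears at least once (otherwise division by zero).
    return len(labels) / (n_classes * counts)


def weighted_cross_entropy(probs, labels, class_weights):
    """Cross-entropy where each sample is scaled by its class weight.

    probs: array of shape (n_samples, n_classes) with predicted probabilities.
    The result is normalised by the sum of the sample weights.
    """
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    sample_weights = class_weights[labels]
    return float(np.sum(sample_weights * -np.log(picked)) / np.sum(sample_weights))
```

Deep learning frameworks usually accept per-class weights in their built-in losses, so in practice only the weight computation may be needed.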
With an unbalanced dataset, the F1-score is often a better model metric than accuracy, because a model can reach high accuracy just by predicting the majority class.
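A small made-up example of the difference: with 95 negative and 5 positive samples, a model that always predicts the negative class looks good on accuracy but scores zero F1 on the positive class. The data and helper names below are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score_binary(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denominator = 2 * tp + fp + fn
    # Defined as 0.0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


# Made-up imbalanced labels and a model that only predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

majority_accuracy = accuracy(y_true, y_pred)    # high despite a useless model
majority_f1 = f1_score_binary(y_true, y_pred)   # exposes the missed positives
```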
Line #11. for i in tqdm(range(len(news_df['html']))):
With a pandas DataFrame, iterating with .iterrows() is easier to write and more readable than a for loop over range(len(df)).
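A sketch of the suggested loop on a toy frame. Only news_df, its 'html' column and content_list come from the quoted notebook lines; the sample rows are invented.

```python
import pandas as pd

# Toy stand-in for the notebook's news_df.
news_df = pd.DataFrame({"html": ["<p>hello world!</p>", "<p>second article</p>"]})

content_list = []
# iterrows() yields (index, row) pairs, so no manual positional indexing is needed.
for index, row in news_df.iterrows():
    content_list.append(str(row["html"]))
```

When only one column is needed, iterating over news_df["html"] directly is simpler still. A progress bar such as tqdm can wrap either iterable, given total=len(news_df).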
Line #81. if training_mode == True:
Two consecutive conditional checks can be merged into a single statement with the two conditions joined by and.
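A sketch of the merge, assuming the comment refers to a check nested directly inside the training_mode test quoted above. The second condition (has_labels) and both function names are hypothetical stand-ins for whatever the notebook actually checks.

```python
def choose_step_nested(training_mode, has_labels):
    """Original shape: one check nested inside the other."""
    if training_mode == True:
        if has_labels:
            return "train"
    return "skip"


def choose_step_merged(training_mode, has_labels):
    """Same behaviour for boolean inputs, with both conditions in one statement."""
    if training_mode and has_labels:
        return "train"
    return "skip"
```

Writing training_mode instead of training_mode == True also follows the usual Python style for boolean flags.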