Running Sentences One at a Time Gives a Different Result Than Batching Them

Running a list of tweets in `self.data` through `sentiment_score` one at a time gives different results than running the same data through `sentiment_scores_of_sents` in batches of 25 or 100.

It's not just floating point issues either. I ran a list of 9068 tweets and found that the largest difference between the single and batched scores was 0.9579128974724299, and that 161 tweets in total had scores that differed by more than 0.5 between the two runs!

Code for running them one at a time:
```
    out = []
    for tweet in self.data:
      out.append(sentiment_score(tweet))
```

Code for running the data in batches:
```
    out = []
    for batch in self.batch_data:
      out.extend(sentiment_scores_of_sents(batch))
```

Code for batching the data:
```
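    temp_list = []
    count = 0  # number of items in the current batch
    for x in cls.data:
      # Once the current batch holds 25 items, store it and start a new one.
      if count >= 25:
        cls.batch_data.append(temp_list)
        temp_list = []
        count = 0

      temp_list.append(x)
      count += 1

    # Keep the final, possibly smaller, batch.
    if len(temp_list) > 0:
      cls.batch_data.append(temp_list)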
```
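
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two result lists can be compared. The helper names `make_batches` and `compare_scores` are hypothetical and not part of the library. It assumes both runs produce one float per tweet, in the same order as the input.

```python
from typing import List, Sequence, Tuple


def make_batches(items: Sequence[str], batch_size: int = 25) -> List[List[str]]:
    """Split items into consecutive chunks of at most batch_size elements."""
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def compare_scores(
    single: Sequence[float],
    batched: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[float, List[int]]:
    """Return the largest absolute difference and the indices where the
    difference exceeds threshold. Both sequences must be aligned by index."""
    if len(single) != len(batched):
        raise ValueError("score lists must have the same length")
    diffs = [abs(a - b) for a, b in zip(single, batched)]
    largest = max(diffs, default=0.0)
    flagged = [i for i, d in enumerate(diffs) if d > threshold]
    return largest, flagged
```

With `single` holding the one-at-a-time scores and `batched` holding the flattened batch scores, `compare_scores(single, batched)` returns the largest difference and the positions of the tweets that differ by more than 0.5.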
