script hangs on the removeWords step

The script seems to get hung on the line:

```
corpus <- tm_map(corpus, removeWords, stopwords('english'))
```

I'm using the largest Digital Ocean droplet available. When the script reaches this point, R pegs loadavg at 1, so it appears to be leveraging all 20 cores. However, even when running against just two days' worth of logs, this step never completes.
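For anyone trying to reproduce this, here is a minimal sketch of how the step could be timed in isolation. The two sample strings and the names `sample_lines`, `small_corpus`, and `cleaned` are made up for illustration and are not taken from the actual script or logs. It only shows how long the call takes on a tiny corpus, which may help tell a slow step from a genuine hang.

```r
library(tm)

# Hypothetical stand-in data, NOT the real log lines.
sample_lines <- c(
  "the quick brown fox jumps over the lazy dog",
  "this is a sample line that contains a few common english words"
)

# Build a tiny in-memory corpus from the character vector.
small_corpus <- Corpus(VectorSource(sample_lines))

# Time only the stop word removal step.
timing <- system.time(
  cleaned <- tm_map(small_corpus, removeWords, stopwords("english"))
)

print(timing)
```

If this returns quickly, growing `sample_lines` step by step should show whether the run time scales with input size or whether the call stalls regardless.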

If I skip that step in the script, I get word clouds like this: http://i.imgur.com/WSo1VJ2.png

Has anyone else run into this?

```
> library(jsonlite)
> library(tm)
> library(wordcloud)
Loading required package: RColorBrewer
> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] wordcloud_2.5      RColorBrewer_1.1-2 tm_0.5             jsonlite_0.9.19

loaded via a namespace (and not attached):
[1] Rcpp_0.11.0 slam_0.1-34
```

Thanks!


script hangs on the removeWords step #4
