
In general I'd recommend the py-spy profiler to get a quick sense of where all the time is going. You can attach it to a running Python process without altering any code.
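For reference, here is a minimal sketch of attaching py-spy to an already running process from a small Python wrapper. The helper names `build_py_spy_command` and `profile_process` are made up for illustration; the `record` subcommand and its `--pid`, `--output` and `--duration` flags are py-spy's own. It assumes py-spy is installed (`pip install py-spy`), and on some systems attaching to another process may need elevated permissions.

```python
import os
import shutil
import subprocess


def build_py_spy_command(pid, output="profile.svg", duration=30):
    """Build the argument list for `py-spy record` (hypothetical helper)."""
    return [
        "py-spy", "record",
        "--pid", str(pid),            # attach to a running process, no code changes
        "--output", output,           # flame graph is written here (SVG by default)
        "--duration", str(duration),  # stop sampling after this many seconds
    ]


def profile_process(pid, **kwargs):
    """Run py-spy against `pid`; raises if py-spy is not on PATH."""
    if shutil.which("py-spy") is None:
        raise RuntimeError("py-spy not found on PATH; try `pip install py-spy`")
    subprocess.run(build_py_spy_command(pid, **kwargs), check=True)


if __name__ == "__main__":
    # Only print the command here; call profile_process(<pid>) to actually run it.
    print(" ".join(build_py_spy_command(os.getpid())))
```

If you just want a live view instead of a flame graph, `py-spy top --pid <PID>` in a terminal does the same attach without writing a file.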

My guess is that most of the time will be spent in tokenization and optimization, probably optimization in particular. The Adam solver is kind of slow on problems like this, and when the rest of the task runs really quickly, that slowdown becomes more apparent. It could also be the model, though.
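If you'd rather check that guess by hand, a rough way is to accumulate wall-clock time per stage. This is only a sketch: `tokenize` and `update` below are hypothetical stand-ins, not spaCy functions, and you'd swap in your own tokenization and training-step calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named stage.
timings = defaultdict(float)


@contextmanager
def timed(stage):
    """Add the time spent inside the `with` block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start


def tokenize(text):
    # Hypothetical stand-in for the real tokenizer.
    return text.lower().split()


def update(tokens):
    # Hypothetical stand-in for one optimizer/training step.
    return sum(len(t) for t in tokens)


for text in ["A short example", "Another short example"] * 1000:
    with timed("tokenization"):
        tokens = tokenize(text)
    with timed("optimization"):
        update(tokens)

# Print stages from slowest to fastest.
for stage, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{stage}: {seconds:.4f}s")
```

A sampling profiler will give a finer breakdown, but this is enough to see which of the two stages dominates.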

Btw an even better toolkit for bag-of-words text classification is Vowpal Wabbit. I've always wanted a spaCy integration for that, because it's really awesome, and it's the right thing to use for a lot of problems.
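To give a feel for what a bag-of-words setup looks like there, this sketch converts labelled texts into Vowpal Wabbit's plain-text input format (`label | feature feature ...`). The helper `to_vw_line` is made up for illustration, and the whitespace tokenization is an assumption; in practice you might feed it tokens from spaCy instead. Labels of `1` and `-1` are what VW expects for binary classification with logistic loss.

```python
import re


def to_vw_line(label, text):
    """Format one example as a VW input line (hypothetical helper).

    "|" and ":" have special meaning in the VW format (namespace
    separator and feature-value separator), so they are replaced
    inside tokens.
    """
    tokens = re.findall(r"\S+", text.lower())
    cleaned = [t.replace("|", "_").replace(":", "_") for t in tokens]
    return f"{label} | {' '.join(cleaned)}"


examples = [
    (1, "Great movie"),
    (-1, "Terrible plot and bad acting"),
]

for label, text in examples:
    print(to_vw_line(label, text))
```

The resulting lines can be written to a file and passed to the `vw` command-line tool for training.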

Answer selected by koaning
Labels
perf / speed Performance: speed