I found that the provided model has a vocabulary size of 525; however, after following the preprocessing steps, I got a vocabulary of size 496.
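To narrow down where the 29 missing entries come from, it may help to diff the two vocabularies directly. Below is a minimal sketch, assuming both vocabularies can be exported as plain lists of tokens (one token per line in a text file). The helper names `load_vocab` and `diff_vocab` and the file layout are hypothetical and not part of the project's code.

```python
def load_vocab(path):
    """Read a vocabulary file, assuming one token per line (hypothetical format)."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def diff_vocab(model_vocab, my_vocab):
    """Return (tokens only in the model's vocab, tokens only in mine), both sorted."""
    model_set, my_set = set(model_vocab), set(my_vocab)
    only_in_model = sorted(model_set - my_set)
    only_in_mine = sorted(my_set - model_set)
    return only_in_model, only_in_mine


# Toy illustration with made-up tokens; real input would come from load_vocab(...).
only_in_model, only_in_mine = diff_vocab(["a", "b", "<unk>"], ["a", "b"])
print(len(only_in_model), "tokens only in the model vocab:", only_in_model)
print(len(only_in_mine), "tokens only in my vocab:", only_in_mine)
```

If the tokens that appear only in the model's vocabulary turn out to be special symbols or rare words, that would point to a difference in special-token handling or a frequency cutoff, though this is only a guess without seeing the actual lists.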