Hey DataAndMaths,

For the en_core_web_md model we prune the vector tables to save memory. For this use case it might be worth trying the pre-trained en_core_web_lg model instead. Here is the similarity list it returns for "country":

['country', 'country-', 'country\x92s', 'country`s', 'country"s', 'countryâ€', 'countrys', 'country—0,467', 'country--', 'countr', 'countryâ\x80\x99s', 'lowcountry', 'Upcountry', 'upcountry', 'countrywomen', 'countrywide', 'Lowcountry', 'thecountry', 'intercountry', 'countrywoman', 'countries-', 'nation', 'Westcountry', 'countrymen', 'countryman', 'countries', 'continent', 'countrysides', 'Kountry', 'countrified', 'nationâ\x80\x99s', 'countryCredit', 'n…
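
For reference, here is a rough sketch of how a list like this can be produced. It is not necessarily the exact code used for the output above. `cosine`, `top_n_similar` and `spacy_top_n` are hypothetical helper names, and the toy vectors are made up purely to show the ranking logic. The spaCy part assumes en_core_web_lg is installed and uses `Vectors.most_similar` to do the lookup.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_n_similar(word, table, n=5):
    """Return the n words in `table` (word -> vector) closest to `word`, best first."""
    query = table[word]
    scored = [(other, cosine(query, vec)) for other, vec in table.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:n]]


def spacy_top_n(word, n=10, model="en_core_web_lg"):
    """Same kind of lookup against a real spaCy vector table (model must be installed)."""
    import spacy  # imported lazily so the toy example below runs without spaCy

    nlp = spacy.load(model)
    # most_similar expects a 2-d array of query vectors, one row per query.
    query = nlp.vocab[word].vector.reshape(1, -1)
    keys, _, _ = nlp.vocab.vectors.most_similar(query, n=n)
    # keys has one row per query; map each hash back to its string.
    return [nlp.vocab.strings[int(key)] for key in keys[0]]


# Toy 2-d table with made-up vectors, only to illustrate the ranking.
toy = {
    "country": [1.0, 0.0],
    "nation": [0.9, 0.1],
    "continent": [0.6, 0.8],
    "banana": [0.0, 1.0],
}
print(top_n_similar("country", toy, n=3))
```

With the model installed, `spacy_top_n("country", n=50)` should give a list in the same spirit as the one above. The query word itself comes back first, since it is its own nearest neighbour.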

Answer selected by svlandeg
Labels
feat / vectors Feature: Word vectors and similarity