Example from https://spacy.io/universe/project/neuralcoref doesn't work for Polish #13224
How to reproduce the behaviour
The example from https://spacy.io/universe/project/neuralcoref works with English models:
import spacy
import neuralcoref
# Load the English model and attach the coreference component
nlp = spacy.load('en')
neuralcoref.add_to_pipe(nlp)
doc1 = nlp('My sister has a dog. She loves him.')
print(doc1._.coref_clusters)
doc2 = nlp('Angela lives in Boston. She is quite happy in that city.')
# Print the coreference cluster attached to each named entity
for ent in doc2.ents:
    print(ent._.coref_cluster)
Which outputs:
However, if I use either
import spacy
import neuralcoref
import pl_core_news_lg
#nlp = spacy.load('en_core_web_sm')
# Load the Polish model instead of the English one
nlp = pl_core_news_lg.load()
neuralcoref.add_to_pipe(nlp)
doc1 = nlp('Moja siostra ma psa. Ona go kocha.')
#doc1 = nlp('My sister has a dog. She loves him.')
print(doc1._.coref_clusters)
doc2 = nlp(u'Anna żyje w Krakowie. Jest szczęśliwa w tym mieście.')
#doc2 = nlp('Angela lives in Boston. She is quite happy in that city.')
for ent in doc2.ents:
    print(ent._.coref_cluster)
I get the following output:
I was guessing it might be connected to the fact that the English model is
Your Environment
Replies: 2 comments 1 reply
I'm closing this thread as the initial question has already been answered.
Hi!
neuralcoref is a plugin originally developed by Hugging Face, and as stated in their README, the pretrained model only works for English, so I'm afraid this is expected behaviour. The trained coref model simply doesn't know how to understand Polish sentences, as it was only trained on English texts.
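If it helps, here is a minimal sketch of a guard that refuses to attach neuralcoref to a non-English pipeline, so the mismatch fails loudly instead of quietly returning nothing useful. The helper name `add_coref_if_supported` and the `SUPPORTED_LANGS` set are my own, not part of neuralcoref or spaCy; the check relies only on the pipeline's `lang` attribute, and the demo uses a stand-in object so it runs without any model installed.

```python
from types import SimpleNamespace

# Assumption: only an English pretrained coref model exists, per the reply above.
SUPPORTED_LANGS = {"en"}


def add_coref_if_supported(nlp):
    """Hypothetical helper: attach neuralcoref only to English pipelines."""
    lang = getattr(nlp, "lang", None)
    if lang not in SUPPORTED_LANGS:
        raise ValueError(
            f"neuralcoref's pretrained model is English-only; got lang={lang!r}"
        )
    # Imported lazily so the language check works even without neuralcoref installed.
    import neuralcoref

    neuralcoref.add_to_pipe(nlp)
    return nlp


if __name__ == "__main__":
    # Stand-in for a loaded Polish pipeline; a real spaCy pipeline also exposes `.lang`.
    fake_polish_nlp = SimpleNamespace(lang="pl")
    try:
        add_coref_if_supported(fake_polish_nlp)
    except ValueError as err:
        print(f"Refused: {err}")
```

With a real pipeline you would pass the object returned by `pl_core_news_lg.load()` or `spacy.load('en')` instead of the stand-in; only the English one would get the coref component added.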