Error on multiple paragraphs with Spacy >=2.0 #381

@TimDettmers

Traceback (most recent call last):
  File "test.py", line 23, in <module>
    support=[support, support2, support3]
  File "/home/tim/git/jack/jack/core/reader.py", line 84, in __call__
    batch = self.input_module(inputs)
  File "/home/tim/git/jack/jack/core/input_module.py", line 185, in __call__
    annotations = self.preprocess(qa_settings, answers=None, is_eval=True)
  File "/home/tim/git/jack/jack/readers/extractive_qa/shared.py", line 133, in preprocess
    preprocessed.append(self.preprocess_instance(q, a))
  File "/home/tim/git/jack/jack/readers/extractive_qa/shared.py", line 150, in preprocess_instance
    scores = sort_by_tfidf(' '.join(q_tokenized), [' '.join(s) for s in s_tokenized])
  File "/home/tim/git/jack/jack/util/preprocessing.py", line 188, in sort_by_tfidf
    tfidf = TfidfVectorizer(strip_accents="unicode", stop_words=spacy.en.STOP_WORDS, decode_error='replace')
AttributeError: module 'spacy' has no attribute 'en'

See the bug here: explosion/spaCy#1512

I tried changing it to spacy.lang.en.STOP_WORDS, but that did not work either.
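
A possible workaround, sketched under assumptions: spaCy 2.x keeps the English stop words in the module `spacy.lang.en.stop_words`, and importing that module explicitly avoids relying on attribute access through the top-level `spacy` package. Plain `import spacy` may not load the `spacy.lang.en` subpackage, which could be why the attribute-style access fails, but that cause is a guess. The helper name `load_spacy_stop_words` is hypothetical and is not part of jack or spaCy. It falls back to the 1.x location shown in the traceback.

```python
import importlib


def load_spacy_stop_words():
    """Return spaCy's English stop words as a sorted list, or None if unavailable.

    Hypothetical helper for illustration; not part of jack or spaCy.
    """
    # Try the spaCy >= 2.0 module first, then the 1.x module from the traceback.
    for module_name in ("spacy.lang.en.stop_words", "spacy.en"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # spaCy is not installed, or this layout does not exist
        stop_words = getattr(module, "STOP_WORDS", None)
        if stop_words is not None:
            return sorted(stop_words)
    return None


stop_words = load_spacy_stop_words()
print("spaCy stop words unavailable" if stop_words is None else len(stop_words))
```

The returned list could then be passed as `stop_words=` to the `TfidfVectorizer` call in `sort_by_tfidf`. A list is used rather than a set because newer scikit-learn releases validate that argument's type.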
