nlp.pipe freezes after changing n_process in spaCy 3.0 #10678
-
If I keep n_process in nlp.pipe equal to 1, i.e. nlp.pipe(n_process=1), there is no issue; however, whenever I increase n_process to any number larger than 1, such as 2, 3, or 12, the PyCharm IDE stops responding. My code:
At the line doc1 = list(doc), the code stops working. With n_process set to 1, the code finishes in about 4 seconds. I am using spaCy 3.2.4.
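One common cause of exactly this symptom is running multiprocessing code at module top level on a platform where Python uses the spawn start method (Windows, and macOS on recent Python versions): worker processes re-import the script, and without a main guard the pool setup runs again on import. Any call that starts worker processes, including nlp.pipe(n_process=...), should sit under a main guard. A minimal stdlib-only sketch of the pattern (the tokenize function here is an illustrative stand-in for the real spaCy pipeline, not part of the reported code):

```python
import multiprocessing as mp

def tokenize(text):
    # Stand-in for per-document work; in the real script this would be
    # replaced by iterating over nlp.pipe(texts, n_process=...).
    return text.split()

def main():
    texts = ["a #AAAA example"] * 8
    # With the spawn start method, worker processes re-import this module.
    # Keeping the pool creation inside main(), called only under the
    # __main__ guard below, prevents that re-import from starting workers
    # again, which would otherwise error out or hang.
    with mp.Pool(processes=2) as pool:
        results = pool.map(tokenize, texts)
    return results

if __name__ == "__main__":
    print(main())
```

This does not rule out the Linux/fork issue mentioned below, but it is a quick structural check worth doing first.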
Replies: 2 comments 5 replies
-
Hmm, I can't reproduce this. Can you provide more info about your platform with spacy info --markdown? The main reason that users report pipelines hanging with multiprocessing is on Linux with trf/transformer models, which is related to an issue in pytorch. Does this hang for you (it's fine on my end with v3.2.3)?
import spacy
from spacy.matcher import Matcher
from datetime import datetime
nlp = spacy.blank("en")
start_time = datetime.now()
matcher = Matcher(nlp.vocab)
pattern = [{"TEXT": {"IN": ["#", "$"]}, "OP": "+"}, {"TEXT": {"REGEX": "[A-Za-z]+"}}]
matcher.add("stockHashTag", [pattern])
docs = list(nlp.pipe(["a #AAAA"] * 1000, n_process=4, batch_size=100))
matches = [matcher(subdoc) for subdoc in docs]
outside = []
for t, sub in zip(matches, docs):
    inside = []
    for x in t:
        inside.append(sub[x[1]:x[2]])
    outside.append(inside)
end_time = datetime.now()
print(end_time - start_time)
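Since the reported hangs are platform-dependent (the fork start method on Linux is where the transformer-related hangs are reported), checking which start method your interpreter uses can help narrow things down. A small stdlib-only check:

```python
import multiprocessing as mp
import platform

# "fork" is the default start method on Linux; Windows and macOS
# (Python 3.8+) default to "spawn". Multiprocessing hangs with
# transformer pipelines have mostly been reported under "fork".
print(platform.system(), mp.get_start_method())
```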
-
I'm having the same issue: if n_process = 1 everything runs, but if I increase it, processing freezes.