Hmm, I can't reproduce this. Can you share more details about your platform by posting the output of `python -m spacy info --markdown`?

The most common reason users report pipelines hanging with multiprocessing is running trf (transformer) models on Linux, which is related to an issue in PyTorch.
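If the transformer case does apply to you, the two workarounds usually suggested for it are limiting PyTorch to a single thread before any model is loaded and using the spawn start method instead of fork. Below is a minimal sketch of both, not an official spaCy helper: the function name `configure_multiprocessing` and its return value are hypothetical, and it assumes the default `multiprocessing` start method is what the worker processes end up using.

```python
import multiprocessing as mp


def configure_multiprocessing(start_method="spawn", limit_torch_threads=True):
    """Apply two commonly suggested workarounds for multiprocessing hangs.

    Hypothetical helper for illustration only. Returns a dict describing
    what was actually applied.
    """
    applied = {"torch_threads": None, "start_method": None}

    if limit_torch_threads:
        try:
            import torch  # optional: only relevant for transformer pipelines
        except ImportError:
            pass  # PyTorch is not installed, nothing to limit
        else:
            # Limit intra-op threads before any model is loaded.
            torch.set_num_threads(1)
            applied["torch_threads"] = torch.get_num_threads()

    # Switch the default start method (e.g. from fork to spawn) if available.
    if start_method in mp.get_all_start_methods():
        mp.set_start_method(start_method, force=True)
        applied["start_method"] = mp.get_start_method()

    return applied
```

Call it once at the top of your script, before `spacy.load(...)` and before `nlp.pipe(..., n_process=...)`. With spawn, the code that starts the pipeline also needs to sit under an `if __name__ == "__main__":` guard. None of this should be necessary for a blank pipeline like the one in the snippet below.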

Does this hang for you (it's fine on my end with v3.2.3)?

import spacy
from spacy.matcher import Matcher
from datetime import datetime

nlp = spacy.blank("en")
start_time = datetime.now()
matcher = Matcher(nlp.vocab)
pattern = [{"TEXT": {"IN": ["#", "$"]}, "OP": "+"}, {"TEXT": {"REGEX": "[A-Za-z]+"}}]
matcher.add("stockHashTag", [pattern])
docs = list(nlp.pipe(["a #AAAA"] * 1000, n_process=4, batch_size=100))
matches = [matcher(subdoc) f…

Answer selected by adrianeboyd
Labels
scaling Scaling, serving and parallelizing spaCy
4 participants