Multiprocessing with doc containing custom attributes raises serialization error #10661
Hello spaCy community!

    from typing import List, Tuple

    import spacy

    class MyAttr:
        def __init__(self):
            self.elements: List[MySubAttr]

    class MySubAttr:
        def __init__(self):
            self.matches: List[Tuple[str, int, int]]
            self.matches2: List[Tuple[str, List[int]]
            self.range: int

    class AssignAttr:
        def __init__(self, nlp):
            self.nlp = nlp

        def __call__(self, doc):
            doc._.my_attr = MyAttr()
            return doc

    @spacy.language.Language.factory("assign_attr")
    def create_multigram_component(
        nlp: spacy.language.Language, name: str
    ):
        """create component multigram"""
        return AssignAttr(nlp)

    spacy.tokens.Doc.set_extension("my_attr", default=MyAttr())
    nlp = spacy.load("fr_core_news_lg")
    nlp.add_pipe("assign_attr")

Everything is fine when I'm processing the pipeline, but if I use [...]

Thank you!
Replies: 1 comment 2 replies
Hi @ColleterVi, can you post the whole traceback for the `TypeError`? I tried your sample code and can't seem to reproduce the error.

Also, it seems that there's a typo in your `MySubAttr` class: you forgot to add a closing square bracket in `self.matches2`.
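
While waiting for the traceback, here is one possible workaround. It is only a sketch, and it assumes the `TypeError` comes from the custom `MyAttr` instance stored in `doc._.my_attr` not being serializable when docs are passed between processes. That cause is not confirmed in this thread. The idea is to keep the structure from the question but convert it to plain built-in types (dicts, lists, strings, ints) before storing it on the doc. `to_plain` and `from_plain` are hypothetical helper names, the dataclass defaults are my own choice, and the JSON round trip is only a stand-in check that nothing but built-in types remains.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class MySubAttr:
    matches: List[Tuple[str, int, int]] = field(default_factory=list)
    matches2: List[Tuple[str, List[int]]] = field(default_factory=list)
    range: int = 0


@dataclass
class MyAttr:
    elements: List[MySubAttr] = field(default_factory=list)


def to_plain(attr: MyAttr) -> dict:
    """Hypothetical helper: flatten MyAttr into dicts, lists, tuples and scalars."""
    return asdict(attr)


def from_plain(data: dict) -> MyAttr:
    """Hypothetical helper: rebuild MyAttr from the plain structure."""
    subs = [
        MySubAttr(
            # Serializers such as JSON turn tuples into lists, so restore them here.
            matches=[tuple(m) for m in sub["matches"]],
            matches2=[(text, list(indices)) for text, indices in sub["matches2"]],
            range=sub["range"],
        )
        for sub in data["elements"]
    ]
    return MyAttr(elements=subs)


attr = MyAttr(
    elements=[
        MySubAttr(matches=[("chat", 0, 1)], matches2=[("chat", [0, 1])], range=2)
    ]
)

# Plain structure that could be stored on the doc instead of the object itself.
plain = to_plain(attr)

# Stand-in check: the plain structure survives a generic serialization round trip.
restored = from_plain(json.loads(json.dumps(plain)))
```

Under that assumption, the component would assign `doc._.my_attr = to_plain(attr)` and the calling code would use `from_plain(doc._.my_attr)` when it needs the objects back.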