Unable to create Example for spancat training purposes [E983] #11797
Hello, I have no idea what I'm doing wrong and I would be extremely grateful for help.

spaCy v.3.4.2, python v.3.8.13

    nlp = spacy.blank("pl")
    spancat = nlp.add_pipe("spancat")
    for label in ["nam_liv_person", "nam_loc_gpe_city", "nam_loc_gpe_country"]:
        spancat.add_label(label)

    docbin_train = DocBin().from_disk("./data/train.spacy")
    docs_train = tuple(docbin_train.get_docs(nlp.vocab))
    pprint(docs_train[0].to_json())

    def get_example_from_doc(doc):
        spans = list(doc.ents)
        group = SpanGroup(doc, name="sc", spans=spans)
        gold_ref = {"sc": group}
        example = Example.from_dict(doc, gold_ref)
        return example

    def get_examples(docs):
        return [get_example_from_doc(doc) for doc in docs]

    with nlp.select_pipes(enable="spancat"):
        optimizer = nlp.initialize()
        examples = get_examples(docs_train)
        nlp.update(examples, sgd=optimizer)
Replies: 1 comment 8 replies
Sorry you're having trouble with this.
Normally we appreciate a lot of detail with questions, but in this case you have provided too much, which makes your question hard to follow, partly just because it's a lot to read. It would have helped if you had narrowed things down to just the error you're getting and the code that caused it.
Looking at the specific error you're getting, you're passing invalid values to Example.from_dict. I'm not sure how you came up with your get_example_from_doc code, but it won't work: the format of the data you're passing is not what that function is designed for. Instead, what you should do is just set your spans on a Doc and save a DocBin. Assuming yo…
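For readers following along, here is a minimal sketch of the suggestion above (set the spans on a Doc, save a DocBin), not the exact code from the original reply, which is cut off here. It assumes the entity annotations should simply be copied into the "sc" span group, which is the key spancat reads by default. The toy sentence and its annotations are made up for illustration, and the DocBin is round-tripped through bytes rather than written to ./data/train.spacy.

```python
import spacy
from spacy.tokens import Doc, DocBin, Span
from spacy.training import Example

nlp = spacy.blank("pl")

# Toy document standing in for one loaded from train.spacy (made-up sentence).
words = ["Jan", "Kowalski", "mieszka", "w", "Warszawie", "."]
spaces = [True, True, True, True, False, False]
doc = Doc(nlp.vocab, words=words, spaces=spaces)
doc.ents = [
    Span(doc, 0, 2, label="nam_liv_person"),
    Span(doc, 4, 5, label="nam_loc_gpe_city"),
]

# Copy the entity spans into the span group that spancat reads ("sc" by default).
doc.spans["sc"] = list(doc.ents)

# Serialize and reload; DocBin keeps span groups.
# to_disk() / from_disk() can be used the same way for files.
data = DocBin(docs=[doc]).to_bytes()
restored = list(DocBin().from_bytes(data).get_docs(nlp.vocab))

# Build Examples directly from the annotated docs: an unannotated copy is the
# prediction side, the annotated doc is the reference. No Example.from_dict needed.
examples = [Example(nlp.make_doc(d.text), d) for d in restored]

spancat = nlp.add_pipe("spancat")
# Labels are inferred from the span groups found in the examples.
optimizer = nlp.initialize(get_examples=lambda: examples)
losses = nlp.update(examples, sgd=optimizer)
print(sorted(spancat.labels), losses)
```

Once the spans live in doc.spans and the DocBin is saved, the same file can also be passed to the spacy train command instead of writing a manual update loop.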