Get true spancat labels from dev.spacy file #11143
Hello all! I have trained a Span Categorizer and I would like to get the scores per label. I used Prodigy for the annotation and converted the dataset to train.spacy and dev.spacy with the data-to-spacy command. I want to manually compare the predicted spans and their labels with the true ones. I have already extracted the predicted spans and labels, but I also need the true annotated spans and labels to do the comparison. Here is what I tried:

    import spacy
    from spacy.tokens import DocBin

    db = DocBin().from_disk("path/dev.spacy")
    nlp = spacy.blank("en")
    Documents = list(db.get_docs(nlp.vocab))
    for doc in Documents:
        print(doc.spans)

But I cannot get the labels for each span. Can someone help me with that? Thank you!
Replies: 1 comment
Hi @aishaki,

If you want to get the labels for each Span, you need to access its Span.label_ attribute. You can find all other accessible attributes in the spaCy Span documentation. In your case, the code will look like this:

    # "sc" is the default spans key used by spancat
    spans_key = "sc"
    # docs is the list loaded from the DocBin (Documents in your snippet)
    for doc in docs:
        print(doc.text)
        for span in doc.spans[spans_key]:
            print(span.text, span.label_)
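As a follow-up sketch for the manual comparison step: once the gold and predicted spans are both available, one possible approach is to reduce each span to a `(doc_index, start, end, label)` tuple (for example from `span.start`, `span.end` and `span.label_`) and compare the two sets per label. The helper name `per_label_scores`, the tuple layout, and the sample labels below are assumptions made for illustration, not part of spaCy, and only exact matches on boundaries and label count as correct here.

```python
def per_label_scores(gold, pred):
    """Exact-match precision/recall/F1 per label.

    gold, pred: sets of (doc_index, start, end, label) tuples
    (hypothetical layout chosen for this sketch).
    """
    labels = {t[3] for t in gold | pred}
    scores = {}
    for label in sorted(labels):
        g = {t for t in gold if t[3] == label}
        p = {t for t in pred if t[3] == label}
        tp = len(g & p)  # spans with identical boundaries and label
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"p": precision, "r": recall, "f": f1}
    return scores


# Made-up example data: the second span is gold "ORG" but predicted "PERSON".
gold = {(0, 0, 2, "PERSON"), (0, 5, 6, "ORG"), (1, 1, 3, "PERSON")}
pred = {(0, 0, 2, "PERSON"), (0, 5, 6, "PERSON"), (1, 1, 3, "PERSON")}

scores = per_label_scores(gold, pred)
for label, s in scores.items():
    print(label, s)
```

Using sets keeps the comparison simple, but it also means overlapping or partially matching spans get no credit; a more lenient matching rule would need a different criterion than set intersection.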