Hi,

In spaCy terminology, you want to store these annotations on a Doc object. You can create a Doc directly from the words and spaces in your text, which ensures that the tokenization corresponds to the gold annotations you've created. The constructor also accepts ents, a list with one IOB tag per token, which would look something like this:

    import spacy
    from spacy.tokens import Doc

    nlp = spacy.blank("en")
    # One entry per token; spaces[i] is True if words[i] is followed by a space.
    words = ["Sarah", "'s", "sister", "flew", "to", "Silicon", "Valley", "via", "London", "."]
    spaces = [False, True, True, True, True, True, True, True, False, False]
    ents = ["B-PERSON", "I-PERSON", "O", "", "O", "B-LOC", "I-LOC", "O", "B-GPE", "O"]
    doc = Doc(nlp.vocab, …
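
Before building the Doc, it can help to check that the three lists line up and that words and spaces reproduce the original sentence. The following is a minimal sketch in plain Python, not part of spaCy: `reconstruct_text` is a hypothetical helper introduced here for illustration, and it assumes that each entry in spaces says whether the matching word is followed by a space.

```python
def reconstruct_text(words, spaces):
    """Join words, appending a space after each word whose flag is True.

    Hypothetical helper for illustration only; it is not a spaCy function.
    """
    if len(words) != len(spaces):
        raise ValueError(
            f"words has {len(words)} items but spaces has {len(spaces)}"
        )
    return "".join(
        word + (" " if space else "") for word, space in zip(words, spaces)
    )


words = ["Sarah", "'s", "sister", "flew", "to", "Silicon", "Valley", "via", "London", "."]
spaces = [False, True, True, True, True, True, True, True, False, False]
ents = ["B-PERSON", "I-PERSON", "O", "", "O", "B-LOC", "I-LOC", "O", "B-GPE", "O"]

# The annotation list should have exactly one tag per token.
if len(ents) != len(words):
    raise ValueError(f"ents has {len(ents)} items but words has {len(words)}")

text = reconstruct_text(words, spaces)
print(text)
```

If the printed text differs from your source sentence, or either length check fails, the gold annotations are misaligned with the tokenization and should be fixed before training.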

Answer selected by svlandeg