The JSONL data in ner_drugs is in Prodigy's format, which isn't the same as the simple training format used in the v2 examples.
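
To make the difference concrete, here is a minimal sketch of converting Prodigy-style records into the simple `(text, {"entities": [...]})` tuples. It assumes each JSONL line has a `text` field, a `spans` list with `start`/`end`/`label` keys, and an optional `answer` field, which is how Prodigy NER annotations are typically laid out. The helper name `prodigy_to_simple` and the sample records are made up for illustration and are not taken from ner_drugs.

```python
import json


def prodigy_to_simple(jsonl_lines):
    """Convert Prodigy-style JSONL lines to (text, {"entities": [...]}) tuples.

    Hypothetical helper: the field names are assumptions about the input.
    """
    data = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: only accepted annotations should be trained on.
        if record.get("answer", "accept") != "accept":
            continue
        entities = [
            (span["start"], span["end"], span["label"])
            for span in record.get("spans", [])
        ]
        data.append((record["text"], {"entities": entities}))
    return data


# Made-up sample records, for illustration only
sample = [
    '{"text": "I took ibuprofen.", "spans": [{"start": 7, "end": 16, "label": "DRUG"}], "answer": "accept"}',
    '{"text": "No drugs here.", "spans": [], "answer": "reject"}',
]
TRAIN_DATA = prodigy_to_simple(sample)
```

With real data you would pass the open file object instead of `sample`, since iterating over a file yields one line at a time.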

Here's how training could look for v3:

import random
import spacy
from spacy.training import Example
from spacy.util import minibatch, compounding

nlp = spacy.blank("en")
nlp.add_pipe("ner")

TRAIN_DATA = [
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
    ("I like London.", {"entities": [(7, 13, "LOC")]}),
]
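# Convert each (text, annotations) pair into an Example object
examples = []
for text, annots in TRAIN_DATA:
    examples.append(Example.from_dict(nlp.make_doc(text), annots))

# Initialize the pipeline; the NER labels are inferred from the examples
nlp.initialize(lambda: examples)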

for i in range(20):
    random.shuffle(examples)
    for batch in minibatch(examples, size=2):
        p…
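
The entity tuples in `TRAIN_DATA` are `(start, end, label)` character offsets into the text, with an exclusive end. Since misaligned offsets are a common source of training problems, it can help to slice each span and check it before building `Example` objects. A minimal sketch in plain Python follows; `check_offsets` is a hypothetical helper, not part of spaCy.

```python
def check_offsets(train_data):
    """Return (label, span_text) for every annotated entity.

    Hypothetical helper: raises ValueError if an offset pair falls
    outside the text or is empty.
    """
    rows = []
    for text, annots in train_data:
        for start, end, label in annots["entities"]:
            if not (0 <= start < end <= len(text)):
                raise ValueError(f"Bad offsets ({start}, {end}) for {text!r}")
            # End offset is exclusive, so plain slicing gives the span
            rows.append((label, text[start:end]))
    return rows


TRAIN_DATA = [
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
    ("I like London.", {"entities": [(7, 13, "LOC")]}),
]
spans = check_offsets(TRAIN_DATA)
```

Printing `spans` lets you confirm by eye that each label is attached to the text you intended.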

Answer selected by ines
Labels
feat / training Feature: Training utils, Example, Corpus and converters