Approaching this task as a span extraction problem seems like a sensible thing to do. It is hard to say ahead of time how many annotated examples you'd need to reach an acceptable precision/recall, since that depends very much on the variability in the job responsibility descriptions and on whether they are embedded in predictable contexts.

In general, we'd recommend iterating on your data and setting things up so that it's easy to do so. For example, Prodigy can speed up annotation by pre-annotating examples with a model trained on the annotations you've made so far, which also gives you an idea of how well a model does up to that point.
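
To keep track of how well a model does after each annotation round, it can help to score its predicted spans against a small held-out set. Below is a minimal sketch in plain Python, not spaCy's or Prodigy's own scoring code. The helper `span_prf`, the `(start, end, label)` tuple format, and the `RESPONSIBILITY` label are all assumptions made up for illustration, and it only counts exact matches, so a span with a slightly different boundary counts as wrong.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span format: (start_char, end_char, label)
Span = Tuple[int, int, str]


def span_prf(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over two collections of spans."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)
    # A prediction is correct only if start, end and label all match.
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: two gold spans, three predicted spans, one exact match.
gold = [(0, 24, "RESPONSIBILITY"), (30, 58, "RESPONSIBILITY")]
pred = [
    (0, 24, "RESPONSIBILITY"),   # exact match
    (30, 50, "RESPONSIBILITY"),  # boundary mismatch, counted as an error
    (60, 72, "RESPONSIBILITY"),  # spurious span
]
precision, recall, f1 = span_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F={f1:.2f}")
```

Re-running something like this on the same held-out examples after every batch of new annotations shows whether additional data is still improving precision/recall or whether the curve has flattened.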

With ~1000 examples, it's at least possible to make a reasonabl…

Answer selected by Salekeen
Labels
usage General spaCy usage feat / spancat Feature: Span Categorizer