For reference, SpanCategorizer development is in #6747.

I'm not working on that feature, so I could be wrong, but it sounds like it wasn't developed with your use case in mind: the typical spans are like NER spans, and any proposed span can be given a none-of-the-above category. That doesn't mean it wouldn't work, just that the features it provides might matter less to you. In particular, you have no overlapping spans, and every sentence must have a category label.

What would work right now is to create a separate doc from each sentence and store an original_doc_id on it as a document attribute. That would let you reconstruct the original documents from the sentences, perhaps using Doc…
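A minimal sketch of that idea, assuming spaCy v3, a blank English pipeline with the rule-based `sentencizer`, and a custom extension attribute for the ID. The sample texts, the `doc-a`/`doc-b` IDs, and the helper name `split_into_sentence_docs` are made up for illustration; grouping by ID is only one way to get back to the original documents.

```python
from collections import defaultdict

import spacy
from spacy.tokens import Doc

# Register the custom attribute; it is then available as doc._.original_doc_id.
# force=True only makes the snippet safe to re-run in the same session.
Doc.set_extension("original_doc_id", default=None, force=True)

# Assumption: a blank pipeline plus the rule-based sentencizer is enough
# for sentence splitting. A trained pipeline would work the same way.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Hypothetical input: original document IDs mapped to raw text.
texts = {
    "doc-a": "First sentence. Second one.",
    "doc-b": "Only one here.",
}


def split_into_sentence_docs(doc_id, text):
    """Return one standalone Doc per sentence, tagged with its source ID."""
    doc = nlp(text)
    sentence_docs = []
    for sent in doc.sents:
        sent_doc = sent.as_doc()  # copy the sentence span into its own Doc
        sent_doc._.original_doc_id = doc_id
        sentence_docs.append(sent_doc)
    return sentence_docs


sentence_docs = []
for doc_id, text in texts.items():
    sentence_docs.extend(split_into_sentence_docs(doc_id, text))

# Group the sentence docs back under their original document ID.
grouped = defaultdict(list)
for sent_doc in sentence_docs:
    grouped[sent_doc._.original_doc_id].append(sent_doc)

for doc_id, docs in grouped.items():
    print(doc_id, [d.text for d in docs])
```

Each sentence doc could then carry its category in `doc.cats`, which is where spaCy's text classifier reads and writes document-level labels.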

Answer selected by polm
Labels: usage (General spaCy usage), feat / textcat (Feature: Text Classifier), feat / spancat (Feature: Span Categorizer)