Skip to content
Discussion options

You must be logged in to vote

When training a SpanCat model, so the prediction is made using Transformer embedding which spans across immediate sentences, how should I set up the IOB dataset as well as training configuration?

The most important thing for your data is that your categories are well designed and the annotations reflect the output you want consistently. The architecture used to get features from your raw text will not change that.

It sounds like you have a lot of questions about how spaCy works internally. It's good to understand what your tools are doing, but in order to answer these it's probably easiest if you try the default settings, just to have a complete functional pipeline, and then tweak it fr…

Replies: 2 comments 3 replies

Comment options

You must be logged in to vote
3 replies
@egumasa
Comment options

@polm
Comment options

@egumasa
Comment options

Answer selected by egumasa
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
training Training and updating models feat / transformer Feature: Transformer
2 participants