Sorry you're having trouble with this. To make it easier for us to help you, please do not post screenshots of terminal output, which are hard to read, and please read the Markdown guide on formatting code blocks.

Scores that are all zero usually indicate a data problem. Did you try running `spacy debug data` to check whether there are problems with your data?
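As a rough sketch of that check, the snippet below builds and runs the `spacy debug data` command from Python. The file names `config.cfg`, `train.spacy` and `dev.spacy` are placeholders, not taken from this thread, and the `--paths.train` / `--paths.dev` overrides assume your config defines those keys under `[paths]`, as configs generated by spaCy normally do.

```python
import subprocess
import sys


def build_debug_data_command(config_path, train_path, dev_path):
    """Return the argument list for `python -m spacy debug data`.

    The two --paths.* overrides assume the config has a [paths] section
    with `train` and `dev` keys (an assumption about your config).
    """
    return [
        sys.executable, "-m", "spacy", "debug", "data", config_path,
        "--paths.train", train_path,
        "--paths.dev", dev_path,
    ]


def run_debug_data(config_path, train_path, dev_path):
    """Run the command and return (exit code, captured text output)."""
    cmd = build_debug_data_command(config_path, train_path, dev_path)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout
```

Calling `run_debug_data("config.cfg", "train.spacy", "dev.spacy")` with your own paths should give you the report as text, which you can paste into a code block instead of a screenshot.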

Is your text data Romanized? If not, then the base Transformer model, roberta-base, won't work well, as none of your words will be in the vocabulary. You could try using a Hindi Transformer model (just change the name in the config) or use a non-Transformer tok2vec (that's probably easiest).
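If you are not sure whether your data is Romanized, a quick check like the one below may help. It is only a heuristic sketch: the function names and the 0.9 threshold are made up for illustration, and it simply compares how many characters fall in the Devanagari Unicode block against how many are Latin letters.

```python
import unicodedata


def script_counts(text):
    """Count Latin letters and Devanagari-block characters in `text`."""
    counts = {"latin": 0, "devanagari": 0}
    for ch in text:
        # U+0900..U+097F is the Devanagari block; this includes vowel
        # signs, digits and the danda, not only letters.
        if "\u0900" <= ch <= "\u097f":
            counts["devanagari"] += 1
        # Latin letters have Unicode names starting with "LATIN".
        elif unicodedata.name(ch, "").startswith("LATIN"):
            counts["latin"] += 1
    return counts


def looks_romanized(text, threshold=0.9):
    """Return True if at least `threshold` of the counted characters are Latin.

    The threshold is an arbitrary illustrative value. Text with no Latin
    or Devanagari characters at all returns False.
    """
    counts = script_counts(text)
    total = counts["latin"] + counts["devanagari"]
    if total == 0:
        return False
    return counts["latin"] / total >= threshold
```

Running `looks_romanized` over a sample of your training texts should tell you quickly which of the two suggestions above applies to your data.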

Let us know if those don't help.

Answer selected by srikamalteja
Labels: training (Training and updating models), lang / hi (Hindi language data and models), feat / ner (Feature: Named Entity Recognizer)
This discussion was converted from issue #11724 on November 01, 2022 03:27.