
How does Simple Transformers handle split-by-space predictions for NER models? #1583

@FDSRashid

Description


I'm trying to understand how Simple Transformers, specifically its BERT NER models, produces outputs from the `predict` function when the `split_on_space` argument is used. It looks like it can do word-level NER with token-classification models via that argument, but I don't understand how it aggregates the model output for each space-separated word, given that the model predicts on a per-token basis, where tokens can be subwords.

From reading the source, it looks like `load_and_cache_examples` encodes each word with the help of `convert_examples_to_features`, then the model predicts, and then `_convert_tokens_to_word_logits` maps the token-level logits back to words. Is that right?

What I would like to do, at a high level, is the same thing Simple Transformers does: predict the correct class on a word-by-word basis using an existing token-classification model. I'm asking here because the model I'm using was trained with Simple Transformers, but I don't want all the extra machinery (batching, ONNX, etc.). What's the logic to tokenize each space-separated word and then predict on that particular split?
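For anyone looking at this later, here is a minimal sketch of the general approach, not Simple Transformers' actual code: split the sentence on spaces, encode the pre-split words with a Hugging Face fast tokenizer (which records which word each subword came from via `word_ids()`), run the model, and keep only the prediction for the first subword of each word. The `model_name` argument is a placeholder for your own checkpoint, and the first-subword aggregation is an assumption about what the word-level conversion boils down to.

```python
def first_subword_labels(logits, word_ids, id2label):
    """Collapse per-subword predictions to one label per word
    (first-subword strategy)."""
    labels, prev = [], None
    for row, wid in zip(logits, word_ids):
        if wid is None or wid == prev:
            continue  # special token ([CLS]/[SEP]) or continuation subword
        prev = wid
        scores = list(row)
        best = max(range(len(scores)), key=scores.__getitem__)
        labels.append(id2label[best])
    return labels


def predict_words(sentence, model_name):
    # Lazy imports so the aggregation helper above stays dependency-free.
    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)  # must be a fast tokenizer for word_ids()
    model = AutoModelForTokenClassification.from_pretrained(model_name)

    words = sentence.split()  # the split_on_space step
    enc = tokenizer(words, is_split_into_words=True,
                    return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits[0]  # shape: (num_subword_tokens, num_labels)
    labels = first_subword_labels(logits.tolist(), enc.word_ids(),
                                  model.config.id2label)
    return list(zip(words, labels))
```

The aggregation helper is kept separate from the model call so you can sanity-check it on hand-made logits without loading a checkpoint.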

Metadata

Assignees
    No one assigned

Labels
    stale (This issue has become stale)

Projects
    No projects

Milestone
    No milestone

Relationships
    None yet

Development
    No branches or pull requests