Hey,

Thanks for the question! Let me focus on the example data first:

data = [
    ("AGS", {"entities": [(0, 3, "CUST")]}),
    ("YML SERVICOS LTD", {"entities": [(0, 16, "CUST")]}),
    ("BORG GROUP", {"entities": [(0, 10, "CUST")]}),
    ("GRABCRANEX", {"entities": [(0, 10, "CUST")]}),
    ("GREEN SHIP", {"entities": [(0, 10, "CUST")]}),
]

If I understand correctly, the texts here are the names of entities rather than documents. When training a named entity recognizer, the goal is usually to find the names of entities within running text. The usual methodology is therefore to annotate texts that contain the entities, not to expose the model only to the entities themselves.
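To illustrate the difference, here is a minimal sketch of what "entities annotated within texts" could look like in the same `(text, {"entities": [...]})` format. The sentence templates and the `make_example` helper are hypothetical and only serve to show how the character offsets shift once a name sits inside a sentence; real training data should come from actual documents annotated in context, not from generated templates.

```python
# Hypothetical sentence templates, used only for illustration.
TEMPLATES = [
    "The invoice was issued to {name} last week.",
    "{name} confirmed the shipment on Monday.",
]

# A few of the customer names from the example data.
NAMES = ["AGS", "YML SERVICOS LTD", "BORG GROUP"]


def make_example(template, name, label="CUST"):
    """Fill the template and return (text, annotations) with character offsets."""
    # The entity starts where the placeholder was and spans len(name) characters.
    start = template.index("{name}")
    end = start + len(name)
    text = template.replace("{name}", name, 1)
    return text, {"entities": [(start, end, label)]}


train_data = [make_example(t, n) for t in TEMPLATES for n in NAMES]

for text, ann in train_data:
    for start, end, label in ann["entities"]:
        # Sanity check: the annotated span should be exactly the entity name.
        print(repr(text[start:end]), label, "in:", text)
```

Unlike the original list, each example here gives the model surrounding words to learn from, and the entity offsets no longer always start at 0 and cover the whole string.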

What you have here is a list of enti…

Answer selected by svlandeg
Labels: feat / ner (Feature: Named Entity Recognizer), feat / training (Feature: Training utils, Example, Corpus and converters)
3 participants