Removing and Adding new entity labels to en_core_web_lg #9494
-
I found that en_core_web_lg has the named entity recognition labels CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART. Of these, I only need DATE, EVENT, ORG, PERSON, TIME, and WORK_OF_ART; the others are not needed for my application. I also need a few additional entity labels. Is it possible to remove some of the existing labels and add new ones to this list?
Replies: 1 comment
-
Hi, @aniyyanz08
You can safely ignore or filter the labels you don't need:

    doc = nlp(SOME_TEXT)
    ents = [ent for ent in doc.ents if ent.label_ not in LIST_OF_FILTERS]

You can check the multiple NER demo on how to do that. From there, you can combine multiple components to include your new labels.
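To make the two steps concrete, here is a minimal sketch, not the demo's actual code. It keeps only the labels listed in the question and adds one new label through a rule-based `entity_ruler` placed before the pretrained `ner` component. The `FRAMEWORK` label, its pattern, the sample sentence, and the helper names `keep_wanted` and `demo` are all made up for illustration. A separately trained NER component could be combined in a similar way; the ruler is just the simplest thing to show.

```python
from typing import Iterable, List, Set

# Labels the question wants to keep from the pretrained model.
WANTED_LABELS: Set[str] = {"DATE", "EVENT", "ORG", "PERSON", "TIME", "WORK_OF_ART"}


def keep_wanted(ents: Iterable, wanted: Set[str]) -> List:
    """Return only the entities whose label_ is in `wanted`."""
    return [ent for ent in ents if ent.label_ in wanted]


def demo() -> None:
    # Imported here so the helper above can be used without loading spaCy.
    import spacy

    # Assumes en_core_web_lg is already installed.
    nlp = spacy.load("en_core_web_lg")

    # Assumption: FRAMEWORK is a hypothetical label and the pattern is illustrative.
    # Running the ruler before "ner" lets the statistical model respect its spans.
    ruler = nlp.add_pipe("entity_ruler", before="ner")
    ruler.add_patterns([{"label": "FRAMEWORK", "pattern": "spaCy"}])

    doc = nlp("Explosion released spaCy in Berlin on 1 February 2021.")

    # Overwrite doc.ents so downstream code only sees the labels we care about.
    doc.ents = keep_wanted(doc.ents, WANTED_LABELS | {"FRAMEWORK"})
    for ent in doc.ents:
        print(ent.text, ent.label_)
```

Calling `demo()` prints the remaining entities. Note that filtering this way only hides the unwanted labels after prediction; the model itself still predicts them, which is usually fine and avoids retraining.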