Noob: High-level usage question about entity recognition vs entity linking #8874
Replies: 1 comment 6 replies
Hi! You'd probably benefit from some more generic tutorials on some of these concepts, to get a better grasp of different NLP tools and how they relate to your use-case. That said, I'll try to add my 2 cents...
This is a classic NER problem, in which it doesn't really matter whether an entity ("Obama", "John") is broadly known or not. You'd tag both as "Person". The follow-up task of Entity Linking will worry about normalizing these to known identifiers (if possible at all). "School" and "Work" wouldn't typically be named entities though, as they are just common words. You didn't specify what you meant by "transfer learning". Typically, you'll want to train an NER model from scratch with your specific label/annotation scheme. You can have a look at using a transformer-based model to benefit from language model pretraining though: https://spacy.io/usage/embeddings-transformers
Yes, this is an entity linking problem. Either your knowledge base will need to know about the different possible variants/synonyms, or you'll have to implement some kind of heuristic / fuzzy matching to identify likely entities when a mention is not in the KB. Ideally, the entity linking step maps all occurrences and variants of an entity to the same unique ID. See https://github.com/explosion/projects/tree/v3/tutorials/nel_emerson for more background and an example implementation.
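To make the alias / fuzzy matching idea concrete, here is a minimal sketch in plain Python. It is not spaCy's entity linker and not the approach from the tutorial above: the `KB` dictionary, the entity IDs, the `link` helper and the `cutoff` value are all made up for illustration, and the fuzzy step simply uses `difflib` from the standard library.

```python
from difflib import get_close_matches

# Hypothetical per-user knowledge base: entity ID -> known aliases.
# IDs and aliases are illustrative only.
KB = {
    "P001": ["John", "Jonny", "John Smith"],
    "P002": ["Bubba", "Grandmother"],
    "G001": ["Work", "Acme Inc"],
}

# Reverse index from lowercased alias to entity ID.
ALIAS_TO_ID = {
    alias.lower(): ent_id
    for ent_id, aliases in KB.items()
    for alias in aliases
}


def link(mention, cutoff=0.8):
    """Return the entity ID for a mention, or None if nothing is close enough."""
    key = mention.strip().lower()
    # 1) Exact alias lookup.
    if key in ALIAS_TO_ID:
        return ALIAS_TO_ID[key]
    # 2) Fuzzy fallback for mentions that are not in the KB (assumed heuristic).
    close = get_close_matches(key, list(ALIAS_TO_ID), n=1, cutoff=cutoff)
    return ALIAS_TO_ID[close[0]] if close else None


if __name__ == "__main__":
    for mention in ["Jonny", "grandmother", "Acme Inc.", "Cybermen"]:
        print(mention, "->", link(mention))
```

A string-only lookup like this ignores context, so it cannot tell two different people called "John" apart. That disambiguation is what a trained entity linker would add on top.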
I'm not sure I understand the question, but yes, you can use DBpedia, Wikidata or any other existing knowledge base, either as is or as a basis to expand upon for your specific use-case.
This is not an "entity linking" problem, but a "relation extraction" problem. When a sentence reads "X is the enemy of Y" or "A works at B", you'd be able to link the two entities together with a specific relationship. See also https://github.com/explosion/projects/tree/v3/tutorials/rel_component
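As a rough illustration of what "linking two entities with a specific relationship" means, here is a tiny pattern-based sketch. It is a regex stand-in for prototyping, not the trained relation extraction component from the tutorial linked above. The relation labels, the `PATTERNS` table and the `extract_relations` helper are hypothetical.

```python
import re

# Hypothetical patterns: relation label -> regex with named groups
# for the two arguments of the relation.
PATTERNS = {
    "ENEMY_OF": re.compile(r"(?P<head>.+?) (?:is|are) the enemy of (?P<tail>.+?)\.?$"),
    "WORKS_AT": re.compile(r"(?P<head>.+?) works at (?P<tail>.+?)\.?$"),
}


def extract_relations(sentence):
    """Return (head, relation, tail) triples matched by the patterns above."""
    triples = []
    text = sentence.strip()
    for label, pattern in PATTERNS.items():
        match = pattern.match(text)
        if match:
            triples.append((match.group("head"), label, match.group("tail")))
    return triples


if __name__ == "__main__":
    print(extract_relations("The Cybermen are the enemy of John Smith."))
    print(extract_relations("John Smith works at Acme Inc"))
```

In a real pipeline the head and tail would be entity spans produced by NER (and normalized to IDs by entity linking) rather than raw substrings, and the relation would be predicted by a model rather than matched against fixed phrases.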
spaCy is amazing but I wondered if I could get a steer on how to break down my problem into spaCy solution/feature domains so that I can tackle an engineering prototype. I realize I may have bitten off more than I can chew, but such is life.
Users write a daily journal which recognizes entities in their personal world. These are a mix of real-world objects like London and referents that only have meaning in the context of their journal, e.g. John, Bubba, School, Work. These referents also have a specialized labeling scheme (e.g. Nature, Person, Group, Self). Is this a transfer learning custom NER problem?
Users have various eponyms for entities. E.g., 'John' is sometimes 'Jonny', or 'John Smith'. 'Bubba' is sometimes 'Grandmother'. 'Work' is sometimes 'Acme Inc'. Is this an entity linking problem? These eponyms are limited to their private discourse (for now).
Since some of these are real-world entities (e.g. Acme Inc, London), is this a matter of using a base knowledge graph such as DBpedia to underpin the individual user's entity linker?
Lastly, the entities in the user's personal journal world have a specific set of social membership relations between them. E.g., John Smith belongs to Acme Inc. Bubba is John Smith's grandmother. The Cybermen are the enemy of John Smith. Is this a custom entity linking problem? I am a bit confused about using spaCy for deducing the relationship between entities beyond disambiguating eponyms.