NER Model to identify Product names #12810
Unanswered
yellowy
asked this question in
Help: Other Questions
Replies: 0 comments
Hello guys
First I'd like to explain the problem and how I'm trying to solve it.
In our company we regularly get new databases from new customers (we want to add their catalogues to our app), but sadly they are usually a hot mess of product names, with different naming patterns for the same products.
My job is to harmonize the data. Usually we do that via the GTIN (Global Trade Item Number), but sometimes customers do not provide a GTIN in their database. In that case I have to read each entry manually, find the corresponding GTIN in a different database, and then manually override it. In total that is about 3 million rows I would have to read, which is of course not fun.
So my idea is to create an NER model for product names: I would label a few hundred examples manually (is that enough?) with tecoholic's NER annotator and then use the model to identify the product names in new datasets.
From there I can hopefully just build a lookup table to find the corresponding GTIN for each product name (my idea for now is a dict that maps product names to GTINs).
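To make the lookup idea concrete, here is a minimal sketch in Python. It assumes the product names have already been extracted and that a reference database with (product name, GTIN) pairs exists. The helper names `normalize`, `build_lookup` and `find_gtin`, the sample products, the placeholder GTIN strings and the 0.85 similarity cutoff are all made up for illustration. The fuzzy fallback via the standard library's `difflib` is an addition to the plain dict idea, included because the same product is often spelled slightly differently.

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def build_lookup(rows):
    """Build a dict from normalized product name to GTIN.

    rows: iterable of (product_name, gtin) pairs from the reference database.
    """
    return {normalize(name): gtin for name, gtin in rows}


def find_gtin(name, lookup, cutoff=0.85):
    """Return the GTIN for a product name, or None if nothing is close enough."""
    key = normalize(name)
    if key in lookup:  # exact match after normalization
        return lookup[key]
    # Fall back to the most similar known name (cutoff is an assumed threshold).
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


# Hypothetical reference data; the GTINs are placeholders, not real codes.
reference = [
    ("Acme Cola 330ml", "00000000000017"),
    ("Acme Cola Zero 330ml", "00000000000024"),
]
lookup = build_lookup(reference)
```

With this, `find_gtin("ACME  Cola, 330ml", lookup)` resolves through the exact path after normalization, while a variant such as `"Acme Cola 330 ml"` goes through the fuzzy fallback. On millions of rows, comparing every unmatched name against all known names would likely be slow, so some form of pre-filtering would probably be needed.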
Can anyone tell me if what I'm imagining makes sense? I'd love to hear criticism or advice.
Thank you for taking your time!