Variability of words / phrases that are matching with specific label #7570
Replies: 1 comment 8 replies
-
Whether you can get good results depends on the situation in which you want to recognize street addresses. Are you recognizing street addresses written in isolation? Or do you want to parse a whole address, with street, city, country, and so on, and isolate the street part? In that case the pretrained spaCy models won't help you much, because they're trained on text like newspaper articles, which use complete sentences. But you can train a custom model that can learn how commas and words like "City" or "St." are significant.

On the other hand, if you're recognizing addresses in sentences ("Mr. Smith lives at 2 3/8 Strawberry Lane..."), the existing models can be a good starting point.

Either way, your current training data is unhelpful: the point of training data is to show the kind of things you want to find in a wide variety of situations, so repeating the same template over and over again won't work.

What kind of text do you want to use your address matcher on? Do you want to parse the address into parts or just recognize it?
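To make the point about variety concrete, here is a minimal sketch of how annotated examples could place the same streets in several different contexts. It uses only the standard library. The `STREET` label, the template sentences, and the helper names `make_example` and `build_examples` are all hypothetical, chosen for illustration; the addresses are the ones from the question. Real training data should come from the kind of text you actually want to process, not from a handful of templates.

```python
# Addresses copied from the question: (street part, city/state/ZIP part).
ADDRESSES = [
    ("8237 Monroe Drive", "Bountiful, UT 84010"),
    ("16 Dogwood Ave.", "Grand Island, NE 68801"),
    ("634 Poplar Ave.", "Elyria, OH 44035"),
    ("79 Sheffield Dr.", "Cranford, NJ 07016"),
    ("97 Colonial Dr.", "Salem, MA 01970"),
    ("211 Oakland St.", "Yakima, WA 98908"),
]

# Hypothetical contexts: isolated addresses and addresses inside sentences.
TEMPLATES = [
    "{street}\n{city}",
    "{street}, {city}",
    "Mr. Smith lives at {street} in {city}.",
    "Our office moved to {street} last year.",
]


def make_example(template, street, city, label="STREET"):
    """Fill one template and return (text, [(start, end, label)]).

    The offsets are character positions of the street span in the text.
    """
    before, after = template.split("{street}", 1)
    prefix = before.format(city=city)
    text = prefix + street + after.format(city=city)
    start = len(prefix)
    return text, [(start, start + len(street), label)]


def build_examples():
    """Combine every address with every context template."""
    return [
        make_example(template, street, city)
        for street, city in ADDRESSES
        for template in TEMPLATES
    ]


if __name__ == "__main__":
    for text, spans in build_examples()[:4]:
        start, end, label = spans[0]
        print(repr(text), "->", label, repr(text[start:end]))
```

Character-offset annotations like these can then be turned into spaCy `Doc` objects (for example with `doc.char_span(start, end, label=label)`) and saved in a `DocBin` for training. That conversion step is not shown here.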
-
I want to make an NER model that will recognise street addresses.
That kind of data has a lot of variability, for example:
[8237 Monroe Drive
Bountiful, UT 84010
16 Dogwood Ave.
Grand Island, NE 68801
634 Poplar Ave.
Elyria, OH 44035
79 Sheffield Dr.
Cranford, NJ 07016
97 Colonial Dr.
Salem, MA 01970
211 Oakland St.
Yakima, WA 98908]
...
+ many more names (which depend on the country, etc.)
...
And if I want to train my model for street recognition:
Is it possible to get good results?
I mean, street names can have 1-4 words in them, they can contain numbers, etc.
Is it possible to get good results even though I'll always get different street addresses?