Named Entity Recognition and Extraction Relation #10159
Replies: 2 comments 3 replies
The relation extraction sample project has documentation on how to use it. Unfortunately, I don't think relation extraction will actually help with your problem. It's used to find facts implied by sentence structure, like:
Here the model learns that "is in" and "is located in" are equivalent. This is a simple case, but the idea is to learn more complicated ones. However, it relies on well-formed sentences, or a certain similarity in the surrounding text. I don't think there is any of that in the example document you had. You could probably make training data and train a model, but I wouldn't expect it to work very well.
@imhans33 You may want to try out the Crawler module in Haystack. In particular, see the filter_urls option. Documentation is here: https://haystack.deepset.ai/reference/crawler What kind of documentation are you looking for?
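To make the idea of filtering links by URL pattern concrete, here is a minimal sketch that uses only the Python standard library. It is not the Haystack Crawler and does not reproduce its API; the names HrefCollector and filter_hrefs are made up for this illustration, and the pattern is an assumption based on the two ticket URLs quoted in the question.

```python
import re
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; keep only non-empty hrefs.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def filter_hrefs(html, patterns):
    """Return the hrefs that match at least one regular expression."""
    collector = HrefCollector()
    collector.feed(html)
    compiled = [re.compile(p) for p in patterns]
    return [h for h in collector.hrefs if any(c.search(h) for c in compiled)]


# The first link is an invented placeholder; the other two come from the question.
page = (
    '<a href="https://example.com/about">About</a>'
    '<a href="https://thebroadstage.secure.force.com/ticket/#/instances/a0F5G00000L12hAUAR">'
    '<a href="https://thebroadstage.secure.force.com/ticket/#/instances/a0F5G00000L12hKUAR">'
)

# Assumption: ticket links share the "/ticket/#/instances/" path fragment.
ticket_links = filter_hrefs(page, [r"/ticket/#/instances/"])
print(ticket_links)
```

This only works when the wanted links share a recognisable URL shape, which seems to hold for the two examples given but may not hold for other sites.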
Hi,
I have a use case where I need to identify specific URLs in a web page's source data, such as the ones behind "Book Tickets". The following is an example of the data I am dealing with.
These two URLs
<a href="https://thebroadstage.secure.force.com/ticket/#/instances/a0F5G00000L12hAUAR"><a href="https://thebroadstage.secure.force.com/ticket/#/instances/a0F5G00000L12hKUAR">
are related to "Buy Tickets". Is it possible to detect the URLs that are associated with "Buy Tickets"?
While searching through the features of spaCy, I found that NER plus relation extraction might solve this kind of problem. If so, how can we create the dataset and train a custom model? Is there any documentation available on custom models for this kind of use case?
Also, there can be several <a href> tags in one page's source data, but only the tags that are close to "Buy Tickets" are the required ones. In that case, should we tag all the <a href> tags as entities, or only the required ones?
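For reference, the "links close to a label" idea can also be expressed without a trained model. The sketch below is a rule-based illustration using Python's standard html.parser, not a spaCy feature and not something proposed in this thread. It assumes the label is the link's own visible text, which may not match the real page, and the class name, function name, and example.com URLs are invented for the example.

```python
from html.parser import HTMLParser


class LabelledLinkParser(HTMLParser):
    """Collect hrefs of <a> tags whose visible text contains a label."""

    def __init__(self, label):
        super().__init__()
        self.label = label.lower()
        self.matches = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so "Buy\n Tickets" still matches.
            text = " ".join("".join(self._text).split()).lower()
            if self.label in text:
                self.matches.append(self._href)
            self._href = None


def links_labelled(html, label):
    """Return hrefs of links whose text contains `label` (case-insensitive)."""
    parser = LabelledLinkParser(label)
    parser.feed(html)
    return parser.matches


# Invented sample markup; the real page structure is not shown in the thread.
sample = (
    '<a href="https://example.com/about">About us</a>'
    '<a href="https://example.com/ticket/1"><span>Buy\n Tickets</span></a>'
    '<a href="https://example.com/contact">Contact</a>'
)
found = links_labelled(sample, "Buy Tickets")
print(found)
```

If the label sits next to the link rather than inside it, the matching rule would need to look at sibling or parent elements instead, which this sketch does not attempt.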