What is the order of pattern matching? #13219
Unanswered
dglopes
asked this question in
Help: Coding & Implementations
Replies: 0 comments
The code below is an attempt to go through a sentence and look for spans that match the patterns. I have many patterns for the search, but here I am only showing one group, which assigns the label "COMPONENT" to the Span.
Note that I'm storing the patterns in a dictionary in the "padroes" variable:
Given the text:
text = "Os edifícios multifamiliares devem ser providos de proteção contra descargas atmosféricas, atendendo ao estabelecido na ABNT NBR 5419 e demais Normas Brasileiras aplicáveis, nos casos previstos na legislação vigente."
The expected result is:
But it only returns:
It is using the 3rd pattern in the dictionary. Since I'm using a function that searches in pattern order, I expected the first pattern to be used, which is: [{"POS": "NOUN"},{"POS": "ADP"},{"POS": "NOUN"},{"POS": "ADJ"}]
Here is the code I'm using:
My problem is that in the code above, even if I arrange the patterns in the order I want, the matching function "skips" some patterns and seems to return only the result of the pattern [{"POS": "NOUN"},{"POS": "ADP"},{"POS": "NOUN"}], no matter where it sits relative to the other patterns in the dictionary. The first pattern only takes part in the search when that shorter pattern is deleted.
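For reference, here is a minimal sketch of how the two patterns behave when added under one key without "greedy". It is not the code from this question: the four tokens and their POS tags are hand-assigned assumptions (so no trained Portuguese pipeline is needed), and `run` is a hypothetical helper. As far as I understand, `Matcher` does not treat the order of the pattern list as a priority, so both orderings should report the same two overlapping spans:

```python
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Assumption: hand-assigned POS tags stand in for a real tagger's output.
words = ["proteção", "contra", "descargas", "atmosféricas"]
pos = ["NOUN", "ADP", "NOUN", "ADJ"]
doc = Doc(Vocab(), words=words, pos=pos)

long_pattern = [{"POS": "NOUN"}, {"POS": "ADP"}, {"POS": "NOUN"}, {"POS": "ADJ"}]
short_pattern = [{"POS": "NOUN"}, {"POS": "ADP"}, {"POS": "NOUN"}]


def run(patterns):
    """Hypothetical helper: match `patterns` under one key, return sorted span texts."""
    matcher = Matcher(doc.vocab)
    matcher.add("COMPONENT", patterns)
    return sorted(doc[start:end].text for _, start, end in matcher(doc))


long_first = run([long_pattern, short_pattern])
short_first = run([short_pattern, long_pattern])
print(long_first)
print(short_first)
```

If both spans show up here, the "skipping" is more likely to come from how the results are post-processed (for example, keeping only one span per position) than from the matcher itself.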
After some research I found the "greedy" argument, and it resolves the problem.
BUT...
When I use the "greedy" argument in matcher.add, the result changes... in a strange way.
With greedy="FIRST", it takes the match that starts first in the sentence, but it also uses the longest (correct) pattern and does not repeat the error described above (!)
With greedy="LONGEST", it takes the longest match, which is exactly what I need.
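One possible explanation for the "FIRST" behaviour, sketched in plain Python: as I understand it, both modes sort the overlapping matches of a key and then keep only spans that do not reuse an already-claimed token. "FIRST" sorts by start position and breaks ties in favour of the longer span, which would explain why it also picks the four-token pattern here. The function `filter_overlaps` below is a hypothetical, simplified model of that idea, not spaCy's actual implementation:

```python
def filter_overlaps(pairs, greedy):
    """Keep non-overlapping (start, end) token spans, ranked by the `greedy` mode."""
    if greedy == "FIRST":
        # Earliest start wins; among equal starts, the longer span wins.
        ordered = sorted(pairs, key=lambda p: (p[0], -p[1]))
    elif greedy == "LONGEST":
        # Longest span wins; among equal lengths, the earlier start wins.
        ordered = sorted(pairs, key=lambda p: (-(p[1] - p[0]), p[0]))
    else:
        raise ValueError("greedy must be 'FIRST' or 'LONGEST'")

    taken = set()  # token indices already claimed by a kept span
    kept = []
    for start, end in ordered:
        tokens = set(range(start, end))
        if taken.isdisjoint(tokens):
            kept.append((start, end))
            taken |= tokens
    return kept


# Two matches starting at the same token, as in the question: both modes agree.
same_start = [(0, 3), (0, 4)]
# Two overlapping matches with different starts: the modes disagree.
staggered = [(0, 2), (1, 5)]

for mode in ("FIRST", "LONGEST"):
    print(mode, filter_overlaps(same_start, mode), filter_overlaps(staggered, mode))
```

Under this model the two modes only differ when a shorter match starts earlier than a longer, overlapping one.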