Could you share how you're writing the EntityRuler patterns for this and how you're adding them to your pipeline, so I can see what you've tried so far?

In the examples I've tried, numbers with slashes are consistently kept as single tokens and tagged NUM, even when the phrase as a whole isn't recognized as a QUANTITY by the default NER. This should allow you to write an EntityRuler pattern that matches a NUM followed by certain NOUNs and labels the span as QUANTITY.

For example, the slash causes no problem here:

>>> import spacy
>>> nlp = spacy.load("en_core_web_lg")
>>> [(tok, tok.pos_) for tok in nlp("1 1/2 cups of water")]
[(1, 'NUM'), (1/2, 'NUM'), (cups, 'NOUN'), (of, 'ADP'), (water, 'NOUN')]
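
Building on that tokenization, here is a minimal sketch of what such a rule might look like. It is not necessarily the pattern you need. To keep it runnable without a trained model, it assumes a blank English pipeline and matches on `LIKE_NUM` plus a small, hypothetical list of unit words (`UNITS`) instead of the `NUM` and `NOUN` tags:

```python
import spacy

# Hypothetical unit list, for illustration only; extend it for your own data.
UNITS = ["cup", "cups", "tablespoon", "tablespoons", "teaspoon", "teaspoons"]

# A blank English pipeline is enough here because the pattern below relies
# only on lexical attributes (LIKE_NUM, LOWER), not on the tagger or NER.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {
        "label": "QUANTITY",
        "pattern": [
            {"LIKE_NUM": True, "OP": "+"},  # one or more number-like tokens, e.g. "1", "1/2"
            {"LOWER": {"IN": UNITS}},       # followed by one of the unit words
        ],
    }
])

doc = nlp("1 1/2 cups of water")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

With a trained pipeline such as `en_core_web_lg`, the same idea should carry over by adding the ruler with `nlp.add_pipe("entity_ruler", before="ner")` and, if you prefer, swapping the token checks for `{"POS": "NUM"}` and `{"POS": "NOUN"}`.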

The forward slash (/) doe…

Answer selected by svlandeg