Hi,

If you take a look at the tokens in the doc, you will see that "cannot" is separated into 2 tokens: 'can' and 'not'.
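
To check this yourself, here is a minimal sketch that prints the tokens. It assumes spaCy v3 and a blank English pipeline (`spacy.blank("en")`), which is enough to exercise the tokenizer; the sample sentence is made up for illustration and is not your original text.

```python
import spacy

# Assumption: a blank English pipeline; only the tokenizer matters here.
nlp = spacy.blank("en")

# Hypothetical sample text containing "cannot".
doc = nlp("I cannot world")

# Collect the token texts to see how the tokenizer split the input.
tokens = [token.text for token in doc]
print(tokens)
```

The English tokenizer exceptions are what split "cannot", so any pipeline using the same tokenizer should show the same split.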

Since Matcher patterns describe individual tokens, your pattern looks for a single token 'cannot' followed by a token 'world'. No such 'cannot' token exists in the doc, so the pattern never matches.

This pattern works in your case, because it describes one token per dict: pattern = [{"LOWER": "can"}, {"LOWER": "not"}, {"LOWER": "world"}]
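
As a rough sketch of the difference, the snippet below registers both patterns and shows which one fires. It assumes the spaCy v3 `Matcher.add(key, patterns)` signature and a blank English pipeline; the rule names `SINGLE_TOKEN` and `SPLIT_TOKENS` and the sample sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank English pipeline and a hypothetical sample text.
nlp = spacy.blank("en")
doc = nlp("I cannot world")

matcher = Matcher(nlp.vocab)

# Treats "cannot" as one token, which the tokenizer never produces.
matcher.add("SINGLE_TOKEN", [[{"LOWER": "cannot"}, {"LOWER": "world"}]])

# Follows the actual tokenization: "can" + "not" + "world".
matcher.add(
    "SPLIT_TOKENS",
    [[{"LOWER": "can"}, {"LOWER": "not"}, {"LOWER": "world"}]],
)

# Map each rule name that matched to the text of the matched span.
matched = {
    nlp.vocab.strings[match_id]: doc[start:end].text
    for match_id, start, end in matcher(doc)
}
print(matched)
```

Only the pattern that mirrors the tokenization should produce a match.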

The PhraseMatcher works because its patterns are built from text run through the same tokenizer, so they automatically respect the tokenizer's behaviour.
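
For comparison, here is a minimal PhraseMatcher sketch. It assumes spaCy v3, a blank English pipeline, and `attr="LOWER"` for case-insensitive matching; the rule name `CANNOT_WORLD` and the sample sentence are made up for illustration.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Assumption: a blank English pipeline; only the tokenizer is needed.
nlp = spacy.blank("en")

# Match on the lowercase form of each token.
phrase_matcher = PhraseMatcher(nlp.vocab, attr="LOWER")

# The pattern is a Doc, so "cannot" is already split into "can" + "not".
pattern_doc = nlp.make_doc("cannot world")
phrase_matcher.add("CANNOT_WORLD", [pattern_doc])

# Hypothetical sample text.
doc = nlp("I cannot world")
spans = [doc[start:end].text for _, start, end in phrase_matcher(doc)]
print(spans)
```

Because the pattern and the text go through the same tokenizer, the phrase can be written as plain text without worrying about how it splits.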

When writing patterns for Matcher, you need to pay attention to the tokenization, especially when it comes to compound words.

Answer selected by adrianeboyd