Have you double-checked that C9120.d244 and C9120.d244.x are each exactly one token with your tokenizer, and that the other tokens have the POS tags you expect?
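One way to run that check is to print what the pipeline actually produces for each phrase. This is only a minimal sketch: the example phrases and the helper name `describe_tokens` are made up for illustration, and it falls back to a blank English pipeline (tokenizer only, so the POS and tag columns stay empty) if en_core_web_sm isn't installed.

```python
import spacy


def describe_tokens(nlp, text):
    """Return (text, pos, tag) for every token the pipeline produces for `text`."""
    doc = nlp(text)
    return [(token.text, token.pos_, token.tag_) for token in doc]


try:
    # Trained pipeline: gives both tokenization and POS tags.
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Pipeline not installed: tokenizer only, pos_ and tag_ will be empty strings.
    nlp = spacy.blank("en")

# Hypothetical phrases; substitute the text your pattern is supposed to match.
for example in ("part C9120.d244 failed", "part C9120.d244.x failed"):
    print(example)
    for row in describe_tokens(nlp, example):
        print("   ", row)
```

If an identifier shows up as more than one row, the tokenizer is splitting it, and a pattern that allots a single token to it won't line up.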

As long as each of them is exactly one token, I don't think the "." is the cause: an empty {} in a pattern always matches exactly one token, whatever its text. So it's more likely a different part of your pattern that isn't matching the way you expect.
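To illustrate the point about {}, here is a small sketch with a hypothetical pattern and hand-built docs (the words are supplied directly, so the result doesn't depend on any particular tokenizer). The pattern name, the surrounding words, and the helper `matched_texts` are assumptions for the example, not taken from the original question.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)
# Hypothetical pattern: "part", then any single token ({}), then "failed".
matcher.add("PART_FAILED", [[{"LOWER": "part"}, {}, {"LOWER": "failed"}]])


def matched_texts(words):
    """Build a Doc from pre-split words and return the text of every match."""
    doc = Doc(nlp.vocab, words=words)
    return [doc[start:end].text for _, start, end in matcher(doc)]


# The identifier is a single token: {} matches it, dots and all.
one_token = matched_texts(["part", "C9120.d244.x", "failed"])
# The identifier is split into three tokens: a single {} can't cover them.
split_tokens = matched_texts(["part", "C9120", ".", "d244", "failed"])

print(one_token)
print(split_tokens)
```

The first call returns the full phrase and the second returns no matches, which is why the tokenization check comes before suspecting the "." itself.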

With short phrases like these, the tags are often incorrect because there isn't enough context. In addition, if you're using one of the provided trained pipelines such as en_core_web_sm, the tagger may struggle because this isn't the kind of data it was trained on (mainly full newspaper-style sentences).

Answer selected by asquare
Labels
feat / matcher Feature: Token, phrase and dependency matcher