matching wildcard tokens that contain a period with dependency parser #10888
Hi, I'm trying to match tokens that contain a period against a wildcard pattern using the `DependencyMatcher`. The (simplified) example below works if the period in the middle token is replaced with an underscore, but not with a period. Two periods, however, work fine. I need to be able to handle middle (
Text (does not work):
Text (works):
Pattern:
Any suggestions? Thanks in advance!
Replies: 1 comment 1 reply
Have you double-checked that both `C9120.d244` and `C9120.d244.x` are exactly one token with your tokenizer and that the other tokens have the expected POS?

As long as they're exactly one token, I don't think it's actually due to the `.`, since `{}` should always match exactly one token, so it might be a different part of your pattern that's not matching like you expect.

With these kinds of short phrases, the tags are often incorrect because there's not enough context. In addition, if you're using one of the provided trained pipelines like `en_core_web_sm`, it may be because it's not the kind of data the tagger was trained on (mainly full newspaper-style sentences).
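The checks suggested above could be sketched roughly as follows. This is only an illustration, not the original poster's code: the sentence, the hand-written parse, and the pattern names (`verb`, `obj`, `modifier`, `WILDCARD_DEMO`) are made up for the example. It uses a blank English pipeline, so only the tokenizer runs, and the parse is supplied by hand to take the tagger and parser out of the picture.

```python
import spacy
from spacy.matcher import DependencyMatcher
from spacy.tokens import Doc

# Blank pipeline: tokenizer only, no tagger or parser.
nlp = spacy.blank("en")

# Step 1: see how the tokenizer splits the strings in question.
for text in ["C9120.d244", "C9120.d244.x"]:
    print(text, "->", [t.text for t in nlp(text)])

# Step 2: build a Doc with a hand-written (hypothetical) parse so the
# wildcard can be tested independently of any trained model.
words = ["replace", "C9120.d244", "part"]
heads = [0, 2, 0]                      # absolute head index per token
deps = ["ROOT", "compound", "dobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

matcher = DependencyMatcher(nlp.vocab)
pattern = [
    # Anchor: the verb "replace".
    {"RIGHT_ID": "verb", "RIGHT_ATTRS": {"LOWER": "replace"}},
    # Its direct object.
    {"LEFT_ID": "verb", "REL_OP": ">", "RIGHT_ID": "obj",
     "RIGHT_ATTRS": {"DEP": "dobj"}},
    # Wildcard: any single token that is a direct child of the object.
    {"LEFT_ID": "obj", "REL_OP": ">", "RIGHT_ID": "modifier",
     "RIGHT_ATTRS": {}},
]
matcher.add("WILDCARD_DEMO", [pattern])

# Each match is (match_id, token_ids), with token_ids in pattern order.
matches = matcher(doc)
for match_id, token_ids in matches:
    print([doc[i].text for i in token_ids])
```

If the wildcard matches here but not in the real pipeline, the difference is more likely to come from the predicted tags or dependencies than from the period itself. With a trained pipeline loaded, printing `token.text`, `token.pos_`, `token.dep_` and `token.head.text` for each token shows what the matcher actually sees.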