Multiple roots in a dependency tree issue #10627
I have a Thai dataset of 13,000 sentences and have trained a POS tagger and parser pipeline on it. At inference time the parser sometimes splits the original sentence into two, and I was wondering whether there is a way to force the parser to produce a dependency tree with exactly one root. The documentation mentions
If yes, would there be any side effects from enforcing sentence boundaries like this?
Replies: 1 comment
Yes, you can add a custom component if you want to add sentence boundaries so that the parser treats each doc as a single sentence: https://spacy.io/usage/processing-pipelines#custom-components
The parses might be odd if the texts are very different from the kinds of sentences seen during training, but it should work technically.
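For illustration, here is a minimal sketch of such a component. The component name `single_sentence` is made up for this example, and the demo runs on a blank English pipeline only so that it works without a trained model. In a trained pipeline the component would presumably need to run before the parser, since the parser respects sentence boundaries that are already set.

```python
import spacy
from spacy.language import Language


@Language.component("single_sentence")  # hypothetical component name
def single_sentence(doc):
    """Mark the whole doc as one sentence by presetting sentence starts."""
    for token in doc:
        # Only the first token may start a sentence; all others are
        # explicitly marked as not starting one.
        token.is_sent_start = token.i == 0
    return doc


# Demo on a blank English pipeline, which has no parser.
# With a trained pipeline you would instead add the component like this:
#   nlp.add_pipe("single_sentence", before="parser")
nlp = spacy.blank("en")
nlp.add_pipe("single_sentence")

doc = nlp("This is one part. This is another part.")
sentences = list(doc.sents)
```

With the boundaries fixed this way, `doc.sents` yields a single span covering the whole text, and a parser that runs afterwards should not be able to introduce a sentence break inside it.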