Example.get_aligned_parse can return multiple dependency labels for a single token #9718
-
The documentation for With the default option of
I think most people would assume that you get a single valid dependency label for each token. This causes a lot of extra debugging work and headaches down the road. Seems worth mentioning in the documentation!
-
This kind of dependency label is the expected output for the pseudo-projective dependency parsing algorithm (Nivre and Nilsson, 2005) cited in the method description for projectivize=True. The idea is that the pseudo-projective version can be converted back to the original non-projective dependency tree in nearly all cases. If you strip the label after the || separator, you could have a dependency tree that's projective and that contains the same l… There are no built-in options for other projectivization algorithms in spaCy, but as long as you end up with projective trees, you can preprocess your data however you'd like before training the parser.
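As a rough illustration of stripping the part after the || separator, here is a minimal sketch in plain Python. It is not spaCy's own code: the helper names split_decorated and strip_decoration are hypothetical, the example labels are made up, and it assumes the decorated form is the token's own label, then ||, then a second label.

```python
# Separator seen in the decorated labels discussed in this thread.
DELIMITER = "||"


def split_decorated(label: str) -> tuple[str, str]:
    """Split a label into (part before '||', part after '||').

    The second element is an empty string for undecorated labels.
    """
    before, _, after = label.partition(DELIMITER)
    return before, after


def strip_decoration(label: str) -> str:
    """Keep only the part of the label before the separator."""
    return split_decorated(label)[0]


# Made-up labels for illustration; not output from a real parse.
labels = ["nsubj", "dobj||xcomp", "ROOT"]
stripped = [strip_decoration(label) for label in labels]
print(stripped)
```

Applying this to labels before training would give one plain label per token, at the cost of no longer being able to recover the original non-projective tree from the labels.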
-
@adrianeboyd, thank you, your explanation is very helpful. An etiquette question: the GitHub new issue process invites us to submit suggestions for improving the documentation, and that was the purpose of the issue I submitted. Is this kind of request to update the documentation helpful? I appreciate that you are all very busy and I don't want to needlessly take your time.
-
It is part of training. This is my specific situation:
I did delve into the parser code and was pleasantly surprised to see that the parser properly handles the || notation in the dependency labels. It was just really hard to find, since you can't search for "||" on GitHub!
This kind of dependency label is the expected output for the pseudo-projective dependency parsing algorithm (Nivre and Nilsson, 2005) cited in the method description for projectivize=True. To be more specific, this method uses the "Head" decoration scheme. This is only an internal representation used within the parser, which converts it back to non-projective trees containing only the original labels for the final Doc annotation.
The idea is that the pseudo-projective version can be converted back to the original non-projective dependency tree in nearly all cases. If you strip the label after the || separator, you could have a dependency tree that's projective and that contains the same l…