Using parsed dependencies in SpanCategorizer suggester with transformers #10139
Hello! I am implementing a custom suggester that uses dependency parsing information (e.g., noun chunks) to suggest spans to the SpanCategorizer. I basically follow the config suggested here: and added:
as well as
In my custom suggester, I still get an error: It has been suggested here: that you need to add
Replies: 1 comment 8 replies
Noun chunks require both token.pos and a parse, and POS usually comes from either tagger + attribute_ruler or morphologizer in the provided trained pipelines. (Sorry, the error message looks like it's gotten a bit out-of-date.)

For German, it's: tok2vec/transformer, morphologizer, parser. Both the morphologizer and the parser listen to the same transformer, and you don't want to duplicate the transformer component that many times with replace_listeners (it would be both huge and slow), so it would be better to use a custom name for the spancat's transformer component and use that as upstream for the spancat listener. Put all the new components after the existing frozen components. So:

pipeline = ["transformer", "morphologizer", "parser", "transformer_spancat", "spancat"]
...
[components.spancat.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "transformer_spancat"

You don't need to add

This is not anything we published officially and it's messy and inefficient, but I implemented a noun chunk suggester as a proof-of-concept here, which does noun chunk +/- two tokens as the suggested spans, look in
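
To make the "noun chunk +/- two tokens" idea mentioned in this reply a bit more concrete, here is a minimal sketch of one possible reading of it: every noun chunk contributes candidate spans whose start and end are each shifted by up to two tokens in either direction. This is not the linked proof-of-concept; the function name `expand_chunks` and the `window` parameter are hypothetical, and the code is plain Python so it runs without spaCy installed. In an actual suggester the chunk offsets would presumably come from something like `[(c.start, c.end) for c in doc.noun_chunks]`, and the resulting offsets would be packed into a thinc `Ragged` together with the number of spans per doc.

```python
from typing import List, Sequence, Tuple

# (start, end) token offsets, end exclusive, as with spaCy's Span.start / Span.end
Offsets = Tuple[int, int]


def expand_chunks(
    chunks: Sequence[Offsets], doc_len: int, window: int = 2
) -> List[Offsets]:
    """Hypothetical helper: widen or narrow each chunk by up to `window` tokens.

    Assumption: "+/- two tokens" means both boundaries may move independently
    by 0..window tokens in either direction. Spans are clipped to the document
    and empty spans are dropped. Duplicates produced by overlapping chunks are
    removed, and the result is sorted for determinism.
    """
    candidates = set()
    for start, end in chunks:
        first_start = max(0, start - window)
        last_start = min(doc_len, start + window)
        first_end = max(0, end - window)
        last_end = min(doc_len, end + window)
        for new_start in range(first_start, last_start + 1):
            for new_end in range(first_end, last_end + 1):
                if new_start < new_end:  # keep only non-empty spans
                    candidates.add((new_start, new_end))
    return sorted(candidates)


if __name__ == "__main__":
    # One chunk covering tokens 2..3 in a 5-token doc, boundaries moved by at most 1.
    print(expand_chunks([(2, 4)], doc_len=5, window=1))
```

Note that the number of candidates grows roughly with the square of `window` per chunk, which fits the remark that this kind of suggester is inefficient.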