spaCy as SRL? #2336
Replies: 7 comments
-
The issue is that semantic roles depend on sentence semantics. They are of course related to the dependency parse, but they require more than purely syntactic information. A good SRL system should also contain statistical components to correctly evaluate the result of the dependency parse. In your example sentence there are 3 NPs. There's no good way to tell whether the last 2 NPs are AGENT, INSTRUMENT or PLACE from the dependency parse result alone. https://web.stanford.edu/~jurafsky/cl01.pdf explains it well. Short answer: purely on the dependency parse, no; you need a semantic component as well. Cheers,
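To make the point concrete, here is a small hand-built sketch. The sentences, dependency paths, and role names (including COMITATIVE) are illustrative assumptions written by hand, not output from spaCy or any parser. It shows that the same syntactic path to the verb can carry different roles, so a rule keyed only on the path cannot separate them.

```python
from collections import defaultdict

# Hand-annotated toy data: (sentence, argument head, path to the verb, role).
# All labels are illustrative assumptions, not actual parser output.
EXAMPLES = [
    ("She cut the bread with a knife", "knife", ("prep:with", "pobj"), "INSTRUMENT"),
    ("She cut the bread with a friend", "friend", ("prep:with", "pobj"), "COMITATIVE"),
    ("She cut the bread in the kitchen", "kitchen", ("prep:in", "pobj"), "PLACE"),
]


def roles_by_path(examples):
    """Group the roles observed for each syntactic path."""
    grouped = defaultdict(set)
    for _sentence, _head, path, role in examples:
        grouped[path].add(role)
    return dict(grouped)


def ambiguous_paths(examples):
    """Return only the paths that map to more than one role."""
    return {
        path: roles
        for path, roles in roles_by_path(examples).items()
        if len(roles) > 1
    }


if __name__ == "__main__":
    for path, roles in ambiguous_paths(EXAMPLES).items():
        print(path, sorted(roles))
```

Telling "knife" from "friend" here needs lexical or semantic knowledge about the words themselves, which is what the statistical component would have to supply.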
-
Thank you @DuyguA for your contribution. To make my question clearer: imagine I have a full spaCy processing pipeline including tokenization, tagging, parsing, and finally (but not necessarily) NER. One step before NER I want to add a custom pipeline step: SRL. To implement the SRL step I have many choices, including using spaCy's parser component but trained in a different way: trained on SRL data. Now my question: given that the required information is available, can spaCy's parser component be trained on SRL data and used for SRL? Thanks a lot!
-
OK, now I understand the question completely; you want to add SRL to the pipeline. Short answer: yes, why not 😁 One can train the tagger directly on the output of the dependency parser plus the POS tags. If you need annotated data, CoNLL-2005 and CoNLL-2012 have it. I googled CoNLL-2012 for you: http://conll.cemantix.org/2012/task-description.html check out the dataset 😉 I use the end-to-end SRL model from https://homes.cs.washington.edu/%7Eluheng/files/acl2017_hllz.pdf in my commercial work and get very satisfying results. It's basically a decoder on top of a BiLSTM. However, since the pipeline already includes syntactic parse results, I think you can get even better results. Feel free to ping me for further discussion ✋
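A minimal sketch of how training examples for such a tagger might be laid out. It assumes a BIO tagging scheme, which is an assumption here and not something specified in this thread. The token annotations are written by hand rather than taken from spaCy, and the helper names `spans_to_bio` and `token_features` as well as the role labels are hypothetical. BIO tags also let one argument cover several tokens.

```python
def spans_to_bio(n_tokens, spans):
    """Convert (start, end, role) spans, end exclusive, into per-token BIO tags."""
    tags = ["O"] * n_tokens
    for start, end, role in spans:
        if not 0 <= start < end <= n_tokens:
            raise ValueError(f"span out of range: {(start, end, role)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, role)}")
        tags[start] = f"B-{role}"
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"
    return tags


def token_features(tokens, i, predicate_index):
    """Build a feature dict for token i relative to one predicate."""
    tok = tokens[i]
    if i < predicate_index:
        position = "before"
    elif i > predicate_index:
        position = "after"
    else:
        position = "at"
    return {
        "word": tok["text"].lower(),
        "pos": tok["pos"],
        "dep": tok["dep"],
        "head_pos": tokens[tok["head"]]["pos"],
        "position": position,
    }


# Hand-written annotations for "The chef cut the bread with a knife".
# POS tags, dependency labels, and heads are illustrative assumptions.
TOKENS = [
    {"text": "The", "pos": "DET", "dep": "det", "head": 1},
    {"text": "chef", "pos": "NOUN", "dep": "nsubj", "head": 2},
    {"text": "cut", "pos": "VERB", "dep": "ROOT", "head": 2},
    {"text": "the", "pos": "DET", "dep": "det", "head": 4},
    {"text": "bread", "pos": "NOUN", "dep": "dobj", "head": 2},
    {"text": "with", "pos": "ADP", "dep": "prep", "head": 2},
    {"text": "a", "pos": "DET", "dep": "det", "head": 7},
    {"text": "knife", "pos": "NOUN", "dep": "pobj", "head": 5},
]
PREDICATE = 2  # index of "cut"
SPANS = [(0, 2, "AGENT"), (3, 5, "THEME"), (5, 8, "INSTRUMENT")]

if __name__ == "__main__":
    tags = spans_to_bio(len(TOKENS), SPANS)
    for i, tag in enumerate(tags):
        print(token_features(TOKENS, i, PREDICATE), tag)
```

Each (feature dict, tag) pair is one training instance for a sequence tagger. In a real pipeline the POS, dependency, and head values would come from the parsed document rather than a hand-written list.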
-
@honnibal, is implementing http://alt.qcri.org/semeval2014/cdrom/pdf/SemEval034.pdf on the CoNLL2009 data still the recommended approach here?
-
@grivaz I haven't done a recent literature review, but that at least still looks like a good approach. If you wanted to be really diligent, you could email the first author and check whether they would still recommend it. I'm sure they'd know whether it's still current, or whether there's now a better approach.
-
@grivaz the link http://alt.qcri.org/semeval2014/cdrom/pdf/SemEval034.pdf seems to be broken. Do you have an updated one?
-
@honnibal I came across this work, https://arxiv.org/pdf/1804.08199.pdf (LISA, SOTA in SRL, which uses deep nets with linguistic information), which seems to be a good fit for spaCy's pipeline. I have not thought about the output representation and other details yet, but will get to that once I have a better understanding of the paper. I believe the author's implementation has TensorFlow as a dependency; would it be okay to proceed with that as is? I would like to get your thoughts on this before getting started.
-
As stated here, spaCy's parser can be used for purposes other than syntactic parsing. Can the parser be used as a Semantic Role Labeler (SRL)?
The example presented in the above link shows only single tokens as arguments. Can the parser detect multiple tokens as a single argument, as below?