Using spaCy for voice assistant commands #10690
I am using spaCy with VOSK for a smart home voice assistant project, so I get compound commands like "Turn on the bedroom lights at noon and play some pop music now". I want to extract the basic elements of the sentence to build a clear command for the hardware part of the system: what to do, where, when, and which objects the verbs act on. I am familiar with token.dep and token.pos for identifying verbs and objects, but in a compound sentence there are two "pobj"s and I don't know how to identify which verb each "pobj" belongs to. For example, in the sentence above, "lights" and "music" both have a token.dep of "pobj", which makes it hard to match them to their verbs, "turn on" and "play" respectively. Furthermore, spaCy identifies "at noon" as an adverb (which it is), and this can be confused with words like "quickly", which are also adverbs. It also has problems with phrasal verbs: it does not identify "Turn on" as a verb, only "Turn". What can I do to get meaningful commands from these sentences?
Replies: 1 comment
Many chatbot systems work by introducing the concept of an "intent". Usually, a bot can handle only a few commands, and a user's utterance should fit one of them. That's why many virtual assistants run a classification step at each turn to determine the intent.
An intent in your case could be "activate". If we detect that the intent is "activate" and that there's a named entity in the text that refers to "lights", then we may already have enough information for the bot to proceed. That way, you wouldn't need to worry about the statistical accuracy of the part-of-speech detection.
Note that this is how Rasa Open Source works. The only downside of this approach is that you will need to collect intent/NER data for it to work. But if you're building a bot for yourself, that shouldn't be too hard.
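To make the intent-plus-entity idea concrete, here is a minimal sketch in plain Python. It is not how Rasa or spaCy implement it: the keyword tables, the `Command` class, and the function names are all hypothetical stand-ins for a trained intent classifier and NER model, and splitting on "and" is only a rough assumption about how compound commands are phrased.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical keyword tables. In a real assistant, a trained intent
# classifier and an NER model would replace these lookups.
INTENT_KEYWORDS = {
    "activate": ("turn on", "switch on"),
    "deactivate": ("turn off", "switch off"),
    "play_media": ("play",),
}
DEVICES = ("lights", "music", "fan")
LOCATIONS = ("bedroom", "kitchen", "living room")
# Matches "now", "at noon", "at midnight", or a clock time such as "at 7:30 pm".
TIME_PATTERN = re.compile(
    r"\b(now|at noon|at midnight|at \d{1,2}(?::\d{2})?\s?(?:am|pm)?)\b"
)


@dataclass
class Command:
    intent: str
    device: Optional[str]
    location: Optional[str]
    when: Optional[str]


def first_match(text: str, options) -> Optional[str]:
    """Return the first option that occurs in text as a whole phrase, else None."""
    for option in options:
        if re.search(rf"\b{re.escape(option)}\b", text):
            return option
    return None


def parse_clause(clause: str) -> Optional[Command]:
    """Map one clause to a Command, or None if no known intent matches."""
    text = clause.lower().strip()
    for intent, phrases in INTENT_KEYWORDS.items():
        if first_match(text, phrases):
            time_match = TIME_PATTERN.search(text)
            return Command(
                intent=intent,
                device=first_match(text, DEVICES),
                location=first_match(text, LOCATIONS),
                when=time_match.group(1) if time_match else None,
            )
    return None


def parse_utterance(utterance: str) -> List[Command]:
    """Split a compound utterance on the word 'and', then parse each clause."""
    clauses = re.split(r"\band\b", utterance, flags=re.IGNORECASE)
    return [cmd for cmd in map(parse_clause, clauses) if cmd is not None]


commands = parse_utterance(
    "Turn on the bedroom lights at noon and play some pop music now"
)
for command in commands:
    print(command)
```

Because each clause is classified on its own, the question of which verb owns which object does not come up, and "turn on" is handled as a single phrase. The sketch would mis-split something like "turn on the lights and the fan", which is one reason a trained model with collected intent/NER data is the more robust route.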