I combined Whisper + CLIPort for spoken language robotic manipulation #400
daniellawson9999
started this conversation in
Show and tell
Recently, there have been advances in language-conditioned robotic manipulation, with models such as CLIPort, which combines CLIP and Transporter Networks to perform language-conditioned pick-and-place tasks.
I combined Whisper with CLIPort in Colab for spoken language robotic manipulation. A video can be found here: https://twitter.com/danielblawson9/status/1584280688877936641?s=20&t=UsOS8k7evLj5Esuly0Dv3Q
And notebook:
https://colab.research.google.com/drive/1rtHk9a82xqX5zHs3CgSrgIU36GwqU1HQ?usp=sharing
There are some limitations, as the trained CLIPort model only responds to language in a specific form. I built upon a notebook provided with the Socratic Models paper.
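One way to soften that limitation is to rewrite the Whisper transcript into the form CLIPort was trained on before passing it along. Below is a minimal sketch of that idea, not the code from the notebook. The template string, the regular expression, and the function name `to_cliport_command` are all assumptions made for illustration; the actual checkpoint may expect different wording.

```python
import re
from typing import Optional

# ASSUMPTION: the command template below is illustrative only.
# Check the notebook for the exact form the trained CLIPort expects.
TEMPLATE = "Pick the {pick} and place it on the {place}."

# Hypothetical pattern covering a few common spoken phrasings, e.g.
# "put the X in the Y" or "pick up the X and place it on the Y".
_COMMAND = re.compile(
    r"^\s*(?:please\s+)?(?:pick(?:\s+up)?|put|move|place)\s+the\s+(?P<pick>.+?)\s+"
    r"(?:and\s+(?:place|put)\s+it\s+)?(?:on|in|into|onto)(?:\s+top\s+of)?"
    r"\s+the\s+(?P<place>.+?)[\s.!?]*$",
    re.IGNORECASE,
)


def to_cliport_command(transcript: str) -> Optional[str]:
    """Map a free-form transcript to the assumed template, or None if it does not fit."""
    match = _COMMAND.match(transcript)
    if match is None:
        return None  # caller can ask the user to repeat the command
    return TEMPLATE.format(
        pick=match.group("pick").strip().lower(),
        place=match.group("place").strip().lower(),
    )


if __name__ == "__main__":
    # Whisper transcripts usually carry a leading space and punctuation.
    print(to_cliport_command(" Put the blue block in the green bowl."))
```

Returning `None` for anything that does not match keeps unrelated speech from being sent to the robot. A regex like this only covers a handful of phrasings, so it is a stopgap rather than a fix for the underlying limitation.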
It would be interesting to try this with spoken commands in languages other than English, and with the full Socratic Models setup, where ViLD is used to describe the objects in the scene, and that description is fed to a language model along with a language query specifying a task. The LM then breaks the task down into steps, which are fed to CLIPort.
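The data flow of that full setup can be sketched roughly as follows. This is only an outline of the idea under stated assumptions: the three models are passed in as plain callables, and `describe_scene`, `plan_steps`, `execute_step`, `build_prompt`, and the prompt layout are hypothetical names and formats for illustration, not the APIs of ViLD, any language model, or CLIPort.

```python
from typing import Any, Callable, List


def build_prompt(objects: List[str], query: str) -> str:
    """Combine the scene description and the task query into one LM prompt.

    ASSUMPTION: this prompt layout is made up for the sketch.
    """
    return f"objects = [{', '.join(objects)}]\n# {query}\n"


def run_socratic_pipeline(
    image: Any,
    query: str,
    describe_scene: Callable[[Any], List[str]],  # stand-in for ViLD
    plan_steps: Callable[[str], List[str]],      # stand-in for the language model
    execute_step: Callable[[str], None],         # stand-in for CLIPort
) -> List[str]:
    """Describe the scene, ask the LM for steps, then run each step in order."""
    objects = describe_scene(image)
    prompt = build_prompt(objects, query)
    steps = plan_steps(prompt)
    for step in steps:
        execute_step(step)
    return steps
```

With speech input, `query` would be the Whisper transcript. Passing the models in as callables keeps the sketch independent of any particular checkpoint and makes it easy to try with stubs before wiring in the real models.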