deepgram-devs
diff --git a/Cargo.toml b/Cargo.toml
diff --git a/README.md b/README.md
diff --git a/assets/icon.png b/assets/icon.png
15.5 KB
@@ -0,0 +1,26 @@
+[package]
+name = "bevy-deepgram"
+version = "0.1.0"
+edition = "2021"
+
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+
+[dependencies]
+# main game dependencies
+bevy = "0.7"
+heron = { version = "3.0.0", features = ["2d"] }
+
+# microphone input dependency
+portaudio = "0.7.0"
+
+# async runtime dependencies
+futures = "0.3.21"
+tokio = { version = "1.17.0", features = ["macros", "rt", "rt-multi-thread"] }
+
+# websocket dependencies
+http = "0.2.6"
+tokio-tungstenite = { version = "0.15.0", features = ["native-tls"] }
+tungstenite = "0.14.0"
+
+# utility dependencies
+crossbeam-channel = "0.5.4"
@@ -0,0 +1,51 @@
+# bevy-deepgram
+
+This is essentially a tech-demo showing how one could integrate Deepgram Automatic Speech Recognition (ASR)
+with the Bevy game engine. You control the Bevy icon by saying "up", "down", "left", or "right" to jump
+in that direction. There is an "enemy" that moves back and forth and that you can collide with. If you fall
+off the bottom of the screen, you "die" and are "respawned" at the vertical center of the screen.
+
+As a tech-demo, this is pretty complete, but there are many TODOs noted in the code comments. To run it,
+set the `DEEPGRAM_API_KEY` environment variable and run:
+
+```
+cargo run
+```
+
+If the ASR isn't working, it may be because your microphone's audio format differs from the hardcoded
+values: this demo expects 44100 Hz floating-point PCM audio from the microphone. Choosing the audio format
+dynamically is one of the big TODOs. The game also requires a large 1920x1080 window to work correctly.
+Reasonable asset and window scaling is another big TODO. In principle, judging from the Bevy docs, this
+should work as it does in other engines (Unity, Godot, etc.), but I have not gotten it working yet.
+
+## A Word On Dependencies
+
+First of all, I found that I needed to install some development libraries that
+I was not expecting (the command below is for Debian/Ubuntu):
+
+```
+sudo apt-get install libasound2-dev libudev-dev
+```
+
+With that out of the way, these are the main Rust/Cargo dependencies:
+
+* `bevy`: the game engine
+* `heron`: a physics engine built on `rapier`, offering a simpler API than `bevy_rapier`
+* `portaudio`: used for microphone input
+* `tokio-tungstenite`/`tungstenite`: used to connect to Deepgram via websockets
+* `tokio`: used to create an async runtime for the websocket handling
+
+I chose `heron` for the physics engine as it was easier to set up and get working than `bevy_rapier` and felt
+much more intuitive. It certainly has limitations: I see no way to directly apply forces and impulses,
+but the same effect can be achieved by directly modifying velocities and accelerations. Overall, the
+Components `heron` introduces map very well to those of the physics engines used in Unity, Godot, etc.
+
+`portaudio` was a clear choice for the microphone input, and there was a nice guide that I followed
+for this part (it is linked in the code comments).
+
+For the websockets, things got a bit tricky. I did not want to introduce an async runtime, and
+even got a prototype working without one, but it had severe limitations (namely lag and the potential
+to block ASR indefinitely). These limitations stemmed from the fact that `socket.read_message()`
+is a blocking call. This bugs me, as regular channels (and `crossbeam` channels) have a `try_recv()`
+method which does not block, and similar functionality for vanilla `tungstenite` websockets
+would allow this whole project to work without any async runtime. However, here we are!
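
Note on the last README paragraph: the non-blocking `try_recv()` pattern it mentions can be illustrated with the standard library alone. The sketch below is not the code in this commit. It assumes a background thread stands in for whatever performs the blocking read and forwards recognized words over a `std::sync::mpsc` channel (used here in place of `crossbeam-channel` so the example has no dependencies). The `Direction` type and the `parse_command` and `drain_commands` helpers are hypothetical names introduced only for illustration.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;
use std::time::Duration;

/// Direction the player icon should jump in (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    Up,
    Down,
    Left,
    Right,
}

/// Map a recognized word to a direction; any other word is ignored.
fn parse_command(word: &str) -> Option<Direction> {
    match word.trim().to_lowercase().as_str() {
        "up" => Some(Direction::Up),
        "down" => Some(Direction::Down),
        "left" => Some(Direction::Left),
        "right" => Some(Direction::Right),
        _ => None,
    }
}

/// Drain everything currently queued without ever blocking the caller.
/// `try_recv()` returns an error when the queue is empty or the sender is
/// gone; in both cases we simply stop for this frame.
fn drain_commands(rx: &Receiver<String>) -> Vec<Direction> {
    let mut commands = Vec::new();
    while let Ok(word) = rx.try_recv() {
        commands.extend(parse_command(&word));
    }
    commands
}

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // Stand-in for the blocking reader: in a real program this thread would
    // sit in a blocking read and forward each transcript as it arrives.
    let reader = thread::spawn(move || {
        for word in ["up", "hello", "left"] {
            thread::sleep(Duration::from_millis(10));
            if tx.send(word.to_string()).is_err() {
                break; // receiver dropped, nothing left to do
            }
        }
    });

    // Stand-in for the per-frame game loop: poll once per "frame" and move on.
    let mut seen = Vec::new();
    for _frame in 0..20 {
        seen.extend(drain_commands(&rx));
        thread::sleep(Duration::from_millis(5));
    }

    reader.join().unwrap();
    seen.extend(drain_commands(&rx)); // pick up anything sent after the last frame
    println!("commands received: {:?}", seen);
}
```

The blocking work stays on its own thread, and the frame loop only ever calls `try_recv()`, so a slow or silent connection cannot stall rendering. In a Bevy app the receiver would presumably live in a resource and be drained from a system each frame, but that wiring is left out here.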