
Commit f441091

initial code commit
1 parent 60ba6bd commit f441091

4 files changed: +489 -0 lines changed

Cargo.toml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
[package]
name = "bevy-deepgram"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
# main game dependencies
bevy = "0.7"
heron = { version = "3.0.0", features = ["2d"] }

# microphone input dependency
portaudio = "0.7.0"

# async runtime dependencies
futures = "0.3.21"
tokio = { version = "1.17.0", features = ["macros", "rt", "rt-multi-thread"] }

# websocket dependencies
http = "0.2.6"
tokio-tungstenite = { version = "0.15.0", features = ["native-tls"] }
tungstenite = "0.14.0"

# utility dependencies
crossbeam-channel = "0.5.4"

README.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# bevy-deepgram

This is essentially a tech demo showing how one could integrate Deepgram Automatic Speech Recognition (ASR)
with the Bevy game engine. You can control the Bevy icon by saying "up", "down", "left", or "right" to jump
in that direction. There is an "enemy" that moves back and forth and that you can collide with. If you fall
off the bottom of the screen, you "die" and are "respawned" at the vertical center of the screen.

As a tech demo, this is pretty complete, but there are many TODOs noted in the comments in the code. To run it,
set a `DEEPGRAM_API_KEY` environment variable and do:

```
cargo run
```
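
For orientation, here is a minimal sketch of how the key might be read and turned into the pieces a websocket handshake needs. The helper names are hypothetical and not taken from this commit. The endpoint, the query parameters, and the `Token` authorization scheme reflect my understanding of Deepgram's streaming API, and `encoding=linear16` assumes the samples are converted to 16-bit integers before they are sent.

```rust
use std::env;

/// Hypothetical helper: builds the streaming URL for raw audio.
/// The query parameters are assumptions about how the stream is described.
fn listen_url(sample_rate: u32, channels: u16) -> String {
    format!(
        "wss://api.deepgram.com/v1/listen?encoding=linear16&sample_rate={}&channels={}",
        sample_rate, channels
    )
}

/// Hypothetical helper: formats the value of the `Authorization` header.
fn auth_header_value(api_key: &str) -> String {
    format!("Token {}", api_key)
}

fn main() {
    // Report a missing or empty key up front instead of failing at connect time.
    match env::var("DEEPGRAM_API_KEY") {
        Ok(key) if !key.trim().is_empty() => {
            println!("would connect to {}", listen_url(44100, 1));
            println!(
                "authorization header is {} characters long",
                auth_header_value(&key).len()
            );
        }
        _ => eprintln!("DEEPGRAM_API_KEY is not set; export it before `cargo run`"),
    }
}
```

Checking the variable at startup gives a readable error rather than a rejected websocket handshake later on.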

If things aren't working with the ASR, it may be because your microphone's audio format is different from the
hardcoded values. This demo expects 44100 Hz floating-point PCM audio coming from the microphone. Dynamically
choosing the audio format is one of the big TODOs. The game also requires a large 1920x1080 window to work
correctly. Reasonable asset and window scaling is another big TODO: in principle, from the Bevy docs, it
looks like this should work as in other engines (Unity, Godot, etc.), but I have not gotten it working yet.
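
To make the format concern concrete, here is a minimal sketch of one common conversion step: turning floating-point samples into 16-bit little-endian PCM bytes. Whether this project performs this exact conversion is an assumption, and `f32_to_linear16` is a hypothetical name. The point is only that the bytes sent must match whatever encoding and sample rate the ASR service is told to expect.

```rust
/// Hypothetical helper: converts f32 samples in [-1.0, 1.0] to
/// 16-bit signed little-endian PCM bytes.
fn f32_to_linear16(samples: &[f32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(samples.len() * 2);
    for &sample in samples {
        // Clamp first so out-of-range input saturates instead of wrapping.
        let clamped = sample.clamp(-1.0, 1.0);
        // Scale to the i16 range; the cast truncates toward zero.
        let value = (clamped * i16::MAX as f32) as i16;
        bytes.extend_from_slice(&value.to_le_bytes());
    }
    bytes
}

fn main() {
    let samples = [0.0_f32, 1.0, -1.0, 2.0];
    let bytes = f32_to_linear16(&samples);
    println!("{} samples -> {} bytes", samples.len(), bytes.len());
}
```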

## A Word On Dependencies

First of all, I found that I needed to install some development libraries that
I was not expecting:

```
sudo apt-get install libasound2-dev libudev-dev
```

With that out of the way, these are the main Rust/Cargo dependencies:

* `bevy`: the game engine
* `heron`: a physics engine and wrapper around `bevy_rapier` providing a simpler API
* `portaudio`: used for microphone input
* `tokio-tungstenite`/`tungstenite`: used to connect to Deepgram via websockets
* `tokio`: used to create an async runtime for the websocket handling

I chose `heron` for the physics engine as it was easier to set up and get working than `bevy_rapier` and felt
much more intuitive. It has limitations for sure: I see no way to directly apply forces and impulses,
but this can be effectively achieved by directly modifying velocities and accelerations. Overall, the
Components `heron` introduces map very well to those of similar physics engines used in Unity/Godot/etc.
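
As a rough illustration of emulating an impulse by writing to a velocity, here is a sketch in plain Rust. The `Velocity` struct is a stand-in and not `heron`'s component, and `JUMP_SPEED` and `apply_command` are hypothetical names chosen for this example.

```rust
/// Stand-in for a physics velocity component (hypothetical, not heron's type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Velocity {
    x: f32,
    y: f32,
}

/// Hypothetical jump speed, in pixels per second.
const JUMP_SPEED: f32 = 400.0;

/// Emulates an impulse by overwriting the velocity along the spoken direction.
/// Returns false when the word is not a recognized command.
fn apply_command(word: &str, velocity: &mut Velocity) -> bool {
    match word.trim().to_lowercase().as_str() {
        "up" => velocity.y = JUMP_SPEED,
        "down" => velocity.y = -JUMP_SPEED,
        "left" => velocity.x = -JUMP_SPEED,
        "right" => velocity.x = JUMP_SPEED,
        _ => return false,
    }
    true
}

fn main() {
    let mut velocity = Velocity { x: 0.0, y: 0.0 };
    for word in ["up", "left", "banana"] {
        let handled = apply_command(word, &mut velocity);
        println!("{:?} handled={} -> {:?}", word, handled, velocity);
    }
}
```

Overwriting the component rather than adding to it keeps repeated commands from stacking speed without bound, which is one possible design choice among several.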

`portaudio` was a clear choice for the microphone input, and there was a nice guide that I followed
to do this part (the guide is linked in the code comments).

For the websockets, things got a bit tricky. I did not want to introduce an async runtime, and
even got a prototype working without one, but it had severe limitations (namely lag and the potential
to block ASR indefinitely). These limitations stemmed from the fact that `socket.read_message()`
is a blocking call. This bugs me, as regular channels (and `crossbeam` channels) have a `try_recv()`
method which does not block, and having similar functionality for vanilla `tungstenite` websockets
would allow this whole project to work without any async runtime. However, here we are!
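
To show the non-blocking polling that `try_recv()` allows, here is a small sketch using only `std::sync::mpsc`. It is not what this commit does (the commit uses `tokio`), and it makes no claim to remove the limitations described above. The `blocking_read` closure merely stands in for a blocking call such as `socket.read_message()`, and `spawn_reader` and `poll_messages` are hypothetical names.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Spawns a thread that performs blocking reads and forwards each result.
/// The thread stops when the reader returns `None` or the receiver is dropped.
fn spawn_reader<F>(mut blocking_read: F) -> Receiver<String>
where
    F: FnMut() -> Option<String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        while let Some(message) = blocking_read() {
            if tx.send(message).is_err() {
                break;
            }
        }
    });
    rx
}

/// Meant to be called once per frame: drains whatever has arrived, never waits.
fn poll_messages(rx: &Receiver<String>) -> Vec<String> {
    let mut messages = Vec::new();
    loop {
        match rx.try_recv() {
            Ok(message) => messages.push(message),
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    messages
}

fn main() {
    let mut words = vec!["up", "left"].into_iter();
    let rx = spawn_reader(move || {
        // Simulated blocking read.
        thread::sleep(Duration::from_millis(50));
        words.next().map(String::from)
    });
    // A stand-in game loop: each "frame" polls without stalling.
    for frame in 0..10 {
        for message in poll_messages(&rx) {
            println!("frame {}: heard {:?}", frame, message);
        }
        thread::sleep(Duration::from_millis(16));
    }
}
```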

assets/icon.png

15.5 KB
