A real-time voice chat application that records audio, converts speech to text using Whisper, and generates responses using LLaMA models.
rtc_on/
├── src/ # Source code
│   ├── main.cpp # Main application entry
│   ├── core/ # Core functionality
│   │   ├── audio/ # Audio processing
│   │   │   ├── audio_recorder.{h,cpp} # PortAudio recording
│   │   │   └── dsp_utils.{h,cpp} # STFT/DSP processing
│   │   ├── llm/ # LLM functionality
│   │   │   ├── llm_runner.{h,cpp} # LLM model runner
│   │   │   ├── chat_template.{h,cpp} # Chat formatting
│   │   │   ├── text_prefiller.{h,cpp} # Text generation helpers
│   │   │   └── simple_token_generator.{h,cpp}
│   │   ├── stt/ # Speech-to-text
│   │   │   └── speech_to_text.{h,cpp} # Whisper model integration
│   │   └── tokenization/ # Tokenizer functionality
│   │       └── tokenizer_adapter.{h,cpp} # HuggingFace tokenizer wrapper
│   └── utils/ # Utility functions
│       ├── argmax_utils.h # Token selection utilities
│       └── keyboard_input.{h,cpp} # Input handling
├── models/ # Model files (gitignored)
│   ├── llm/ # LLM models
│   │   ├── llama3_2_bf16.pte # LLaMA model
│   │   └── tokenizer.json # LLM tokenizer
│   └── stt/ # Speech-to-text models
│       ├── encoder.pte # Whisper encoder
│       ├── decoder.pte # Whisper decoder
│       └── tokenizer.json # Whisper tokenizer
├── recordings/ # Audio recordings
├── scripts/ # Build scripts
│   └── build.sh # Build automation
├── third_party/ # External dependencies
│   └── executorch/ # ExecuTorch framework
├── build/ # Build output (gitignored)
├── CMakeLists.txt # Build configuration
└── README.md # This file
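
The layout above separates speech-to-text (`stt/`) from response generation (`llm/`). As a rough illustration of how two such stages could be chained for one recorded utterance, here is a minimal sketch. `AudioBuffer`, `Transcriber`, `Responder`, and `VoicePipeline` are hypothetical names chosen for this example; the actual interfaces in `src/core/` may look quite different.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stage signatures, not taken from the project sources.
using AudioBuffer = std::vector<float>;  // assumed: mono PCM samples
using Transcriber = std::function<std::string(const AudioBuffer&)>;  // role of stt/
using Responder = std::function<std::string(const std::string&)>;    // role of llm/

// Chains a speech-to-text stage and an LLM stage for one utterance.
class VoicePipeline {
public:
    VoicePipeline(Transcriber stt, Responder llm)
        : stt_(std::move(stt)), llm_(std::move(llm)) {
        assert(stt_ && llm_);  // both stages must be callable
    }

    // Returns the LLM reply, or an empty string when nothing was transcribed
    // (assumption: silence should not be forwarded to the LLM).
    std::string respond(const AudioBuffer& audio) const {
        const std::string text = stt_(audio);
        if (text.empty()) {
            return {};
        }
        return llm_(text);
    }

private:
    Transcriber stt_;
    Responder llm_;
};
```

Passing the stages in as callables keeps the sketch independent of any model runtime, so stub lambdas can stand in for the Whisper and LLaMA stages during testing.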

- Build the project: `./scripts/build.sh --clean --run`
- Run the application: `./build/rtc_on_workshops`
- Voice interaction:
  - Type `record` to start recording
  - Speak your message
  - Type `stop` to process and send to AI
  - Type `quit` to exit
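
The voice interaction steps above behave like a small state machine: `record` starts capturing, `stop` ends capture and triggers processing, and `quit` exits. The sketch below models only those transitions. `State`, `Step`, and `handle` are hypothetical names, and ignoring out-of-order commands (for example `stop` while idle) and discarding audio on `quit` are assumptions rather than documented behavior.

```cpp
#include <cassert>
#include <string>

enum class State { Idle, Recording, Finished };

// Result of handling one typed command.
struct Step {
    State next;
    bool process_audio;  // true when captured audio should go to STT and the LLM
};

// Computes the next state for a typed command (hypothetical helper).
Step handle(State current, const std::string& cmd) {
    assert(current != State::Finished);  // nothing is handled after quit
    if (cmd == "quit") {
        return {State::Finished, false};  // assumption: pending audio is discarded
    }
    if (cmd == "record" && current == State::Idle) {
        return {State::Recording, false};
    }
    if (cmd == "stop" && current == State::Recording) {
        return {State::Idle, true};
    }
    return {current, false};  // unknown or out-of-order command: ignored
}
```

Keeping the transition logic free of audio and model calls makes it easy to test separately from PortAudio and the model runtime.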
- ExecuTorch: ML model inference framework
- PortAudio: Cross-platform audio I/O library
- HuggingFace Tokenizers: Text tokenization
- C++20: Modern C++ standard

Commands:
- `record` - Start voice recording
- `stop` - Stop recording and send to AI
- `text` - Switch to text input mode
- `quit` - Exit application
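
One way to turn a typed line into one of the four commands listed above is a small parser that returns `std::optional`. This is only a sketch: `Command`, `normalize`, and `parse_command` are hypothetical names, and trimming whitespace and ignoring case are assumptions about how input might be handled, not documented behavior of `keyboard_input.{h,cpp}`.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

enum class Command { Record, Stop, Text, Quit };

// Strips surrounding whitespace and lowercases the rest (assumed normalization).
std::string normalize(std::string_view line) {
    const auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    const auto first = std::find_if_not(line.begin(), line.end(), is_space);
    const auto last = std::find_if_not(line.rbegin(), line.rend(), is_space).base();
    if (first >= last) {
        return {};  // empty or whitespace-only input
    }
    std::string out(first, last);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Maps a typed line to a command; returns std::nullopt for anything else.
std::optional<Command> parse_command(std::string_view line) {
    const std::string word = normalize(line);
    if (word == "record") return Command::Record;
    if (word == "stop") return Command::Stop;
    if (word == "text") return Command::Text;
    if (word == "quit") return Command::Quit;
    return std::nullopt;
}
```

Returning `std::nullopt` for unrecognized input lets the caller decide whether to show a hint or treat the line as a typed message.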