A.R.E.S – Artificial Reasoning Engineered System

Ares is a fully local voice assistant that combines:

  • 🎤 Whisper.cpp for speech-to-text (STT)
  • 🧠 Llama.cpp for natural language processing (LLM)
  • 🔊 Piper for text-to-speech (TTS)
  • 👂 OpenWakeWord for wake word detection ("Hey Ares")

Everything runs offline — no internet is required for processing.
Ares is designed to be modular and extendable to control smart devices or even robots.


✨ Features (Current Progress)

Wake Word ("Hey Ares/Ares")

  • Powered by OpenWakeWord
  • Starts listening only after hearing the wake phrase
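The "listen only after the wake phrase" behavior can be sketched as a small gate over a stream of per-frame confidence scores. OpenWakeWord's `Model.predict(frame)` yields scores like these in practice; the `WakeGate` helper, its threshold, and the frame count below are illustrative assumptions, not part of openwakeword or of Ares itself.

```python
# Hypothetical debounce gate over wake-word confidence scores in [0, 1].
# In the real loop each score would come from openwakeword's
# Model.predict() on successive 16 kHz audio frames.

class WakeGate:
    """Fires once when the score stays above a threshold for enough frames."""

    def __init__(self, threshold: float = 0.5, min_frames: int = 3):
        self.threshold = threshold
        self.min_frames = min_frames
        self._run = 0

    def update(self, score: float) -> bool:
        """Feed one per-frame score; return True when the wake word triggers."""
        if score >= self.threshold:
            self._run += 1
        else:
            self._run = 0
        if self._run >= self.min_frames:
            self._run = 0  # reset so we fire once per utterance
            return True
        return False
```

Requiring several consecutive high-score frames (rather than a single spike) is a common way to avoid false triggers on short noises.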

Speech-to-Text (STT) (CPU)

  • Uses sounddevice + webrtcvad for smart recording (stops when you go quiet)
  • Transcribes audio with whisper-cli (from Whisper.cpp)
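The "stops when you go quiet" endpointing can be sketched as pure logic over per-frame speech decisions. In Ares those decisions would come from webrtcvad (`vad.is_speech(frame_bytes, sample_rate)`); the helper below, its name, and the 30-frame silence window are illustrative assumptions.

```python
# Endpointing sketch: decide when to stop recording, given a sequence of
# per-frame speech/non-speech flags (as webrtcvad would produce).

def find_endpoint(speech_flags, silence_frames: int = 30):
    """Return the index at which recording should stop: the first point
    where `silence_frames` consecutive non-speech frames follow at least
    one speech frame. Return None to keep recording."""
    heard_speech = False
    silent_run = 0
    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            heard_speech = True
            silent_run = 0
        else:
            silent_run += 1
            if heard_speech and silent_run >= silence_frames:
                return i
    return None
```

Waiting for speech before counting silence prevents the recorder from stopping immediately when the user pauses before speaking.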

Local Language Model (LLM) (GPU)

  • Runs llama.cpp in server mode
  • Configurable system prompt → "Jarvis"-like personality
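A minimal client for this setup might look as follows. llama.cpp's server exposes an OpenAI-compatible `/v1/chat/completions` endpoint; the host, port, and system prompt below are placeholders, and the function names are illustrative rather than Ares's actual API.

```python
import json
import urllib.request

# Placeholder "Jarvis"-style system prompt; the real one is configurable.
SYSTEM_PROMPT = "You are Ares, a concise Jarvis-like assistant."

def build_chat_request(user_text: str, system_prompt: str = SYSTEM_PROMPT) -> dict:
    """Build the JSON body for an OpenAI-style chat completion request."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.7,
    }

def ask_llm(user_text: str, base_url: str = "http://127.0.0.1:8080") -> str:
    """POST the request to a running llama.cpp server and return the reply text."""
    body = json.dumps(build_chat_request(user_text)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Because the server speaks the OpenAI wire format, the same client code works unchanged if the backend model is swapped.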

Text-to-Speech (TTS)

  • Piper HTTP server generates natural-sounding voices
  • Multiple voices available (e.g., en_US-bryce-medium)
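Fetching speech from the Piper server can be sketched as a plain HTTP request, assuming the server returns a WAV body for a text query (the port, query-string endpoint, and function names below are assumptions, not Piper's documented API).

```python
import urllib.parse
import urllib.request

def build_tts_url(text: str, base_url: str = "http://127.0.0.1:5000") -> str:
    """URL-encode the text to speak into the (assumed) query-string endpoint."""
    return f"{base_url}/?text={urllib.parse.quote(text)}"

def synthesize(text: str, out_path: str = "reply.wav") -> str:
    """Request synthesis from a running Piper HTTP server; save the WAV bytes."""
    with urllib.request.urlopen(build_tts_url(text)) as resp:
        wav = resp.read()
    with open(out_path, "wb") as f:
        f.write(wav)
    return out_path
```

The voice (e.g. `en_US-bryce-medium`) is chosen when the server is started, so the client stays voice-agnostic.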

Main Pipeline
Wake Word → Record → Transcribe → Send to LLM → Speak Response
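The pipeline above can be sketched as one function per stage, passed in as callables so any stage can be swapped for a mock. The stage signatures here are assumptions for illustration, not Ares's actual interfaces.

```python
# One interaction turn: wake → record → transcribe → LLM → speak.
# Each stage is injected, which is what makes Mock Mode (below) possible.

def run_turn(wait_for_wake, record, transcribe, ask_llm, speak) -> str:
    wait_for_wake()            # blocks until "Hey Ares" is detected
    audio = record()           # VAD-endpointed microphone capture
    text = transcribe(audio)   # whisper-cli transcription
    reply = ask_llm(text)      # llama.cpp server call
    speak(reply)               # Piper synthesis and playback
    return reply
```

Keeping the loop this thin means latency instrumentation and mocking can wrap each stage without touching the control flow.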

Benchmarking

  • benchmark_ai.sh logs timings for STT, LLM, and TTS
  • Results are stored in latency.md
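Per-stage timing like this can be sketched with a small wrapper; the helper names and the markdown row format below are illustrative, not necessarily what `benchmark_ai.sh` or `latency.md` actually use.

```python
import time

def timed(stage: str, fn, *args, timings: dict, **kwargs):
    """Run fn and record its wall-clock duration under `stage`."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    timings[stage] = time.perf_counter() - start
    return result

def to_markdown(timings: dict) -> str:
    """Render collected timings as a markdown table, one row per stage."""
    lines = ["| Stage | Latency (s) |", "|---|---|"]
    lines += [f"| {k} | {v:.3f} |" for k, v in timings.items()]
    return "\n".join(lines)
```

Wrapping each pipeline stage in `timed(...)` yields STT/LLM/TTS breakdowns without changing the stages themselves.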

CI / Mock Mode

  • GitHub Actions run Ares in Mock Mode (no audio hardware required)
  • Simulates STT, LLM, and TTS responses for automated testing
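Mock Mode can be sketched as swapping the audio-dependent stages for canned functions when an environment flag is set. The variable name `ARES_MOCK` and the stage names below are assumptions for illustration, not the repository's actual configuration.

```python
import os

def get_stages():
    """Return the STT/LLM/TTS stage functions; canned mocks in Mock Mode."""
    if os.environ.get("ARES_MOCK") == "1":
        return {
            "transcribe": lambda audio: "mock transcript",
            "ask_llm": lambda text: "mock reply",
            "speak": lambda reply: None,  # no audio output in CI
        }
    # The real stages need a microphone, speakers, and running servers.
    raise NotImplementedError("real stages require audio hardware and servers")
```

This lets GitHub Actions exercise the full pipeline logic on runners that have no sound devices at all.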

Getting Started

1. Install dependencies

pip install -r requirements.txt

Also build the native backends Ares depends on: whisper.cpp (provides `whisper-cli`), llama.cpp (for server mode), and Piper.

2. Start LLM + TTS servers

./scripts/run_servers.sh

3. Run Ares

python main.py

Say "Hey Ares", wait for the beep 🎵, then speak your command. Ares will listen, process locally, and respond with speech.

🧪 Development & Testing

Benchmark Latency

./docs/benchmark_ai.sh

📂 Project Structure

llmio/               # Input/output modules
 ├─ stt_whisper.py   # Speech-to-text
 ├─ tts_piper.py     # Text-to-speech
 ├─ llm_remote.py    # LLM client
 └─ wake_word.py     # Wake word listener

scripts/             # Helper scripts
 └─ run_servers.sh   # Start LLM and TTS servers

latency.md           # Benchmark results
ci_runner.py         # CI harness with mocks
main.py              # Main application loop

⚙️ Roadmap

Voice & Interaction

  • Custom wake word — trainable per device/user.
  • Custom voice — selectable TTS voice profiles.

Web & App Actions

  • Open websites & apps on command

Devices & I/O

  • Bluetooth device control — pair/connect/disconnect and volume controls.

Perception

  • Visual detection — optional camera input for object/face/basic scene cues.

Architecture/Security

  • Speaker recognition — per-user profiles for personalization/permissions.
  • Remote access (multi-device clients)

Hardware

  • Custom hardware build — mic array, LEDs, physical mute, action button.

UI/UX

  • User interface for better user experience
