A voice-controlled AI agent system that processes audio commands and executes actions across multiple services, including Spotify, Home Assistant, and web search. The agent uses speech-to-text, translation, and classification to route commands to the appropriate handler.
- Speech-to-Text: Converts audio input to text using Faster Whisper
- Multi-language Support: Automatic language detection and translation to English
- Smart Command Classification: Routes commands to the appropriate handler:
  - Spotify Integration: Play music, artists, albums, playlists, and radio
  - Home Assistant: Control smart home devices
  - Web Search: Internet search via SearxNG
  - Party Commands: Pre-established custom commands with witty responses
- LangChain Integration: Uses LangChain for LLM orchestration and MCP adapters for Home Assistant (a connection sketch follows this list)
- Workflow Visualization: Built with LangGraph for complex workflow management
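For the Home Assistant piece, a connection sketch with `langchain-mcp-adapters` might look like the following. The `/mcp_server/sse` path and bearer-token header follow Home Assistant's MCP Server integration, but treat the exact endpoint and connection options as assumptions to verify; how the project itself builds this client is not shown here.

```python
import asyncio
import os

from langchain_mcp_adapters.client import MultiServerMCPClient

async def load_ha_tools():
    # Assumed endpoint: Home Assistant's MCP Server integration serves SSE
    # under /mcp_server/sse -- verify against your instance.
    client = MultiServerMCPClient(
        {
            "home_assistant": {
                "transport": "sse",
                "url": f"{os.environ['HA_URL']}/mcp_server/sse",
                "headers": {"Authorization": f"Bearer {os.environ['HA_TOKEN']}"},
            }
        }
    )
    # Each MCP tool is exposed as a LangChain tool the agent can call
    return await client.get_tools()

tools = asyncio.run(load_ha_tools())
print([tool.name for tool in tools])
```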
The system uses a workflow-based architecture with the following nodes (a wiring sketch follows the list):
- STT (Speech-to-Text): Transcribes audio input using Faster Whisper
- Translator: Detects language and translates to English
- Router: Classifies the command type and selects a handler
- Command Handlers:
  - Pre-established commands (party mode)
  - Internet search
  - Spotify command (with sub-workflow)
  - Home Assistant command
- Finish Action: Prepares and translates the final response
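As a rough wiring sketch with LangGraph (the node names, state shape, and stub behaviors below are illustrative assumptions, not the actual code in `workflows/main/`):

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

# Illustrative state -- the project's real schema lives in workflows/main/state.py
class AgentState(TypedDict, total=False):
    audio: bytes
    text: str
    original_language: str
    command_type: str
    response: str

# Stub nodes standing in for the implementations in workflows/main/nodes.py
def stt(state: AgentState): return {"text": "enciende las luces", "original_language": "es"}
def translator(state: AgentState): return {"text": "turn on the lights"}
def router(state: AgentState): return {"command_type": "home_assistant"}
def party(state: AgentState): return {"response": "party response"}
def search(state: AgentState): return {"response": "search results"}
def spotify(state: AgentState): return {"response": "now playing"}
def home_assistant(state: AgentState): return {"response": "lights are on"}
def finish(state: AgentState): return {"response": state["response"]}

builder = StateGraph(AgentState)
for name, fn in [("stt", stt), ("translator", translator), ("router", router),
                 ("party", party), ("search", search), ("spotify", spotify),
                 ("home_assistant", home_assistant), ("finish", finish)]:
    builder.add_node(name, fn)

builder.add_edge(START, "stt")
builder.add_edge("stt", "translator")
builder.add_edge("translator", "router")
# Branch on the classified command type
builder.add_conditional_edges("router", lambda s: s["command_type"],
                              ["party", "search", "spotify", "home_assistant"])
for handler in ("party", "search", "spotify", "home_assistant"):
    builder.add_edge(handler, "finish")
builder.add_edge("finish", END)

graph = builder.compile()
print(graph.invoke({"audio": b""}))
# graph.get_graph().draw_mermaid_png() renders a diagram similar to the
# graph.png produced by the graph=True option
```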
- Python 3.12 or higher
- UV package manager (recommended) or pip
Using UV:

```bash
uv sync
```

Using pip:

```bash
pip install -e .
```

For development:

```bash
uv sync --dev
```

Create a `.env` file in the project root with the following environment variables:

```
# LangChain Configuration
LANGCHAIN_API_KEY=... # LangChain API key for tracing
LANGCHAIN_TRACING_V2=true # Enable LangChain tracing
LANGCHAIN_PROJECT=nabu-agent # Project name for LangChain
# LLM Configuration
LLM_BASE_URL=... # Base URL for the LLM API
LLM_API_KEY=... # API key for LLM access
LLM_MODEL=Qwen3-4B # LLM model to use (e.g., Qwen3-4B)
# Faster Whisper (STT) Configuration
FASTER_WHISPER_MODEL=... # Whisper model size (e.g., base, small, medium, large)
FASTER_WHISPER_USE_CUDA=false # Set to 'true' to use CUDA acceleration
# Search Configuration
SEARX_HOST=... # SearxNG instance URL for web searches
# Spotify Configuration
SPOTIPY_CLIENT_ID=... # Spotify API client ID
SPOTIPY_CLIENT_SECRET=... # Spotify API client secret
SPOTIPY_REDIRECT_URI=https://127.0.0.1:1234 # OAuth redirect URI
# Home Assistant Configuration
HA_TOKEN=... # Home Assistant long-lived access token
HA_URL=... # Home Assistant instance URL (e.g., http://homeassistant.local:8123)
```

- LANGCHAIN_API_KEY: Get from LangSmith
- LLM_BASE_URL: OpenAI-compatible API endpoint (e.g., local Ollama, OpenAI, etc.)
- FASTER_WHISPER_MODEL: Choose from: tiny, base, small, medium, large-v2, large-v3 (a transcription sketch follows this list)
- SEARX_HOST: URL to your SearxNG instance (self-hosted or public)
- Spotify credentials: Get from Spotify Developer Dashboard
- HA_TOKEN: Generate from Home Assistant: Profile → Security → Long-Lived Access Tokens
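To sanity-check the STT settings outside the agent, here is a minimal sketch using `python-dotenv` and `faster-whisper` directly (how the project wires these variables internally may differ):

```python
import os

from dotenv import load_dotenv
from faster_whisper import WhisperModel

load_dotenv()  # reads the .env file described above

# Device from FASTER_WHISPER_USE_CUDA, model size from FASTER_WHISPER_MODEL
use_cuda = os.getenv("FASTER_WHISPER_USE_CUDA", "false").lower() == "true"
model = WhisperModel(os.getenv("FASTER_WHISPER_MODEL", "base"),
                     device="cuda" if use_cuda else "cpu")

# transcribe() also detects the language, which the Translator node relies on
segments, info = model.transcribe("audio.wav")
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
print(" ".join(segment.text.strip() for segment in segments))
```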
Run the agent with an audio file:

```bash
uv run nabu-agent /path/to/audio/file.wav
```

Or call the workflow from Python:

```python
import asyncio
from nabu_agent import execute_main_workflow

# Read audio file
with open("audio.wav", "rb") as f:
    audio_data = f.read()

# Execute workflow
result = asyncio.run(execute_main_workflow(audio_data))
print(result)

# Generate workflow visualization
result = asyncio.run(execute_main_workflow(audio_data, graph=True))
# This creates graph.png and full_graph.png
```

Example commands:

- "Play music by The Beatles" (a spotipy sketch follows this list)
- "Play the song Bohemian Rhapsody"
- "Play the album Dark Side of the Moon"
- "Play my Discover Weekly playlist"
- "Play radio for Pink Floyd"
- "Turn on the living room lights"
- "What devices do I have?"
- "Set the thermostat to 72 degrees"
- "What's the weather today?"
- "Search for Python tutorials"
- "What's the latest news?"
- "Tick-tock" (starts a countdown with an ominous sentence)
```
nabu-agent/
├── src/nabu_agent/
│ ├── main.py # Entry point
│ ├── workflows/
│ │ ├── main/ # Main workflow
│ │ │ ├── workflow.py
│ │ │ ├── nodes.py
│ │ │ └── state.py
│ │ └── spotify_agent/ # Spotify sub-workflow
│ │ ├── workflow.py
│ │ └── nodes.py
│ ├── tools/
│ │ ├── agents.py # LLM agents (STT, classifier, translator)
│ │ ├── spotify.py # Spotify integration
│ │ └── web_loader.py # Web search
│ ├── utils/
│ │ └── schemas.py # Pydantic models
│ └── data/
│ └── preestablished_commands.py
├── tests/
├── pyproject.toml
└── README.md
```
Run the test suite:

```bash
pytest
```

To add custom commands, edit `src/nabu_agent/data/preestablished_commands.py`:

```python
party_commands = {
"tick-tock": "start a countdown and tell an ominous sentence.",
"your-command": "description of what this command does"
}
```

Logs are written to `nabu_agent_agent.log` in the current directory. Check this file for detailed execution information and debugging.
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests to ensure everything works
- Submit a pull request
See LICENSE file for details.
- Spotify device not found: Ensure you have an active Spotify device. The system looks for a device named "librespot" with a specific ID. Update `DEVICE_NAME` and `DEVICE_ID` in `src/nabu_agent/tools/spotify.py` if needed (the sketch below lists your available devices).
- Audio format errors: Ensure your audio files are in a compatible format (WAV, MP3, etc.).
- Home Assistant connection: Verify your `HA_URL` includes the full URL with protocol and port (e.g., `http://homeassistant.local:8123`).
- Language detection issues: The system translates all commands to English. If you're getting incorrect results, check the `original_language` in the logs.
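To find the right values for `DEVICE_NAME` and `DEVICE_ID`, you can list the devices Spotify currently sees with spotipy (this assumes the `SPOTIPY_*` variables from `.env` are set; `SpotifyOAuth` reads them from the environment):

```python
import spotipy
from spotipy.oauth2 import SpotifyOAuth

# SpotifyOAuth picks up SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET,
# and SPOTIPY_REDIRECT_URI from the environment
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="user-read-playback-state"))

# Print every known device so you can copy the librespot name/ID
# into src/nabu_agent/tools/spotify.py
for device in sp.devices()["devices"]:
    print(device["name"], device["id"], "(active)" if device["is_active"] else "")
```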
Core dependencies:
- `faster-whisper>=1.2.0` - Speech-to-text
- `langchain-community>=0.3.26` - LangChain community tools
- `langchain-mcp-adapters>=0.1.10` - MCP protocol adapters
- `langchain-openai>=0.3.33` - OpenAI LLM integration
- `langgraph>=0.4.8` - Workflow graph builder
- `pydantic>=2.11.7` - Data validation
- `python-dotenv>=1.1.1` - Environment management
- `spotipy>=2.25.1` - Spotify API client