
AriLink is a telephony management system built on Asterisk's ARI (Asterisk REST Interface). It provides voice call handling, transcription, and PBX control capabilities. The system combines WebSockets, RTP, and speech-to-text integration to create a modern, feature-rich telephony solution.
- 📞 Call Management - Handles incoming and outgoing calls through Asterisk PBX
- 🎙️ Speech-to-Text - Real-time transcription using local AI models (Parakeet, Whisper) or Google Cloud
- 🌉 Bridge Management - Creates and manages voice bridges for connecting multiple channels
- 👥 Contact Recognition - Supports voice-activated dialing using a contacts database
- 📡 External Media Channels - Supports external media integration for advanced use cases
- 🔌 WebSocket Interface - Provides real-time updates and control via WebSockets
- 🔄 Automatic Fallback - Seamlessly switches to backup transcription services if the primary fails
The system is built on TypeScript and Node.js with a modular architecture supporting multiple concurrent calls:
```mermaid
flowchart TD
    A[📞 Incoming Call 1] --> B[CallSessionManager]
    C[📞 Incoming Call 2] --> B
    B --> D[Session 1: Bridge 1]
    B --> E[Session 2: Bridge 2]
    D --> F[External Media 1]
    E --> G[External Media 2]
    F --> H[🎙️ Transcriber]
    G --> H
    H -->|routed by session ID| D
    H -->|routed by session ID| E
```
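The per-session fan-out shown in the diagram can be sketched as a registry keyed by session ID: one shared transcriber dispatches each result only to the session that owns it. This is an illustrative simplification; `SessionRouter` and `TranscriptHandler` are hypothetical names, not AriLink's actual classes.

```typescript
// Hypothetical sketch: each call session registers a callback under its
// session ID, and the shared transcriber routes results by that ID.
type TranscriptHandler = (text: string) => void;

class SessionRouter {
  private sessions = new Map<string, TranscriptHandler>();

  register(sessionId: string, handler: TranscriptHandler): void {
    this.sessions.set(sessionId, handler);
  }

  unregister(sessionId: string): void {
    this.sessions.delete(sessionId);
  }

  // Called by the shared transcriber; unknown session IDs are ignored.
  route(sessionId: string, text: string): void {
    this.sessions.get(sessionId)?.(text);
  }
}

// Usage: two concurrent calls share one transcriber but stay isolated.
const router = new SessionRouter();
const received: string[] = [];
router.register("session-1", (t) => received.push(`s1:${t}`));
router.register("session-2", (t) => received.push(`s2:${t}`));
router.route("session-1", "hello");
router.route("session-2", "world");
// received now holds ["s1:hello", "s2:world"]
```

Keying on the session ID is what lets one transcriber process serve any number of concurrent bridges without transcripts bleeding between calls.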
### 🎮 AriControllerServer
The main controller that interfaces with Asterisk PBX:
- Manages call flows, bridges, and DTMF input
- Handles Stasis application events (start, end)
- Provides WebSocket server for client connections
- Manages contact lookups for voice-activated dialing
### 🎤 AriTranscriberServer
Provides real-time speech transcription:
- Connects to configurable transcription services (local or cloud)
- Processes RTP audio streams
- Transmits transcription results via WebSockets
- Supports customizable language and model settings
- Automatic fallback to backup services on failure
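Streaming recognizers typically emit revisable interim results followed by a final result per utterance. The sketch below shows one way such results could be accumulated; the `TranscriptAccumulator` shape is an assumption for illustration, not AriLink's actual data model.

```typescript
// Hypothetical accumulator for streaming speech-to-text results:
// interim results replace each other, final results are kept.
interface TranscriptResult {
  text: string;
  isFinal: boolean;
}

class TranscriptAccumulator {
  private finals: string[] = [];
  private interim = "";

  push(result: TranscriptResult): void {
    if (result.isFinal) {
      this.finals.push(result.text);
      this.interim = ""; // the final result supersedes interim text
    } else {
      this.interim = result.text; // each interim replaces the previous one
    }
  }

  // Current best view of the transcript so far.
  current(): string {
    return [...this.finals, this.interim].filter((s) => s.length > 0).join(" ");
  }
}

// Usage: two interim revisions, then a final, then a new interim.
const acc = new TranscriptAccumulator();
acc.push({ text: "hel", isFinal: false });
acc.push({ text: "hello", isFinal: false });
acc.push({ text: "hello world", isFinal: true });
acc.push({ text: "how", isFinal: false });
// acc.current() === "hello world how"
```

Sending both interim and final results over the WebSocket lets a client show live, self-correcting captions while keeping a stable record of finished utterances.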
### 📡 RTP UDP Server
Handles the real-time audio streaming:
- Processes incoming RTP packets from Asterisk
- Handles audio format conversion for transcription
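Every RTP packet carries a fixed 12-byte header (RFC 3550) before the audio payload. A minimal parser sketch is shown below; it is illustrative rather than AriLink's actual code, and it assumes no CSRC list or header extension, which holds for typical Asterisk external-media streams.

```typescript
// Minimal RTP header parser (RFC 3550, fixed 12-byte header).
interface RtpPacket {
  version: number;      // should always be 2
  payloadType: number;  // e.g. 0 = PCMU, 8 = PCMA
  sequence: number;     // for detecting loss and reordering
  timestamp: number;    // sample-clock timestamp
  ssrc: number;         // synchronization source (stream) identifier
  payload: Buffer;      // the raw audio bytes
}

function parseRtp(packet: Buffer): RtpPacket {
  if (packet.length < 12) throw new Error("packet too short for RTP header");
  return {
    version: packet[0] >> 6,             // top two bits of byte 0
    payloadType: packet[1] & 0x7f,       // low seven bits of byte 1
    sequence: packet.readUInt16BE(2),
    timestamp: packet.readUInt32BE(4),
    ssrc: packet.readUInt32BE(8),
    payload: packet.subarray(12),        // everything after the header
  };
}

// Usage with a hand-built packet: version 2, PCMU (PT 0), sequence 7,
// timestamp 100, SSRC 1, followed by two payload bytes.
const header = Buffer.from([0x80, 0x00, 0x00, 0x07, 0, 0, 0, 100, 0, 0, 0, 1]);
const pkt = parseRtp(Buffer.concat([header, Buffer.from([0xff, 0x7f])]));
// pkt.version === 2, pkt.payloadType === 0, pkt.sequence === 7
```

In a UDP server built on Node's `dgram` module, each received datagram would be passed through a parser like this before the payload is converted for the transcriber.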
### 🗣️ Transcription Providers
Multiple transcription backend support:
- Local providers: Parakeet TDT and Whisper (run on your GPU)
- Cloud provider: Google Speech-to-Text API (optional)
- Handles streaming transcription with automatic restarts
- Manages audio chunking for optimal performance
- Provides both interim and final transcription results
- Automatic failover between services
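The automatic failover described above can be sketched as trying an ordered list of providers until one succeeds. The `Provider` interface below is a hypothetical simplification, not AriLink's actual provider abstraction.

```typescript
// Hypothetical failover: try each transcription provider in order and
// return the first successful result; throw only if all of them fail.
type Provider = {
  name: string;
  transcribe: (audio: Buffer) => Promise<string>;
};

async function transcribeWithFallback(
  providers: Provider[],
  audio: Buffer,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await p.transcribe(audio) };
    } catch (err) {
      errors.push(`${p.name}: ${(err as Error).message}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}

// Usage: the primary is down, so the backup handles the request.
const providers: Provider[] = [
  { name: "parakeet", transcribe: async () => { throw new Error("down"); } },
  { name: "whisper", transcribe: async () => "hello world" },
];
const fallbackResult = transcribeWithFallback(providers, Buffer.alloc(0));
// resolves to { provider: "whisper", text: "hello world" }
```

Ordering the provider list to match the `TRANSCRIPTION_SERVICES` configuration gives a predictable fallback chain: local first, cloud last.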
The system uses environment variables for configuration, including:
| Category | Variables |
|---|---|
| PBX | PBX IP address, login credentials |
| WebSocket | Server ports, external host information |
| Transcription | Language settings, model configuration |
| Telephony | Provider settings, phone numbers |
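As one example of how such a variable might be consumed, a comma-separated `TRANSCRIPTION_SERVICES` value (the variable name appears in the quick start below; the parsing function is a sketch, not AriLink's actual code) can be turned into an ordered fallback list:

```typescript
// Sketch: parse a comma-separated TRANSCRIPTION_SERVICES value,
// e.g. "ws://localhost:5000,ws://localhost:5001", into an ordered
// list of WebSocket URLs to try in sequence.
function parseTranscriptionServices(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

const services = parseTranscriptionServices(
  "ws://localhost:5000, ws://localhost:5001",
);
// services === ["ws://localhost:5000", "ws://localhost:5001"]
```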
- Set up a FreePBX server:
  - 📦 New installation? FreePBX Installation Guide - VM setup and FreePBX installation
  - ⚙️ Already installed? FreePBX ARI Configuration - configure it for AriLink
- Install UV (Python package manager):

  ```powershell
  # Windows PowerShell
  irm https://astral.sh/uv/install.ps1 | iex
  ```

- Set up a transcription service (local speech recognition):

  Parakeet (recommended - fastest):

  ```shell
  cd transcription-services/parakeet-service
  uv venv
  uv pip install -r requirements.txt
  ```

  OR Whisper (alternative):

  ```shell
  cd transcription-services/whisper-service
  uv venv
  uv pip install -r requirements.txt
  ```

- Configure environment variables in the `.env` file:

  ```shell
  TRANSCRIPTION_SERVICES=ws://localhost:5000
  ```

  See `.env.example` for all options and fallback configuration.

- Configure contacts in `tools/contacts.json` for voice-activated dialing.
- Start the transcription service (in terminal 1):

  For Parakeet:

  ```shell
  cd transcription-services/parakeet-service
  start-service.bat
  ```

  OR for Whisper:

  ```shell
  cd transcription-services/whisper-service
  start-service.bat
  ```

  The first run will download the model (~800 MB for Whisper, ~600 MB for Parakeet).

- Start the AriLink server (in terminal 2):

  ```shell
  npm start
  ```
See Transcription Services Guide for all configuration options including fallbacks.
- Asterisk PBX with ARI enabled
- Node.js and TypeScript
- Transcription Service - choose one:
- Local: Parakeet TDT 0.6B (RECOMMENDED) or Whisper
- Cloud: Google Cloud Speech API credentials (optional)
- Various NPM packages, including:
  `ari-client`, `@google-cloud/speech`, `ws`, `express`, `dotenv`
- 🔧 Enhanced typing for TypeScript
- 🖥️ Web UI for monitoring and management
- ✅ Additional speech recognition providers (DONE: local Whisper model integrated!)
- 🔜 Additional providers: Scribe from ElevenLabs, Azure Speech
- 📊 Call analytics and reporting features
- 💾 Database persistence for call records and transcriptions
This project is licensed under a Non-Commercial License (MIT-Based) - see the LICENSE file for details.
- ✅ Free for non-commercial use - Use, modify, and distribute for personal and educational purposes
- 💼 Commercial use requires permission - Contact for commercial licensing
- 📧 Get in touch: Discord: `alexispace`
Key Points:
- The software is provided "as is" without warranty
- Attribution is required in all copies
- Commercial use requires explicit permission from the copyright holder
For the full license text, please refer to the LICENSE file.
Built with ❤️ for modern telephony solutions