A Next.js-based web app for conversational AI agents, built with Agora's Real-Time Communication SDK. This repository also includes:
- Guide.md, a walkthrough of how to build this application from scratch.
- A user interaction diagram showing how the application interacts with the different services.
Before you begin, ensure you have Node.js and pnpm installed (the setup commands below use pnpm). You must also have an Agora account and a project to use this application.
- Clone the repository:
git clone https://github.com/AgoraIO-Community/conversational-ai-nextjs-client
cd conversational-ai-nextjs-client
- Install dependencies:
pnpm install
- Create a .env.local file in the root directory and add your environment variables:
cp .env.local.example .env.local
The following environment variables are required:
Agora configuration:
- NEXT_PUBLIC_AGORA_APP_ID - Your Agora App ID
- NEXT_PUBLIC_AGORA_APP_CERTIFICATE - Your Agora App Certificate
- NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL - Agora Conversational AI Base URL
- NEXT_PUBLIC_AGORA_CUSTOMER_ID - Your Agora Customer ID
- NEXT_PUBLIC_AGORA_CUSTOMER_SECRET - Your Agora Customer Secret
- NEXT_PUBLIC_AGENT_UID - Agent UID (defaults to "Agent")
LLM configuration:
- NEXT_PUBLIC_LLM_URL - LLM API endpoint URL
- NEXT_PUBLIC_LLM_TOKEN - LLM API authentication token
- NEXT_PUBLIC_LLM_MODEL - LLM model to use (optional)
Choose one of the following TTS providers:
Microsoft TTS (NEXT_PUBLIC_TTS_VENDOR=microsoft):
- NEXT_PUBLIC_MICROSOFT_TTS_KEY - Microsoft TTS API key
- NEXT_PUBLIC_MICROSOFT_TTS_REGION - Microsoft TTS region
- NEXT_PUBLIC_MICROSOFT_TTS_VOICE_NAME - Voice name (optional, defaults to 'en-US-AndrewMultilingualNeural')
- NEXT_PUBLIC_MICROSOFT_TTS_RATE - Speech rate (optional, defaults to 1.0)
- NEXT_PUBLIC_MICROSOFT_TTS_VOLUME - Volume (optional, defaults to 100.0)
ElevenLabs TTS (NEXT_PUBLIC_TTS_VENDOR=elevenlabs):
- NEXT_PUBLIC_ELEVENLABS_API_KEY - ElevenLabs API key
- NEXT_PUBLIC_ELEVENLABS_VOICE_ID - ElevenLabs voice ID
- NEXT_PUBLIC_ELEVENLABS_MODEL_ID - Model ID (optional, defaults to 'eleven_flash_v2_5')
Modalities:
- NEXT_PUBLIC_INPUT_MODALITIES - Comma-separated list of input modalities (defaults to 'text')
- NEXT_PUBLIC_OUTPUT_MODALITIES - Comma-separated list of output modalities (defaults to 'text,audio')
- Run the development server:
pnpm dev
- Open your browser and navigate to http://localhost:3000 to see the application in action.
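Once the environment variables above are configured, they can be read through a small typed helper on the client. The sketch below is illustrative only; the file name and the exact shape are assumptions, not code from this repo:

```typescript
// lib/agent-config.ts (hypothetical file; this repo may organize its config differently)
// NEXT_PUBLIC_* variables are inlined by Next.js at build time, so they can be
// read directly from process.env in client code.

export const agoraConfig = {
  appId: process.env.NEXT_PUBLIC_AGORA_APP_ID ?? '',
  appCertificate: process.env.NEXT_PUBLIC_AGORA_APP_CERTIFICATE ?? '',
  convoAiBaseUrl: process.env.NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL ?? '',
  customerId: process.env.NEXT_PUBLIC_AGORA_CUSTOMER_ID ?? '',
  customerSecret: process.env.NEXT_PUBLIC_AGORA_CUSTOMER_SECRET ?? '',
  agentUid: process.env.NEXT_PUBLIC_AGENT_UID ?? 'Agent',
};

export const llmConfig = {
  url: process.env.NEXT_PUBLIC_LLM_URL ?? '',
  token: process.env.NEXT_PUBLIC_LLM_TOKEN ?? '',
  model: process.env.NEXT_PUBLIC_LLM_MODEL, // optional
};

export const ttsConfig = {
  vendor: process.env.NEXT_PUBLIC_TTS_VENDOR ?? 'microsoft',
  microsoft: {
    key: process.env.NEXT_PUBLIC_MICROSOFT_TTS_KEY ?? '',
    region: process.env.NEXT_PUBLIC_MICROSOFT_TTS_REGION ?? '',
    voiceName:
      process.env.NEXT_PUBLIC_MICROSOFT_TTS_VOICE_NAME ?? 'en-US-AndrewMultilingualNeural',
    rate: Number(process.env.NEXT_PUBLIC_MICROSOFT_TTS_RATE ?? 1.0),
    volume: Number(process.env.NEXT_PUBLIC_MICROSOFT_TTS_VOLUME ?? 100.0),
  },
  elevenLabs: {
    apiKey: process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY ?? '',
    voiceId: process.env.NEXT_PUBLIC_ELEVENLABS_VOICE_ID ?? '',
    modelId: process.env.NEXT_PUBLIC_ELEVENLABS_MODEL_ID ?? 'eleven_flash_v2_5',
  },
};

export const modalities = {
  input: (process.env.NEXT_PUBLIC_INPUT_MODALITIES ?? 'text').split(','),
  output: (process.env.NEXT_PUBLIC_OUTPUT_MODALITIES ?? 'text,audio').split(','),
};
```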
This project is configured for quick deployments to Vercel. Deploying to Vercel will:
- Clone the repository to your GitHub account
- Create a new project on Vercel
- Prompt you to fill in the required environment variables:
  - Required: Agora credentials (NEXT_PUBLIC_AGORA_APP_ID, NEXT_PUBLIC_AGORA_APP_CERTIFICATE, etc.)
  - Required: LLM API key (NEXT_PUBLIC_LLM_API_KEY) - OpenAI API key by default
  - Required: Either the Microsoft TTS key (NEXT_PUBLIC_MICROSOFT_TTS_KEY) or the ElevenLabs API key (NEXT_PUBLIC_ELEVENLABS_API_KEY)
  - Other variables have defaults if values are not provided
- Deploy the application automatically
Microsoft TTS voice options:
Male voices:
- en-US-AndrewMultilingualNeural (default)
- en-US-ChristopherNeural (casual, friendly)
- en-US-GuyNeural (professional)
- en-US-JasonNeural (clear, energetic)
- en-US-TonyNeural (enthusiastic)
Female voices:
- en-US-JennyNeural (assistant-like)
- en-US-AriaNeural (professional)
- en-US-EmmaNeural (friendly)
- en-US-SaraNeural (warm)
Try Microsoft voices: https://speech.microsoft.com/portal/voicegallery
Try ElevenLabs voices: https://elevenlabs.io/app/voice-lab
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
The application provides the following API endpoints:
- Endpoint: /api/generate-agora-token
- Method: GET
- Query Parameters:
  - uid (optional) - User ID (defaults to 0)
  - channel (optional) - Channel name (auto-generated if not provided)
- Response: Returns token, uid, and channel information
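For example, a client might request credentials before joining a channel. This is a minimal sketch assuming the response fields named above; it is not code from this repo:

```typescript
// Hypothetical client-side helper for the token endpoint described above.
// The response shape (token, uid, channel) follows the endpoint description.
export async function fetchAgoraToken(uid?: string | number, channel?: string) {
  const params = new URLSearchParams();
  if (uid !== undefined) params.set('uid', String(uid));
  if (channel) params.set('channel', channel);

  const res = await fetch(`/api/generate-agora-token?${params.toString()}`);
  if (!res.ok) throw new Error(`Token request failed: ${res.status}`);

  return (await res.json()) as { token: string; uid: string | number; channel: string };
}
```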
- Endpoint: /api/invite-agent
- Method: POST
- Body:
{
  requester_id: string;
  channel_name: string;
  input_modalities?: string[];
  output_modalities?: string[];
}
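A hedged example of calling this endpoint from the client (the payload mirrors the body above; the helper name and return handling are assumptions):

```typescript
// Hypothetical helper that asks the backend to invite the AI agent into a channel.
// Field names mirror the request body above; modality values are the documented defaults.
export async function inviteAgent(requesterId: string, channelName: string) {
  const res = await fetch('/api/invite-agent', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      requester_id: requesterId,
      channel_name: channelName,
      input_modalities: ['text'], // optional
      output_modalities: ['text', 'audio'], // optional
    }),
  });
  if (!res.ok) throw new Error(`Agent invite failed: ${res.status}`);
  return res.json();
}
```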
- Endpoint: /api/stop-conversation
- Method: POST
- Body:
{
  agent_id: string;
}
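And a matching sketch for ending the session (again illustrative, assuming the caller already holds the agent_id of the active conversation):

```typescript
// Hypothetical helper to stop a running agent conversation.
export async function stopConversation(agentId: string) {
  const res = await fetch('/api/stop-conversation', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ agent_id: agentId }),
  });
  if (!res.ok) throw new Error(`Stop request failed: ${res.status}`);
  return res.json();
}
```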