This project implements an Express.js chat service that acts as a custom LLM for the Agora Convo AI Engine, supporting multiple LLM providers including OpenAI's APIs and DataStax Langflow. It supports both streaming and non-streaming responses, function calling, and RAG (Retrieval-Augmented Generation).
This project also implements basic tools and a tool calling mechanism. The tools use the Agora Signaling Service (RTM) to send messages into a real-time messaging channel.
This service supports multiple LLM providers:
- OpenAI Chat Completions API - The standard OpenAI chat completions endpoint
- OpenAI Responses API - OpenAI's new Responses API with improved streaming
- DataStax Langflow - Visual AI flow builder with DataStax integration
The service now includes full support for DataStax Langflow, allowing you to:
- Connect to DataStax-hosted Langflow instances
- Use custom AI flows built with Langflow's visual interface
- Maintain session state across conversations
- Execute function calls from within Langflow flows
- Stream responses in real-time
To use Langflow, configure the following environment variables:
LANGFLOW_URL=https://api.langflow.astra.datastax.com/lf/your-instance
LANGFLOW_API_KEY=your_langflow_api_key
LANGFLOW_FLOW_ID=your_flow_id
Then update your route configuration to use the Langflow service instead of OpenAI.
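As a rough sketch (module paths and export names are assumptions based on the component list below, not the actual code), the provider switch could look like this:

```typescript
// Illustrative provider selection; adjust paths/exports to match the repo.
import { config } from './libs/utils'                        // assumed central config module
import * as langflowService from './services/langflowService'
import * as openaiService from './services/openaiCompletionsService'

// Pick the service module that matches the LLM_PROVIDER environment variable.
export function getLlmService() {
  return config.llm.provider === 'langflow' ? langflowService : openaiService
}
```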
graph LR
Client[Client] <--> |Voice/Text Stream| ConvoAI[Agora Convo AI]
ConvoAI --> |ASR Text| Server[Express Server]
Server --> |Auth| AuthMiddleware[Auth Middleware]
AuthMiddleware --> ChatRouter[Chat Router]
ChatRouter --> LLMService[LLM Service<br/>#40;OpenAI/Langflow#41;]
LLMService --> |Get Context| RagService[RAG Service]
RagService --> |Return Context| LLMService
LLMService --> |System Prompt + RAG + ASR Text| LLMProvider[LLM Provider<br/>#40;OpenAI/Langflow#41;]
LLMProvider --> |Response| LLMService
LLMService --> |Function Calls| Tools[Tools Service]
Tools --> |Agora RTM API| Agora[Agora Signaling Service]
LLMService --> |Response| Server
Server --> |Response| ConvoAI
ConvoAI --> |Audio + Text| Client
subgraph Services
LLMService
RagService
Tools
end
subgraph Config
Utils[Utils/Config]
ToolDefs[Tool Definitions]
end
Services --> Config
For a detailed diagram of the sequence flow, see the Sequence Flow section, and for more information on the entities, see the Component Details and Data Models sections.
This project can be deployed to Heroku, Netlify, Render, or Vercel.
Each platform requires the appropriate configuration:
- Heroku: Uses the `app.json` file and `Procfile`
- Netlify: Uses the `netlify.toml` file and the Netlify function in `netlify/functions/api.js`
- Render: Uses the `render.yaml` file
- Vercel: Uses the `vercel.json` file
- Install dependencies:
npm install
- Create environment variables file:
cp .env.example .env
- Configure the environment variables:
# Agora Configuration
AGORA_APP_ID=your_app_id
AGORA_APP_CERTIFICATE=your_certificate
AGORA_CUSTOMER_ID=your_customer_id
AGORA_CUSTOMER_SECRET=your_customer_secret
# Agent Configuration
AGENT_ID=your_agent_id
# LLM
LLM_PROVIDER=langflow # options: langflow or openai
#OpenAI
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4o-mini # or choose a different model
USE_RESPONSES_API=false # Use OpenAI Responses API instead of Chat Completions
# Langflow Configuration (for Langflow service)
LANGFLOW_URL=https://api.langflow.astra.datastax.com/lf/your-instance
LANGFLOW_API_KEY=your_langflow_api_key
LANGFLOW_FLOW_ID=your_flow_id
# Server Configuration
PORT=3000
- Start the server:
npm start
This server supports two different OpenAI API implementations:
- Chat Completions API - The standard OpenAI chat completions endpoint
- Responses API - OpenAI's new Responses API
For a detailed comparison of the two APIs, see OpenAI's Responses vs Chat Completions page.
You can switch between these APIs using the `USE_RESPONSES_API` environment variable:
# Use Responses API
USE_RESPONSES_API=true
# Use Chat Completions API
USE_RESPONSES_API=false
Both APIs provide similar functionality, but the Responses API offers a cleaner streaming model: it emits semantic events that describe exactly what changed (for example, a specific text addition), so you can write integrations that target only the event types you care about. The Chat Completions API instead appends to the content field as tokens are generated, which requires you to track the difference between successive states yourself.
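The difference is easiest to see in code. The sketch below uses the official openai Node SDK; the event names are those exposed by recent SDK versions, so verify them against the version you have installed:

```typescript
import OpenAI from 'openai'

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

// Chat Completions: each chunk carries the next token(s) in choices[0].delta.content,
// so the consumer accumulates the text itself.
async function streamChatCompletions() {
  const stream = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Hello!' }],
    stream: true,
  })
  let text = ''
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? ''
  }
  return text
}

// Responses API: the stream emits typed semantic events, so you can react only to
// the event kinds you care about (here, incremental text deltas).
async function streamResponses() {
  const stream = await client.responses.create({
    model: 'gpt-4o-mini',
    input: 'Hello!',
    stream: true,
  })
  let text = ''
  for await (const event of stream) {
    if (event.type === 'response.output_text.delta') text += event.delta
  }
  return text
}
```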
Use Docker to run this application:
# Build the Docker image
docker build -t agora-convo-ai-custom-llm .
# Run the container
docker run -p 3000:3000 --env-file .env agora-convo-ai-custom-llm
You can also use Docker Compose to run the application with all required services:
# Start the services
docker-compose up -d
# View logs
docker-compose logs -f
# Stop the services
docker-compose down
This microservice is meant to be used as a drop-in companion to the Agora Convo AI service. It acts as a middleware application that accepts ASR text and processes it before sending it to the configured LLM provider. While there is an exposed chat completion endpoint, you should only need to call it directly during initial testing.
Returns a simple "pong" message to check the server's health.
Request:
curl http://localhost:3000/ping
Response:
{ "message": "pong" }
Handles chat completion requests with optional streaming support.
Request Body:
{
"messages": [{ "role": "user", "content": "Hello!" }],
"model": "gpt-4o-mini",
"stream": false,
"channel": "default",
"userId": "user123",
"appId": "app123"
}
Example Request:
curl -X POST http://localhost:3000/v1/chat/completion \
-H "Authorization: Bearer <your-llm-api-key>" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello!"}]}'
To test the LLM locally, we recommend using the `ngrok` tool to expose your local server to the internet:
ngrok http localhost:3000
Once the tunnel is running, use the generated ngrok URL to send requests to the LLM:
curl -X POST https://<ngrok-url>/v1/chat/completion \
-H "Authorization: Bearer <your-llm-api-key>" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello!"}]}'
Response:
- Non-streaming: JSON response with completion
- Streaming: Server-sent events (SSE) with completion chunks
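Because responses are formatted to match the OpenAI Chat Completions structure, a non-streaming reply looks roughly like this (all values are illustrative):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hi! How can I help you today?" },
      "finish_reason": "stop"
    }
  ]
}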
sequenceDiagram
participant C as Client
participant CA as Agora Convo AI
participant ASR as ASR Service
participant S as Express Server
participant A as Auth Middleware
participant O as OpenAI Service
participant R as RAG Service
participant T as Tools Service
participant AI as OpenAI API
participant AG as Agora RTM
C->>CA: Stream Audio
CA->>ASR: Process Audio
ASR->>CA: Return Text
CA->>S: POST /chat/completion
S->>A: Validate Token
A->>S: Token Valid
S->>O: Process Chat Completion
O->>R: Request Context
R-->>O: Return RAG Data
O->>AI: Send System Prompt + RAG + ASR Text
AI-->>O: Return Response
alt Function Call Required
O->>T: Execute Function
T->>AG: Send RTM Message
AG-->>T: Confirm Message
T-->>O: Return Result
O->>AI: Send Updated Context
AI-->>O: Return Final Response
end
O->>S: Return Response
S->>CA: Send Response
CA->>C: Stream Audio + Text Response
1. Server (server.ts)
- Main Express application entry point
- Configures middleware (helmet, cors, morgan, json parser)
- Mounts chat routes and health check endpoint
2. Chat Completion Router (chatCompletion.ts)
- Handles POST requests to /chat/completion
- Validates request parameters
- Manages both streaming and non-streaming responses
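For the streaming case, the router can relay chunks as server-sent events; a minimal sketch (not the actual handler in chatCompletion.ts):

```typescript
import { Response } from 'express'

// Forward OpenAI-style chunks to the client as server-sent events (SSE).
async function writeSse(res: Response, chunks: AsyncIterable<unknown>) {
  res.setHeader('Content-Type', 'text/event-stream')
  res.setHeader('Cache-Control', 'no-cache')
  res.setHeader('Connection', 'keep-alive')

  for await (const chunk of chunks) {
    res.write(`data: ${JSON.stringify(chunk)}\n\n`)
  }
  res.write('data: [DONE]\n\n')
  res.end()
}
```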
3. Authentication (auth.ts)
- Middleware for token-based authentication
- Validates Bearer tokens against configuration
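A minimal sketch of that Bearer-token check, assuming the expected token is read from the central config (the real auth.ts may differ):

```typescript
import { Request, Response, NextFunction } from 'express'
import { config } from '../libs/utils'   // assumed config module

export function validateToken(req: Request, res: Response, next: NextFunction) {
  const header = req.headers.authorization ?? ''
  const token = header.startsWith('Bearer ') ? header.slice('Bearer '.length) : ''

  // Reject requests whose token does not match the configured key.
  if (!token || token !== config.agora.authToken) {
    return res.status(401).json({ error: 'Unauthorized' })
  }
  next()
}
```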
4. LLM Services
The application supports multiple LLM providers through dedicated service modules:
- OpenAI Completions (openaiCompletionsService.ts) - Standard Chat Completions API
- OpenAI Responses (openaiResponsesService.ts) - Advanced Responses API with improved streaming
- Langflow Service (langflowService.ts) - DataStax Langflow integration
- Connects to Langflow flows via the DataStax Langflow client
- Maintains session state across conversations
- Supports both streaming and non-streaming responses
- Integrates with function calling mechanism
- Formats responses to match OpenAI Chat Completions API structure
All services provide:
- RAG integration through the RAG Service
- Function calling capabilities
- Streaming and non-streaming response modes
- Compatible response formatting
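Conceptually, each provider module exposes the same contract so the chat router stays provider-agnostic; a sketch of that shared shape (names are illustrative, not the actual exports):

```typescript
export interface LLMServiceOptions {
  stream: boolean
  channel: string
  userId: string
  appId: string
}

// Every provider (OpenAI Completions, OpenAI Responses, Langflow) accepts the same
// inputs and returns either a full completion or an async stream of chunks.
export interface LLMService {
  processChatCompletion(
    messages: { role: string; content: string }[],
    options: LLMServiceOptions,
  ): Promise<unknown> | AsyncIterable<unknown>
}
```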
5. RAG Service (ragService.ts)
- Provides retrieval augmented generation data
- Maintains hardcoded knowledge base
- Formats data for system prompts
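A minimal sketch of what a hardcoded knowledge base formatted for the system prompt can look like (the real documents in ragService.ts differ):

```typescript
// Hardcoded knowledge base keyed by document id (contents are illustrative).
const ragData: Record<string, string> = {
  doc1: 'Agora Convo AI connects real-time audio to an LLM backend.',
  doc2: 'This service exposes an OpenAI-compatible /chat/completion endpoint.',
}

// Flatten the documents into a block of text appended to the system prompt.
export function formatRagContext(): string {
  return Object.entries(ragData)
    .map(([id, text]) => `[${id}] ${text}`)
    .join('\n')
}
```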
6. Tools Service (tools.ts)
- Implements function calling capabilities
- Handles Agora RTM integration
- Provides utility functions (sendPhoto, orderSandwich)
7. Tool Definitions (toolDefinitions.ts)
- Defines available functions for LLM
- Specifies function parameters and schemas
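Tool definitions follow the OpenAI function-calling schema; a sketch of how sendPhoto and orderSandwich might be declared (parameter names are illustrative):

```typescript
// Function schemas advertised to the LLM. The model decides when to call them;
// the Tools Service executes the call and relays results over Agora RTM.
export const toolDefinitions = [
  {
    name: 'sendPhoto',
    description: 'Send a photo to the user over the signaling channel',
    parameters: {
      type: 'object',
      properties: {
        subject: { type: 'string', description: 'What the photo should show' },
      },
      required: ['subject'],
    },
  },
  {
    name: 'orderSandwich',
    description: 'Place a sandwich order on behalf of the user',
    parameters: {
      type: 'object',
      properties: {
        filling: { type: 'string', description: 'Requested sandwich filling' },
      },
      required: ['filling'],
    },
  },
]
```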
8. Utils (utils.ts)
- Manages configuration and environment variables
- Validates required settings
- Provides centralized config object
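A sketch of centralized config loading and validation, following the shape of the data models below (exact fields in utils.ts may differ):

```typescript
import dotenv from 'dotenv'
dotenv.config()

// Fail fast if a required variable is missing.
function required(name: string): string {
  const value = process.env[name]
  if (!value) throw new Error(`Missing required environment variable: ${name}`)
  return value
}

export const config = {
  port: Number(process.env.PORT ?? 3000),
  agentId: process.env.AGENT_ID ?? '',
  agora: {
    appId: required('AGORA_APP_ID'),
    appCertificate: required('AGORA_APP_CERTIFICATE'),
  },
  llm: {
    provider: process.env.LLM_PROVIDER ?? 'openai',
    openaiApiKey: process.env.OPENAI_API_KEY ?? '',
    model: process.env.OPENAI_MODEL ?? 'gpt-4o-mini',
    useResponsesApi: process.env.USE_RESPONSES_API === 'true',
  },
  langflow: {
    url: process.env.LANGFLOW_URL ?? '',
    apiKey: process.env.LANGFLOW_API_KEY ?? '',
    flowId: process.env.LANGFLOW_FLOW_ID ?? '',
  },
}
```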
classDiagram
class Config {
+port: number
+agora: AgoraConfig
+llm: LLMConfig
+langflow: LangflowConfig
+agentId: string
}
class AgoraConfig {
+appId: string
+appCertificate: string
+authToken: string
}
class LLMConfig {
+openaiApiKey: string
+model: string
+useResponsesApi: boolean
}
class LangflowConfig {
+url: string
+apiKey: string
+flowId: string
}
class ChatMessage {
+role: string
+content: string
+name?: string
+function_call?: FunctionCall
}
class FunctionDefinition {
+name: string
+description: string
+parameters: FunctionParameter
}
class RagData {
+doc1: string
+doc2: string
+doc3: string
+doc4: string
}
Config -- AgoraConfig
Config -- LLMConfig
Config -- LangflowConfig