Inferencing common data type structure #221
- Improved VideoSDK Inference Gateway
feat: add no_participant_timeout_seconds to auto-end session when no participant joins #224
- When the agent joins a meeting but no participant connects within the configured timeout (default 90s), the session now ends automatically instead of staying connected indefinitely.
- Add no_participant_timeout_seconds param to RoomOptions
agent participant implementation for #215
Force cleanup after jobctx is cleaned up #223
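
The auto-end behaviour added in #224 can be pictured with a small asyncio sketch. This is not the SDK's implementation: `wait_for_participant`, `run_session`, and the `asyncio.Event` used to signal a join are hypothetical stand-ins, and only the parameter name `no_participant_timeout_seconds` and its 90-second default come from the notes above. With the SDK itself, the value would be passed to `RoomOptions`.

```python
import asyncio


async def wait_for_participant(joined: asyncio.Event, timeout_seconds: float) -> bool:
    """Return True if a participant joined before the timeout, else False."""
    try:
        await asyncio.wait_for(joined.wait(), timeout=timeout_seconds)
        return True
    except asyncio.TimeoutError:
        return False


async def run_session(joined: asyncio.Event, no_participant_timeout_seconds: float = 90) -> str:
    # Hypothetical session loop: end early when nobody joins in time.
    if not await wait_for_participant(joined, no_participant_timeout_seconds):
        return "ended: no participant joined"
    return "running: participant joined"


async def demo() -> tuple:
    empty_room = asyncio.Event()   # never set, so the wait times out
    busy_room = asyncio.Event()
    busy_room.set()                # a participant is already present
    return (
        await run_session(empty_room, no_participant_timeout_seconds=0.05),
        await run_session(busy_room, no_participant_timeout_seconds=0.05),
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The demo uses a 0.05-second timeout only so that it finishes quickly.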

05 Mar 11:07

DeepBhupatkar

1.0.0b1

472e1a1

Agents 1.0.0b1 Pre-release

⚠️ Beta Release

This release introduces the new unified pipeline architecture for the VideoSDK Agents framework.

It is not backward compatible with v0.x (up to v0.0.67).

This beta release is intended for testing the new architecture before the stable 1.0.0 release.

Unified Pipeline Architecture

CascadingPipeline and RealtimePipeline have been replaced with a single Pipeline class.

Developers now configure one pipeline, and the SDK automatically determines the optimal execution mode based on the components provided.

Before (v0.0.67 and earlier)

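from videosdk.agents import CascadingPipeline, RealtimePipeline
# OpenAIRealtime is provided by the OpenAI plugin package
from videosdk.plugins.openai import OpenAIRealtime

# Cascading pipeline: separate STT, LLM, and TTS stages
pipeline = CascadingPipeline(
    stt=...,
    llm=...,
    tts=...,
    vad=...,
    turn_detector=...
)

# Realtime pipeline: a single realtime model
pipeline = RealtimePipeline(
    llm=OpenAIRealtime(...)
)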

After (v1.0.0b1)

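from videosdk.agents import Pipeline
# OpenAIRealtime is provided by the OpenAI plugin package
from videosdk.plugins.openai import OpenAIRealtime

# Full voice agent: same components as the old CascadingPipeline
pipeline = Pipeline(
    stt=...,
    llm=...,
    tts=...,
    vad=...,
    turn_detector=...
)

# Realtime voice agent: same component as the old RealtimePipeline
pipeline = Pipeline(
    llm=OpenAIRealtime(...)
)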

Flexible Agent Composition

The new Pipeline allows you to build different types of agents depending on your use case.

Transcription Agent

Pipeline(stt=...)

Voice + Chat Agent

Pipeline(stt=..., llm=..., tts=...)

Full Voice Agent with Turn Detection

Pipeline(stt=..., llm=..., tts=..., vad=..., turn_detector=...)

Chatbot (Text Only)

Pipeline(llm=...)

Realtime Voice Agent

Pipeline(llm=OpenAIRealtime(...))

You simply include the components you need, and the SDK handles the rest.
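
To make the mapping from components to agent type concrete, here is a minimal sketch that classifies a component combination into the five agent types listed above. It is illustrative only: `describe_agent` and the `llm_is_realtime` flag are hypothetical, the components are plain placeholder strings, and the notes do not say how the SDK actually selects its execution mode.

```python
from typing import Optional


def describe_agent(
    stt: Optional[object] = None,
    llm: Optional[object] = None,
    tts: Optional[object] = None,
    vad: Optional[object] = None,
    turn_detector: Optional[object] = None,
    *,
    llm_is_realtime: bool = False,  # assumption: the caller knows the model kind
) -> str:
    """Name the agent type implied by the components that were supplied."""
    if llm is not None and llm_is_realtime:
        return "realtime voice agent"
    has_voice_chain = stt is not None and llm is not None and tts is not None
    if has_voice_chain and vad is not None and turn_detector is not None:
        return "full voice agent with turn detection"
    if has_voice_chain:
        return "voice + chat agent"
    if stt is not None and llm is None and tts is None:
        return "transcription agent"
    if llm is not None and stt is None and tts is None:
        return "chatbot (text only)"
    return "unclassified combination"


if __name__ == "__main__":
    print(describe_agent(stt="stt"))
    print(describe_agent(stt="stt", llm="llm", tts="tts"))
    print(describe_agent(llm="llm"))
```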

Conversational Flow Removal

The previous Conversational Flow abstraction has been removed.

Instead of defining conversation flows separately, developers can now directly control behavior using Pipeline Hooks.

This gives full flexibility to intercept and modify data at any stage of the pipeline.

Pipeline Hooks

Customize pipeline behavior using:

@pipeline.on(...)

Available hook points include:

stt
tts
llm
vision_frame
user_turn_start
user_turn_end
agent_turn_start
agent_turn_end
content_generated

Hooks let you preprocess input, modify outputs, control LLM invocation, and implement custom logic directly inside the pipeline, for example:

custom preprocessing
business logic
LLM routing
speech formatting
pronunciation correction
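
The decorator style used by `@pipeline.on(...)` can be illustrated with a small stand-alone registry. This is a sketch of the pattern, not the SDK's `Pipeline`: `MiniPipeline`, `run_hooks`, and the string-in, string-out hook signature are assumptions made for the example, and the real hooks may be async or receive richer objects.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Hook = Callable[[str], str]


class MiniPipeline:
    """Hypothetical stand-in used to show the hook pattern."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Hook]] = defaultdict(list)

    def on(self, hook_point: str) -> Callable[[Hook], Hook]:
        """Return a decorator that registers a function for hook_point."""
        def register(fn: Hook) -> Hook:
            self._hooks[hook_point].append(fn)
            return fn
        return register

    def run_hooks(self, hook_point: str, value: str) -> str:
        """Pass value through every hook registered for hook_point, in order."""
        for fn in self._hooks[hook_point]:
            value = fn(value)
        return value


pipeline = MiniPipeline()


@pipeline.on("stt")
def normalize_transcript(text: str) -> str:
    # Custom preprocessing: collapse whitespace and lowercase the transcript.
    return " ".join(text.split()).lower()


@pipeline.on("tts")
def fix_pronunciation(text: str) -> str:
    # Pronunciation correction: spell out a name before synthesis.
    return text.replace("VideoSDK", "Video S D K")


if __name__ == "__main__":
    print(pipeline.run_hooks("stt", "  Hello   THERE "))
    print(pipeline.run_hooks("tts", "Welcome to VideoSDK"))
```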

API Changes

Previous (v0.x)	Replacement (v1.0.0b1)
`CascadingPipeline`	Use `Pipeline`
`RealtimePipeline`	Use `Pipeline`
`ConversationalFlow`	Use Pipeline Hooks (`@pipeline.on(...)`)

Instead of defining conversational flows separately, you can now implement custom logic directly using pipeline hooks:

@pipeline.on("stt")
@pipeline.on("llm")
@pipeline.on("tts")

This provides more flexibility to control preprocessing, LLM invocation, and speech synthesis behavior directly within the pipeline.

Other features such as function tools, Agent lifecycle, AgentSession, and WorkerJob continue to work as before.

Migration Guide

Replace:

CascadingPipeline(...)
RealtimePipeline(...)

with:

Pipeline(...)

Constructor arguments remain the same. Pass your components and the SDK will handle execution automatically.
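
For larger codebases, the rename can be scripted. The following is a rough sketch of such a codemod using only the standard library. `migrate_source` is a hypothetical helper, not a tool shipped with the SDK. It rewrites names textually, so it also touches comments and strings and would collapse any other comma-separated `Pipeline, Pipeline` sequence. It does not handle the removed ConversationalFlow abstraction.

```python
import re

# Whole-word matches for the two classes removed in 1.0.0b1.
_OLD_NAMES = re.compile(r"\b(?:CascadingPipeline|RealtimePipeline)\b")
# "Pipeline, Pipeline" appears when both old names shared one import line.
_DUPLICATES = re.compile(r"\bPipeline(?:\s*,\s*Pipeline\b)+")


def migrate_source(source: str) -> str:
    """Rename the old pipeline classes to Pipeline and tidy duplicate imports."""
    renamed = _OLD_NAMES.sub("Pipeline", source)
    return _DUPLICATES.sub("Pipeline", renamed)


if __name__ == "__main__":
    before = (
        "from videosdk.agents import CascadingPipeline, RealtimePipeline\n"
        "pipeline = CascadingPipeline(stt=stt, llm=llm, tts=tts)\n"
    )
    print(migrate_source(before))
```

Review the resulting diff before committing it.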

03 Mar 10:53

DeepBhupatkar

v0.0.67

d45b7c2

Agents v0.0.67

Fix/calculations #216
- Metrics improvements and refinements
fix: prevent duplicate MCP tools by only extending with newly added tools #217
Updates Cambai Plugin #218
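
The idea behind the duplicate-tool fix in #217, extending a tool list only with entries that are not already present, can be sketched as follows. This is not the code from the PR: `extend_with_new_tools`, the dictionary representation of a tool, and the choice of `name` as the identity key are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List

Tool = Dict[str, str]


def extend_with_new_tools(registered: List[Tool], discovered: Iterable[Tool]) -> List[Tool]:
    """Append only tools whose name is not registered yet; return what was added."""
    known = {tool["name"] for tool in registered}
    added: List[Tool] = []
    for tool in discovered:
        if tool["name"] not in known:
            registered.append(tool)
            known.add(tool["name"])  # also guards against repeats within discovered
            added.append(tool)
    return added


if __name__ == "__main__":
    tools = [{"name": "search"}]
    new = extend_with_new_tools(tools, [{"name": "search"}, {"name": "weather"}])
    print([t["name"] for t in tools], [t["name"] for t in new])
```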

26 Feb 13:51

DeepBhupatkar

v0.0.66

9f90027

Agents v0.0.66

deprecated: remove the CLI in versions >= 0.0.65 #210
Remove unused parameter #205
feat: upgrade Simli AI plugin to SDK v2.0 #206
feat: enhance TTS and STT plugins with additional configuration options #212
- Cartesia, Deepgram, ElevenLabs, and Google plugin updates.

23 Feb 13:14

DeepBhupatkar

v0.0.65

c4d4964

Agents v0.0.65

feat: add Anam virtual avatar support (#188)
- Add support for Anam Virtual Avatar with VideoSDK AI Agents
chore: update Sarvam plugin (#199)
- Upgrade Sarvam AI plugin to the latest configuration provided by Sarvam (STT, LLM, TTS)
feat: add Camb AI TTS plugin (#200)
- Add support for Camb AI Text-to-Speech (TTS)
fix: TURN-D handling in cascading pipeline (#203)
- Fix issue where missing TURN-D caused unintended agent execution
fix(rnnoise): resolve segfaults and init errors on ARM64/Linux (#202)
- Fix rnnoise build issues on Linux ARM64 and resolve segmentation faults and initialization errors
- rnnoise is now supported across all platforms

17 Feb 06:10

DeepBhupatkar

v0.0.64

b36d02a

Agents v0.0.64

fix: Linux rnnoise issue #186
fix: allow session.say inside tools without interrupting LLM #189
feat: add keywords, keyterm prompting, and language param #193
Feature/krisp deepgram #194
- Add support for Denoise in VideoSDK inference for the providers Krisp, ai-coustics, and Sanas.
- Add Deepgram TTS support in VideoSDK inference.
fix(turn-detector): download model once and use cache #198
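
The download-once behaviour described in #198 follows a common caching pattern, sketched below with the standard library. It is not the turn detector's actual code: `ensure_model`, the injected `download` callable, and the file name `turn_detector.onnx` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_model(cache_dir: Path, filename: str, download: Callable[[], bytes]) -> Path:
    """Return the cached model path, calling download() only if the file is missing."""
    target = cache_dir / filename
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        partial = cache_dir / (filename + ".part")
        partial.write_bytes(download())
        partial.replace(target)  # rename so a half-written file is never used
    return target


calls = []


def fake_download() -> bytes:
    # Stand-in for a network fetch; records how often it was invoked.
    calls.append(1)
    return b"weights"


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_model(Path(tmp), "turn_detector.onnx", fake_download)
    second = ensure_model(Path(tmp), "turn_detector.onnx", fake_download)
    cached_bytes = second.read_bytes()

print(first == second, len(calls))
```

Writing to a `.part` file and renaming it afterwards means an interrupted download leaves no file at the final path, so the next run retries instead of loading a truncated model.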

Releases: videosdk-live/agents
