Commit 9bb68cf

d3xvn and Nash0x7E2 authored
feat: add AWS Bedrock function calling implementation (#120)
* Add AWS Bedrock function calling implementation
  - Implemented function calling for AWS Bedrock Realtime (Nova Sonic)
  - Added tool schema conversion to AWS Nova format
  - Implemented tool execution handlers for realtime API
  - Added audio resampling fix for simple_audio_response
  - Created example demonstrating function calling with AWS LLM
  - Updated README with function calling documentation
  - Added test for AWS Realtime function calling

  Note: AWS Nova Realtime toolConfiguration causes connection errors, likely an API limitation. Implementation is ready for when AWS adds support.
* Update converse format. Verbose logging
* Clean up for LLM
* Working realtime function calling (pre cleanup)
* Squashed commit of the following:

  commit d78a4a0
  Author: Dan Gusev <[email protected]>
  Date: Thu Oct 30 20:44:19 2025 +0100

  Logging cleanup (#133)

  * logging: use logging.getLogger(__name__) everywhere to simplify configuration
  * Clean up logging everywhere
    - Replaced "logging.info" usages with separate loggers
    - Lowered some info messages to debug
    - Replaced prints with logging
    - Added emojis where they're already used
  * Enable default logging for the SDK logs
    - Added a way to set the SDK log level at the Agent class
    - Set the default formatter and added level-based coloring
    - If the logs are already configured, they remain intact
    - Moved logging_utils.py to utils/logging.py
  * Remove "name" from the default logging formatter

  ---------

  Co-authored-by: Thierry Schellenbach <[email protected]>
* Lint and formatting after merge
* Remove resample_audio in favor of PCM.resample
* Migrate RealtimeAudioOutputEvent to _emit_audio_output_event
* unused import
* Fix linting: remove unused imports
* Clean up excessive logging in AWS Bedrock realtime
  - Reduce connection setup logs from 7+ to 2 INFO messages
  - Change most event handling logs to DEBUG level instead of INFO
  - Remove duplicate 'Response processing error' log message
  - Remove unused _pending_tool_uses dictionary
  - Remove redundant debug logs in send_raw_event
* Fix mypy type errors in AWS LLM plugin
  - Add type annotation for tool_calls list
  - Use cast() to properly type NormalizedToolCallItem lists for _dedup_and_execute
  - Import cast from typing module

---------

Co-authored-by: Neevash Ramdial (Nash) <[email protected]>
1 parent d78a4a0 commit 9bb68cf

8 files changed

+2227
-1769
lines changed
plugins/aws/README.md

Lines changed: 61 additions & 7 deletions
@@ -1,6 +1,6 @@
 # AWS Plugin for Vision Agents

-AWS (Bedrock) LLM integration for Vision Agents framework with support for both standard and realtime interactions. Includes AWS Polly TTS.
+AWS (Bedrock) LLM integration for Vision Agents framework with support for both standard and realtime interactions.

 ## Installation

@@ -20,17 +20,19 @@ agent = Agent(
     agent_user=User(name="Friendly AI"),
     instructions="Be nice to the user",
     llm=aws.LLM(model="qwen.qwen3-32b-v1:0"),
-    tts=aws.TTS(), # using AWS Polly
+    tts=cartesia.TTS(),
     stt=deepgram.STT(),
     turn_detection=smart_turn.TurnDetection(buffer_duration=2.0, confidence_threshold=0.5),
 )
 ```

 The full example is available in example/aws_qwen_example.py

-Nova sonic audio realtime STS is also supported:
+### Realtime Audio Usage

-```python
+Nova Sonic audio realtime STS is also supported:
+
+```python
 agent = Agent(
     edge=getstream.Edge(),
     agent_user=User(name="Story Teller AI"),
@@ -39,8 +41,61 @@ agent = Agent(
 )
 ```

-### Polly TTS Usage
+## Function Calling
+
+### Standard LLM (aws.LLM)
+
+The standard LLM implementation **fully supports** function calling. Register functions using the `@llm.register_function` decorator:
+
+```python
+from vision_agents.plugins import aws
+
+llm = aws.LLM(
+    model="qwen.qwen3-32b-v1:0",
+    region_name="us-east-1"
+)
+
+@llm.register_function(
+    name="get_weather",
+    description="Get the current weather for a given city"
+)
+def get_weather(city: str) -> dict:
+    """Get weather information for a city."""
+    return {
+        "city": city,
+        "temperature": 72,
+        "condition": "Sunny"
+    }
+```

+### Realtime (aws.Realtime)
+
+The Realtime implementation **fully supports** function calling with AWS Nova Sonic. Register functions using the `@llm.register_function` decorator:
+
+```python
+from vision_agents.plugins import aws
+
+llm = aws.Realtime(
+    model="amazon.nova-sonic-v1:0",
+    region_name="us-east-1"
+)
+
+@llm.register_function(
+    name="get_weather",
+    description="Get the current weather for a given city"
+)
+def get_weather(city: str) -> dict:
+    """Get weather information for a city."""
+    return {
+        "city": city,
+        "temperature": 72,
+        "condition": "Sunny"
+    }
+
+# The function will be automatically called when the model decides to use it
+```
+
+See `example/aws_realtime_function_calling_example.py` for a complete example.

 ## Running the examples

@@ -53,9 +108,8 @@ STREAM_API_SECRET=your_stream_api_secret_here
 AWS_BEARER_TOKEN_BEDROCK=
 AWS_ACCESS_KEY_ID=
 AWS_SECRET_ACCESS_KEY=
-AWS_REGION=us-east-1

 FAL_KEY=
 CARTESIA_API_KEY=
 DEEPGRAM_API_KEY=
-```
+```
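
The commit message mentions "tool schema conversion" for registered functions. As a rough illustration of what such a conversion can look like, the sketch below derives a JSON Schema from a function's type hints and wraps it in the `toolSpec` shape used by the Bedrock Converse API. `ToolRegistry`, its methods, and the type mapping are hypothetical names made up for this sketch; they are not the SDK's classes, and the plugin's actual conversion may differ.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    dict: "object",
    list: "array",
}


class ToolRegistry:
    """Hypothetical stand-in for a function registry (not the SDK's class)."""

    def __init__(self) -> None:
        self._tools: dict[str, dict[str, Any]] = {}

    def register_function(self, name: str, description: str) -> Callable:
        """Decorator that records the function and a schema built from its signature."""

        def decorator(func: Callable) -> Callable:
            hints = get_type_hints(func)
            properties: dict[str, Any] = {}
            required: list[str] = []
            for param_name, param in inspect.signature(func).parameters.items():
                # Unannotated or unknown types fall back to "string".
                json_type = _JSON_TYPES.get(hints.get(param_name, str), "string")
                properties[param_name] = {"type": json_type}
                if param.default is inspect.Parameter.empty:
                    required.append(param_name)
            self._tools[name] = {
                "func": func,
                "description": description,
                "schema": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            }
            return func

        return decorator

    def to_converse_tool_config(self) -> dict[str, Any]:
        """Build a Converse-style toolConfig dictionary from the registered tools."""
        return {
            "tools": [
                {
                    "toolSpec": {
                        "name": name,
                        "description": tool["description"],
                        "inputSchema": {"json": tool["schema"]},
                    }
                }
                for name, tool in self._tools.items()
            ]
        }

    def call(self, name: str, arguments: dict[str, Any]) -> Any:
        """Run a registered function with the arguments the model supplied."""
        return self._tools[name]["func"](**arguments)


registry = ToolRegistry()


@registry.register_function(
    name="get_weather",
    description="Get the current weather for a given city",
)
def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 72, "condition": "Sunny"}


tool_config = registry.to_converse_tool_config()
result = registry.call("get_weather", {"city": "Boulder"})
```

Deriving the schema from type hints keeps the decorator call short, at the cost of only handling simple scalar and container annotations; anything richer would need an explicit schema.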
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+import asyncio
+import logging
+from uuid import uuid4
+
+from dotenv import load_dotenv
+
+from vision_agents.core import User
+from vision_agents.core.agents import Agent
+from vision_agents.plugins import aws, getstream, cartesia, deepgram
+
+load_dotenv()
+
+logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s [call_id=%(call_id)s] %(name)s: %(message)s")
+logger = logging.getLogger(__name__)
+
+
+async def start_agent() -> None:
+    agent = Agent(
+        edge=getstream.Edge(),
+        agent_user=User(name="Weather Bot"),
+        instructions="You are a helpful weather bot. Use the provided tools to answer questions.",
+        llm=aws.LLM(
+            model="anthropic.claude-3-sonnet-20240229-v1:0",
+            region_name="us-east-1"
+
+        ),
+        tts=cartesia.TTS(),
+        stt=deepgram.STT(),
+        # turn_detection=smart_turn.TurnDetection(buffer_duration=2.0, confidence_threshold=0.5),
+    )
+
+    # Register custom functions
+    @agent.llm.register_function(
+        name="get_weather",
+        description="Get the current weather for a given city"
+    )
+    def get_weather(city: str) -> dict:
+        """Get weather information for a city."""
+        logger.info(f"Tool: get_weather called for city: {city}")
+        if city.lower() == "boulder":
+            return {"city": city, "temperature": 72, "condition": "Sunny"}
+        return {"city": city, "temperature": "unknown", "condition": "unknown"}
+
+    @agent.llm.register_function(
+        name="calculate",
+        description="Performs a mathematical calculation"
+    )
+    def calculate(expression: str) -> dict:
+        """Performs a mathematical calculation."""
+        logger.info(f"Tool: calculate called with expression: {expression}")
+        try:
+            result = eval(expression)  # DANGER: In a real app, use a safer math evaluator!
+            return {"expression": expression, "result": result}
+        except Exception as e:
+            return {"expression": expression, "error": str(e)}
+
+    await agent.create_user()
+
+    call = agent.edge.client.video.call("default", str(uuid4()))
+    await agent.edge.open_demo(call)
+
+    with await agent.join(call):
+        # Give the agent a moment to connect
+        await asyncio.sleep(5)
+
+        # Test function calling with weather
+        logger.info("Testing weather function...")
+        await agent.llm.simple_response(
+            text="What's the weather like in Boulder? Please use the get_weather function."
+        )
+
+        await asyncio.sleep(5)
+
+        # Test function calling with calculation
+        logger.info("Testing calculation function...")
+        await agent.llm.simple_response(
+            text="Can you calculate 25 multiplied by 4 using the calculate function?"
+        )
+
+        await asyncio.sleep(5)
+
+        # Wait a bit before finishing
+        await asyncio.sleep(5)
+        await agent.finish()
+
+
+if __name__ == "__main__":
+    asyncio.run(start_agent())
+
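
The `calculate` tool in this example flags its own `eval` call as dangerous, since the expression string comes from the model. One possible replacement, sketched here with the standard library's `ast` module, walks the parsed expression and allows only numeric literals and basic arithmetic. `safe_eval` and the operator whitelist are assumptions made for illustration, not part of the commit; exponentiation is left out on purpose because it makes very large computations easy to trigger.

```python
import ast
import operator

# Whitelisted operators; anything else in the expression is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept int/float literals only (bool is a subclass of int, so exclude it).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


def calculate(expression: str) -> dict:
    """Same return shape as the example's tool, backed by safe_eval."""
    try:
        return {"expression": expression, "result": safe_eval(expression)}
    except (ValueError, SyntaxError, ZeroDivisionError) as exc:
        return {"expression": expression, "error": str(exc)}
```

With this version, `calculate("25 * 4")` returns a result of `100`, while an input such as `__import__('os')` yields an `error` entry instead of executing anything.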
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
+import asyncio
+import logging
+from uuid import uuid4
+
+from dotenv import load_dotenv
+
+from vision_agents.core import User
+from vision_agents.core.agents import Agent
+from vision_agents.plugins import aws, getstream
+
+load_dotenv()
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s %(levelname)s [call_id=%(call_id)s] %(name)s: %(message)s"
+)
+logger = logging.getLogger(__name__)
+
+
+async def start_agent() -> None:
+    """Example demonstrating AWS Bedrock realtime with function calling.
+
+    This example creates an agent that can call custom functions to get
+    weather information and perform calculations.
+    """
+
+    # Create the agent with AWS Bedrock Realtime
+    agent = Agent(
+        edge=getstream.Edge(),
+        agent_user=User(name="Weather Assistant AI"),
+        instructions="""You are a helpful weather assistant. When users ask about weather,
+        use the get_weather function to fetch current conditions. You can also help with
+        simple calculations using the calculate function.""",
+        llm=aws.Realtime(
+            model="amazon.nova-sonic-v1:0",
+            region_name="us-east-1",
+        ),
+    )
+
+    # Register custom functions that the LLM can call
+    @agent.llm.register_function(
+        name="get_weather",
+        description="Get the current weather for a given city"
+    )
+    def get_weather(city: str) -> dict:
+        """Get weather information for a city.
+
+        Args:
+            city: The name of the city
+
+        Returns:
+            Weather information including temperature and conditions
+        """
+        # This is a mock implementation - in production you'd call a real weather API
+        weather_data = {
+            "Boulder": {"temp": 72, "condition": "Sunny", "humidity": 30},
+            "Seattle": {"temp": 58, "condition": "Rainy", "humidity": 85},
+            "Miami": {"temp": 85, "condition": "Partly Cloudy", "humidity": 70},
+        }
+
+        city_weather = weather_data.get(city, {"temp": 70, "condition": "Unknown", "humidity": 50})
+        return {
+            "city": city,
+            "temperature": city_weather["temp"],
+            "condition": city_weather["condition"],
+            "humidity": city_weather["humidity"],
+            "unit": "Fahrenheit"
+        }
+
+    @agent.llm.register_function(
+        name="calculate",
+        description="Perform a mathematical calculation"
+    )
+    def calculate(operation: str, a: float, b: float) -> dict:
+        """Perform a calculation.
+
+        Args:
+            operation: The operation to perform (add, subtract, multiply, divide)
+            a: First number
+            b: Second number
+
+        Returns:
+            Result of the calculation
+        """
+        operations = {
+            "add": lambda x, y: x + y,
+            "subtract": lambda x, y: x - y,
+            "multiply": lambda x, y: x * y,
+            "divide": lambda x, y: x / y if y != 0 else None,
+        }
+
+        if operation not in operations:
+            return {"error": f"Unknown operation: {operation}"}
+
+        result = operations[operation](a, b)
+        if result is None:
+            return {"error": "Cannot divide by zero"}
+
+        return {
+            "operation": operation,
+            "a": a,
+            "b": b,
+            "result": result
+        }
+
+    # Create and start the agent
+    await agent.create_user()
+
+    call = agent.edge.client.video.call("default", str(uuid4()))
+    await agent.edge.open_demo(call)
+
+    with await agent.join(call):
+        # Give the agent a moment to connect
+        await asyncio.sleep(5)
+
+        await agent.llm.simple_response(
+            text="What's the weather like in Boulder? Please use the get_weather function."
+        )
+
+        # Wait for AWS Nova to process the request and call the function
+        await asyncio.sleep(15)
+
+        await agent.finish()
+
+
+if __name__ == "__main__":
+    asyncio.run(start_agent())
+
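
Both example scripts pass a format string containing `%(call_id)s` to `logging.basicConfig`. With the standard library's formatter, a record that lacks a `call_id` attribute cannot be formatted, so every record has to be given one. The diff does not show whether the SDK does this itself. Assuming it does not, a small filter that supplies a default is one way to keep such a format usable; `DefaultCallIdFilter` and the logger name below are hypothetical and exist only for this sketch.

```python
import io
import logging


class DefaultCallIdFilter(logging.Filter):
    """Hypothetical helper: make sure every record has a call_id attribute."""

    def filter(self, record: logging.LogRecord) -> bool:
        if not hasattr(record, "call_id"):
            record.call_id = "-"  # placeholder when no call is active
        return True  # never drop the record


# Log to an in-memory stream so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s [call_id=%(call_id)s] %(name)s: %(message)s")
)
handler.addFilter(DefaultCallIdFilter())

demo_logger = logging.getLogger("call_id_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo isolated from the root logger

demo_logger.info("no call yet")
demo_logger.info("joined", extra={"call_id": "abc123"})

log_lines = stream.getvalue().splitlines()
```

Attaching the filter to the handler rather than to a single logger means records from any logger that reaches that handler get the default.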

plugins/aws/example/pyproject.toml

Lines changed: 6 additions & 6 deletions
@@ -1,22 +1,22 @@
 [project]
-name = "gemini-live-realtime-example"
+name = "aws-bedrock-realtime-example"
 version = "0.0.0"
-requires-python = ">=3.10"
+requires-python = ">=3.12"

 # put only what this example needs
 dependencies = [
     "python-dotenv>=1.0",
-    "vision-agents-plugins-gemini",
+    "vision-agents-plugins-aws",
     "vision-agents-plugins-getstream",
     "vision-agents",
-    "google-genai>=1.33.0",
+    "boto3>=1.26.0",
     "opentelemetry-exporter-otlp>=1.37.0",
     "opentelemetry-exporter-prometheus>=0.58b0",
     "prometheus-client>=0.23.1",
     "opentelemetry-sdk>=1.37.0",
 ]

 [tool.uv.sources]
-"vision-agents-plugins-getstream" = {path = "../../../plugins/getstream", editable=true}
-"vision-agents-plugins-gemini" = {path = "../../../plugins/gemini", editable=true}
+"vision-agents-plugins-getstream" = {path = "../../getstream", editable=true}
+"vision-agents-plugins-aws" = {path = "..", editable=true}
 "vision-agents" = {path = "../../../agents-core", editable=true}