ADK Live API - Agent Engine Bidi in Agent Engine (#371)
* **Refactors Real-Time Agent:** Renamed the `live_api` agent to `adk_live` and migrated its core logic to use the Agent Development Kit.
* **Enables Bidirectional Streaming:** Integrates with Agent Engine and the Gemini Live API for real-time communication.
* **Updates Frontend & CI/CD:** Aligned the React frontend, build configurations, and CI/CD pipelines to support the new ADK-based agent in Agent Engine, adding new tests and deployment targets.
- Example command for testing the starter pack creation - from the root of the repo run: `uv run agent-starter-pack create myagent-$(date +%s) --output-dir target`
### Common Pitfalls
- **Hardcoded URLs**: Use relative paths for frontend connections
- **Missing Conditionals**: Wrap agent-specific code in proper `{% if %}` blocks
-
- **Dependency Conflicts**: Some agents lack certain extras (e.g., live_api + lint)
+
- **Dependency Conflicts**: Some agents lack certain extras (e.g., adk_live + lint)
README.md (1 addition, 1 deletion)
@@ -72,7 +72,7 @@ See [Installation Guide](https://googlecloudplatform.github.io/agent-starter-pac
|`agentic_rag`| A RAG agent for document retrieval and Q&A. Supporting [Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/enterprise-search-introduction) and [Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview). |
|`langgraph_base_react`| An agent implementing a base ReAct agent using LangGraph |
|`crewai_coding_crew`| A multi-agent system implemented with CrewAI created to support coding activities |
-
|`live_api`| A real-time multimodal RAG agent powered by Gemini, supporting audio/video/text chat with vector DB-backed responses |
+
|`adk_live`| A real-time multimodal RAG agent powered by Gemini, supporting audio/video/text chat with vector DB-backed responses |
**More agents are on the way!** We are continuously expanding our [agent library](https://googlecloudplatform.github.io/agent-starter-pack/agents/overview). Have a specific agent type in mind? [Raise an issue as a feature request!](https://github.com/GoogleCloudPlatform/agent-starter-pack/issues/new?labels=enhancement)
Real-time conversational agent built with Google ADK and Gemini's live audio model. Supports audio, video, and text interactions with native tool calling.
- **Python Backend** (in `app/` folder): ADK-powered agent using Gemini's live audio model with native tool calling and deployment support for Cloud Run and Agent Engine
- **React Frontend** (in `frontend/` folder): Web console for interacting with the live agent via audio, video, and text

Once running, click the play button to connect and interact with the agent. Try asking "What's the weather like in San Francisco?" to see tool calling in action.
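
For orientation, a tool of the kind this prompt exercises can be as small as a plain Python function. The sketch below is illustrative only: the name `get_weather`, its canned data, and the shape of the returned dictionary are assumptions made for this example, not the code shipped in `app/`.

```python
# Hypothetical weather tool: the name, data, and return shape are assumptions.
# It performs no network call and only knows the locations listed below.
_CANNED_FORECASTS = {
    "san francisco": "60 degrees F and foggy",
}


def get_weather(location: str) -> dict:
    """Return a canned weather report for `location`, or an error entry."""
    key = location.strip().lower()
    forecast = _CANNED_FORECASTS.get(key)
    if forecast is None:
        # Unknown location: report the failure instead of guessing.
        return {"status": "error", "message": f"No weather data for {location!r}."}
    return {"status": "success", "report": f"The weather in {location} is {forecast}."}


if __name__ == "__main__":
    print(get_weather("San Francisco"))
```

How a function like this is registered with the agent depends on the ADK version in use; the `app/` folder holds the actual wiring.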
## Additional Resources for Multimodal Live API
Explore these resources to learn more about the Multimodal Live API and see examples of its usage:
- [Project Pastra](https://github.com/heiko-hotz/gemini-multimodal-live-dev-guide/tree/main): a comprehensive developer guide for the Gemini Multimodal Live API.
- [Google Cloud Multimodal Live API demos and samples](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/multimodal-live-api): Collection of code samples and demo applications leveraging multimodal live API in Vertex AI
- [Gemini 2 Cookbook](https://github.com/google-gemini/cookbook/tree/main/gemini-2): Practical examples and tutorials for working with Gemini 2
- [Multimodal Live API Web Console](https://github.com/google-gemini/multimodal-live-api-web-console): Interactive React-based web interface for testing and experimenting with Gemini Multimodal Live API.
## Current Status & Future Work
This pattern is under active development. Key areas planned for future enhancement include:
* **Observability:** Implementing comprehensive monitoring and tracing features.