|
226 | 226 | S3_REGION="us-east-1" |
227 | 227 | WEB_UI_GATEWAY_DATABASE_URL="sqlite:////Users/tamimi/sam-bootcamp/data/webui_gateway.db" |
228 | 228 | </code></pre> |
| 229 | +<aside class="special"><p>Note that the model identifier the endpoint expects is in the format <code>provider/model</code> (e.g., <code>openai/gpt-4</code>, <code>anthropic/claude-3-opus-20240229</code>).</p>
| 230 | +</aside> |
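| | +<p>For example, a model entry in your env file might look like the following (the variable name here is illustrative; use whichever model variable your setup defines):</p>
| | +<pre><code>LLM_SERVICE_MODEL_NAME="openai/gpt-4"
| | +</code></pre>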
229 | 231 | </li> |
230 | 232 | <li>Run docker compose<pre><code>docker compose up |
231 | 233 | </code></pre> |
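| | +<p>Optionally, run the stack in the background with Docker Compose&#39;s standard detached flag:</p>
| | +<pre><code>docker compose up -d
| | +</code></pre>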
|
258 | 260 |
|
259 | 261 | <google-codelab-step label="Adding agents with built-in tools" duration="10"> |
260 | 262 | <p>Now that you have a Docker container running the enterprise edition of SAM, let's go ahead and add the configuration files.</p>
| 263 | +<h2 is-upgraded>Basic Agents</h2> |
261 | 264 | <ol type="1"> |
262 | 265 | <li>From a new terminal window, navigate to your configs directory<pre><code>cd sam-bootcamp/configs/agents |
263 | 266 | </code></pre> |
|
353 | 356 | </code></pre> |
354 | 357 | </li> |
355 | 358 | </ul> |
| 359 | +<h2 is-upgraded>Adding a multimodal agent</h2>
| 360 | +<p>Let's add another agent, this time a multimodal one. This agent is capable of generating and processing content in both audio and visual formats. It includes features like text-to-speech with tone-based voice selection, multi-speaker conversations, audio transcription, image generation, and image analysis, and it provides detailed guidelines for using these features effectively.</p>
| 361 | +<p>In the <code>configs/agents</code> directory:</p>
| 362 | +<ol type="1"> |
| 363 | +<li>Add an agent file that leverages the built-in multimodal tools<pre><code>curl https://raw.githubusercontent.com/SolaceLabs/solace-agent-mesh/refs/heads/main/examples/agents/a2a_multimodal_example.yaml -o multimodal.yaml
| 364 | +</code></pre> |
| 365 | +</li> |
| 366 | +<li>Add the following shared model definitions, which are referenced elsewhere in the agent config through YAML aliases (see the sketch after this list)<pre><code>image_describe: &image_description_model
| 367 | + # This dictionary structure tells ADK to use the LiteLlm wrapper. |
| 368 | + # 'model' uses the specific model identifier your endpoint expects. |
| 369 | + model: ${IMAGE_DESCRIPTION_MODEL_NAME} # Use env var for model name |
| 370 | + # 'api_base' tells LiteLLM where to send the request. |
| 371 | + api_base: ${IMAGE_SERVICE_ENDPOINT} # Use env var for endpoint URL |
| 372 | + # 'api_key' provides authentication. |
| 373 | + api_key: ${IMAGE_SERVICE_API_KEY} # Use env var for API key |
| 374 | + |
| 375 | +audio_transcription: &audio_transcription_model |
| 376 | + # This dictionary structure tells ADK to use the LiteLlm wrapper. |
| 377 | + # 'model' uses the specific model identifier your endpoint expects. |
| 378 | + model: ${AUDIO_TRANSCRIPTION_MODEL_NAME} # Use env var for model name |
| 379 | + # 'api_base' tells LiteLLM where to send the request. |
| 380 | + api_base: ${AUDIO_TRANSCRIPTION_API_BASE} # Use env var for endpoint URL |
| 381 | + # 'api_key' provides authentication. |
| 382 | + api_key: ${AUDIO_TRANSCRIPTION_API_KEY} # Use env var for API key |
| 383 | +</code></pre> |
| 384 | +</li> |
| 385 | +<li>Update your env file with the following environment variable<pre><code>GEMINI_API_KEY=&lt;token&gt;
| 386 | +</code></pre> |
| 387 | +<aside class="special"><p>Ask your instructor for a Gemini API key or generate one from <a href="https://aistudio.google.com" target="_blank">aistudio.google.com</a>.</p>
| 388 | +</aside> |
| 389 | +</li> |
| 390 | +<li>Restart the enterprise container so it picks up the new configuration (a quick verification sketch follows this list)<pre><code>docker restart sam-ent
| 391 | +</code></pre> |
| 392 | +</li> |
| 393 | +</ol> |
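| | +<p>For reference, <code>&image_description_model</code> and <code>&audio_transcription_model</code> above are YAML anchors: an alias (<code>*name</code>) elsewhere in the config expands to the entire mapping captured by the anchor, so each model is defined once. A minimal sketch of the pattern, with a hypothetical <code>tool_config</code> block (check the downloaded <code>multimodal.yaml</code> for the actual keys):</p>
| | +<pre><code># Hypothetical consumer of the shared definitions above; the alias
| | +# expands to the full mapping (model, api_base, api_key).
| | +tool_config:
| | +  model: *image_description_model
| | +</code></pre>
| | +<p>After the restart, you can confirm the container came back up cleanly by tailing its logs:</p>
| | +<pre><code>docker logs -f sam-ent
| | +</code></pre>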
356 | 394 |
|
357 | 395 |
|
358 | 396 | </google-codelab-step> |
|