Commit 5ac3d6e

author
AWS
committed
Agents for Amazon Bedrock Runtime Update: Add support for computer use tools
1 parent 65b048b commit 5ac3d6e

2 files changed, +86 -8 lines changed

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+{
+    "type": "feature",
+    "category": "Agents for Amazon Bedrock Runtime",
+    "contributor": "",
+    "description": "Add support for computer use tools"
+}

services/bedrockagentruntime/src/main/resources/codegen-resources/service-2.json

Lines changed: 80 additions & 8 deletions
@@ -52,7 +52,7 @@
5252
{"shape":"AccessDeniedException"},
5353
{"shape":"ServiceQuotaExceededException"}
5454
],
55-
"documentation":"<p>Creates a session to temporarily store conversations for generative AI (GenAI) applications built with open-source frameworks such as LangGraph and LlamaIndex. Sessions enable you to save the state of conversations at checkpoints, with the added security and infrastructure of Amazon Web Services. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/sessions.html\">Store and retrieve conversation history and context with Amazon Bedrock sessions</a>.</p> <p>By default, Amazon Bedrock uses Amazon Web Services-managed keys for session encryption, including session metadata, or you can use your own KMS key. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/session-encryption.html\">Amazon Bedrock session encryption</a>.</p> <note> <p> You use a session to store state and conversation history for generative AI applications built with open-source frameworks. For Amazon Bedrock Agents, the service automatically manages conversation context and associates them with the agent-specific sessionId you specify in the <a href=\"https://docs.aws.amazon.com/bedrock/latest/API_agent-runtime_InvokeAgent.html\">InvokeAgent</a> API operation. </p> </note> <p>Related APIs:</p> <ul> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_ListSessions.html\">ListSessions</a> </p> </li> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GetSession.html\">GetSession</a> </p> </li> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_EndSession.html\">EndSession</a> </p> </li> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_DeleteSession.html\">DeleteSession</a> </p> </li> </ul>",
55+
"documentation":"<p>Creates a session to temporarily store conversations for generative AI (GenAI) applications built with open-source frameworks such as LangGraph and LlamaIndex. Sessions enable you to save the state of conversations at checkpoints, with the added security and infrastructure of Amazon Web Services. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/sessions.html\">Store and retrieve conversation history and context with Amazon Bedrock sessions</a>.</p> <p>By default, Amazon Bedrock uses Amazon Web Services-managed keys for session encryption, including session metadata, or you can use your own KMS key. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/session-encryption.html\">Amazon Bedrock session encryption</a>.</p> <note> <p> You use a session to store state and conversation history for generative AI applications built with open-source frameworks. For Amazon Bedrock Agents, the service automatically manages conversation context and associates them with the agent-specific sessionId you specify in the <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_InvokeAgent.html\">InvokeAgent</a> API operation. </p> </note> <p>Related APIs:</p> <ul> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_ListSessions.html\">ListSessions</a> </p> </li> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GetSession.html\">GetSession</a> </p> </li> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_EndSession.html\">EndSession</a> </p> </li> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_DeleteSession.html\">DeleteSession</a> </p> </li> </ul>",
5656
"idempotent":true
5757
},
5858
"DeleteAgentMemory":{
@@ -632,9 +632,27 @@
       "type":"string",
       "enum":[
         "AMAZON.UserInput",
-        "AMAZON.CodeInterpreter"
+        "AMAZON.CodeInterpreter",
+        "ANTHROPIC.Computer",
+        "ANTHROPIC.Bash",
+        "ANTHROPIC.TextEditor"
       ]
     },
+    "ActionGroupSignatureParams":{
+      "type":"map",
+      "key":{"shape":"ActionGroupSignatureParamsKeyString"},
+      "value":{"shape":"ActionGroupSignatureParamsValueString"}
+    },
+    "ActionGroupSignatureParamsKeyString":{
+      "type":"string",
+      "max":100,
+      "min":0
+    },
+    "ActionGroupSignatureParamsValueString":{
+      "type":"string",
+      "max":100,
+      "min":0
+    },
     "ActionInvocationType":{
       "type":"string",
       "enum":[
@@ -685,7 +703,11 @@
685703
},
686704
"parentActionGroupSignature":{
687705
"shape":"ActionGroupSignature",
688-
"documentation":"<p> To allow your agent to request the user for additional information when trying to complete a task, set this field to <code>AMAZON.UserInput</code>. You must leave the <code>description</code>, <code>apiSchema</code>, and <code>actionGroupExecutor</code> fields blank for this action group. </p> <p>To allow your agent to generate, run, and troubleshoot code when trying to complete a task, set this field to <code>AMAZON.CodeInterpreter</code>. You must leave the <code>description</code>, <code>apiSchema</code>, and <code>actionGroupExecutor</code> fields blank for this action group.</p> <p>During orchestration, if your agent determines that it needs to invoke an API in an action group, but doesn't have enough information to complete the API request, it will invoke this action group instead and return an <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Observation.html\">Observation</a> reprompting the user for more information.</p>"
706+
"documentation":"<p>Specify a built-in or computer use action for this action group. If you specify a value, you must leave the <code>description</code>, <code>apiSchema</code>, and <code>actionGroupExecutor</code> fields empty for this action group. </p> <ul> <li> <p>To allow your agent to request the user for additional information when trying to complete a task, set this field to <code>AMAZON.UserInput</code>. </p> </li> <li> <p>To allow your agent to generate, run, and troubleshoot code when trying to complete a task, set this field to <code>AMAZON.CodeInterpreter</code>.</p> </li> <li> <p>To allow your agent to use an Anthropic computer use tool, specify one of the following values. </p> <important> <p> Computer use is a new Anthropic Claude model capability (in beta) available with Anthropic Claude 3.7 Sonnet and Claude 3.5 Sonnet v2 only. When operating computer use functionality, we recommend taking additional security precautions, such as executing computer actions in virtual environments with restricted data access and limited internet connectivity. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agent-computer-use.html\">Configure an Amazon Bedrock Agent to complete tasks with computer use tools</a>. </p> </important> <ul> <li> <p> <code>ANTHROPIC.Computer</code> - Gives the agent permission to use the mouse and keyboard and take screenshots.</p> </li> <li> <p> <code>ANTHROPIC.TextEditor</code> - Gives the agent permission to view, create and edit files.</p> </li> <li> <p> <code>ANTHROPIC.Bash</code> - Gives the agent permission to run commands in a bash shell.</p> </li> </ul> </li> </ul>"
707+
},
708+
"parentActionGroupSignatureParams":{
709+
"shape":"ActionGroupSignatureParams",
710+
"documentation":"<p> The configuration settings for a computer use action. </p> <important> <p>Computer use is a new Anthropic Claude model capability (in beta) available with Claude 3.7 Sonnet and Claude 3.5 Sonnet v2 only. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agent-computer-use.html\">Configure an Amazon Bedrock Agent to complete tasks with computer use tools</a>.</p> </important>"
689711
}
690712
},
691713
"documentation":"<p> Contains details of the inline agent's action group. </p>"
@@ -1332,6 +1354,10 @@
13321354
"body":{
13331355
"shape":"String",
13341356
"documentation":"<p>The body of the API response.</p>"
1357+
},
1358+
"images":{
1359+
"shape":"ImageInputs",
1360+
"documentation":"<p>Lists details, including format and source, for the image in the response from the function call. You can specify only one image and the function in the <code>returnControlInvocationResults</code> must be a computer use action. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agent-computer-use.html\">Configure an Amazon Bedrock Agent to complete tasks with computer use tools</a>. </p>"
13351361
}
13361362
},
13371363
"documentation":"<p>Contains the body of the API response.</p> <p>This data type is used in the following API operations:</p> <ul> <li> <p>In the <code>returnControlInvocationResults</code> field of the <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_InvokeAgent.html#API_agent-runtime_InvokeAgent_RequestSyntax\">InvokeAgent request</a> </p> </li> </ul>"
@@ -2350,7 +2376,7 @@
23502376
},
23512377
"responseBody":{
23522378
"shape":"ResponseBody",
2353-
"documentation":"<p>The response from the function call using the parameters. The key of the object is the content type (currently, only <code>TEXT</code> is supported). The response may be returned directly or from the Lambda function.</p>"
2379+
"documentation":"<p>The response from the function call using the parameters. The response might be returned directly or from the Lambda function. Specify <code>TEXT</code> or <code>IMAGES</code>. The key of the object is the content type. You can only specify one type. If you specify <code>IMAGES</code>, you can specify only one image. You can specify images only when the function in the <code>returnControlInvocationResults</code> is a computer use action. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agent-computer-use.html\">Configure an Amazon Bedrock Agent to complete tasks with computer use tools</a>.</p>"
23542380
},
23552381
"responseState":{
23562382
"shape":"ResponseState",
@@ -3055,6 +3081,52 @@
         "webp"
       ]
     },
+    "ImageInput":{
+      "type":"structure",
+      "required":[
+        "format",
+        "source"
+      ],
+      "members":{
+        "format":{
+          "shape":"ImageInputFormat",
+          "documentation":"<p>The type of image in the result.</p>"
+        },
+        "source":{
+          "shape":"ImageInputSource",
+          "documentation":"<p>The source of the image in the result.</p>"
+        }
+      },
+      "documentation":"<p>Details about an image in the result from a function in the action group invocation. You can specify images only when the function is a computer use action. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agent-computer-use.html\">Configure an Amazon Bedrock Agent to complete tasks with computer use tools</a>.</p>"
+    },
+    "ImageInputFormat":{
+      "type":"string",
+      "enum":[
+        "png",
+        "jpeg",
+        "gif",
+        "webp"
+      ]
+    },
+    "ImageInputSource":{
+      "type":"structure",
+      "members":{
+        "bytes":{
+          "shape":"ImageInputSourceBytesBlob",
+          "documentation":"<p> The raw image bytes for the image. If you use an Amazon Web Services SDK, you don't need to encode the image bytes in base64.</p>"
+        }
+      },
+      "documentation":"<p>Details about the source of an input image in the result from a function in the action group invocation.</p>",
+      "union":true
+    },
+    "ImageInputSourceBytesBlob":{
+      "type":"blob",
+      "min":1
+    },
+    "ImageInputs":{
+      "type":"list",
+      "member":{"shape":"ImageInput"}
+    },
     "ImageSource":{
       "type":"structure",
       "members":{
@@ -5558,7 +5630,7 @@
55585630
},
55595631
"notEquals":{
55605632
"shape":"FilterAttribute",
5561-
"documentation":"<p>Knowledge base data sources that contain a metadata attribute whose name matches the <code>key</code> and whose value doesn't match the <code>value</code> in this object are returned.</p> <p>The following example would return data sources that don't contain an <code>animal</code> attribute whose value is <code>cat</code>.</p> <p> <code>\"notEquals\": { \"key\": \"animal\", \"value\": \"cat\" }</code> </p>"
5633+
"documentation":"<p>Knowledge base data sources are returned when:</p> <ul> <li> <p>It contains a metadata attribute whose name matches the <code>key</code> and whose value doesn't match the <code>value</code> in this object.</p> </li> <li> <p>The key is not present in the document.</p> </li> </ul> <p>The following example would return data sources that don't contain an <code>animal</code> attribute whose value is <code>cat</code>.</p> <p> <code>\"notEquals\": { \"key\": \"animal\", \"value\": \"cat\" }</code> </p>"
55625634
},
55635635
"notIn":{
55645636
"shape":"FilterAttribute",
@@ -5821,7 +5893,7 @@
58215893
},
58225894
"type":{
58235895
"shape":"RetrieveAndGenerateType",
5824-
"documentation":"<p>The type of resource that contains your data for retrieving information and generating responses.</p> <p>If you choose to use <code>EXTERNAL_SOURCES</code>, then currently only Anthropic Claude 3 Sonnet models for knowledge bases are supported.</p>"
5896+
"documentation":"<p>The type of resource that contains your data for retrieving information and generating responses.</p> <note> <p>If you choose to use <code>EXTERNAL_SOURCES</code>, then currently only Anthropic Claude 3 Sonnet models for knowledge bases are supported.</p> </note>"
58255897
}
58265898
},
58275899
"documentation":"<p>Contains details about the resource being queried.</p> <p>This data type is used in the following API operations:</p> <ul> <li> <p> <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html#API_agent-runtime_RetrieveAndGenerate_RequestSyntax\">RetrieveAndGenerate request</a> – in the <code>retrieveAndGenerateConfiguration</code> field</p> </li> </ul>"
@@ -6321,15 +6393,15 @@
63216393
},
63226394
"promptSessionAttributes":{
63236395
"shape":"PromptSessionAttributesMap",
6324-
"documentation":"<p>Contains attributes that persist across a prompt and the values of those attributes. These attributes replace the $prompt_session_attributes$ placeholder variable in the orchestration prompt template. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-placeholders.html\">Prompt template placeholder variables</a>.</p>"
6396+
"documentation":"<p>Contains attributes that persist across a prompt and the values of those attributes. </p> <ul> <li> <p>In orchestration prompt template, these attributes replace the $prompt_session_attributes$ placeholder variable. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-placeholders.html\">Prompt template placeholder variables</a>.</p> </li> <li> <p>In <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agents-multi-agent-collaboration.html\">multi-agent collaboration</a>, the <code>promptSessionAttributes</code> will only be used by supervisor agent when $prompt_session_attributes$ is present in prompt template. </p> </li> </ul>"
63256397
},
63266398
"returnControlInvocationResults":{
63276399
"shape":"ReturnControlInvocationResults",
63286400
"documentation":"<p>Contains information about the results from the action group invocation. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agents-returncontrol.html\">Return control to the agent developer</a> and <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agents-session-state.html\">Control session context</a>.</p> <note> <p>If you include this field, the <code>inputText</code> field will be ignored.</p> </note>"
63296401
},
63306402
"sessionAttributes":{
63316403
"shape":"SessionAttributesMap",
6332-
"documentation":"<p>Contains attributes that persist across a session and the values of those attributes.</p>"
6404+
"documentation":"<p>Contains attributes that persist across a session and the values of those attributes. If <code>sessionAttributes</code> are passed to a supervisor agent in <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agents-multi-agent-collaboration.html\">multi-agent collaboration</a>, it will be forwarded to all agent collaborators.</p>"
63336405
}
63346406
},
63356407
"documentation":"<p>Contains parameters that specify various attributes that persist across a session or prompt. You can define session state attributes as key-value pairs when writing a <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html\">Lambda function</a> for an action group or pass them when making an <a href=\"https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_InvokeAgent.html\">InvokeAgent</a> request. Use session state attributes to control and provide conversational context for your agent and to help customize your agent's behavior. For more information, see <a href=\"https://docs.aws.amazon.com/bedrock/latest/userguide/agents-session-state.html\">Control session context</a>.</p>"
