Support computer use and sync with new typespec #36101
base: main
Pull Request Overview
This PR adds support for the Computer Use tool (preview) in the Azure AI Agents SDK and synchronizes with an updated TypeSpec specification. The changes introduce comprehensive computer automation capabilities, enabling agents to interact with computer interfaces through various actions like clicks, typing, and screenshots.
Key Changes:
- Added Computer Use tool functionality with support for various computer actions (click, type, scroll, etc.)
- Updated SDK version from 1.2.0-beta.2 to 1.2.0-beta.3
- Added comprehensive sample code demonstrating Computer Use tool usage
Reviewed Changes
Copilot reviewed 18 out of 24 changed files in this pull request and generated 5 comments.
File | Description |
---|---|
tsp-location.yaml | Updated TypeSpec commit reference for new specification |
src/utils/utils.ts | Added createComputerUseTool utility method |
src/models/models.ts | Added extensive computer use types, interfaces, and serialization logic |
src/models/index.ts | Exported new computer use types and interfaces |
src/index.ts | Exported computer use functionality at package level |
src/constants.ts | Updated SDK version to 1.2.0-beta.3 |
src/classic/runs/index.ts | Updated tool output parameter types |
src/api/runs/options.ts | Updated parameter types for structured tool outputs |
src/api/runs/operations.ts | Updated serialization function usage |
src/api/agentsContext.ts | Updated user agent version string |
samples/ | Added comprehensive Computer Use sample code in TypeScript and JavaScript |
review/ai-agents-node.api.md | Updated API surface with Computer Use types |
package.json | Updated package version |
CHANGELOG.md | Added changelog entry for new feature |
// @public
export interface ComputerUseAction {
    type: string;
We usually use `kind` for these types of discriminators in TS - is this autogenerated or hand authored?
it is generated
}

// @public
export type ComputerUseActionUnion = ClickAction | DoubleClickAction | DragAction | KeyPressAction | MoveAction | ScreenshotAction | ScrollAction | TypeAction | WaitAction | ComputerUseAction;
Does the base `ComputerUseAction` belong here? And if we have to add `type` to every object in the union, do we even need the base type? What value does it add?
I would say this main type should be `ComputerUseAction` (without the union) and the `ComputerUseAction` interface can be removed. If each member of the union has a `type` or `kind` discriminator, TypeScript can narrow the types down without any sort of inheritance hierarchy.
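A minimal sketch of the narrowing the reviewer describes, assuming simplified, hypothetical member shapes (the generated SDK types have more members and fields, and the names below are not taken from the shipped API). The union itself is the public type, there is no shared base interface, and the compiler narrows on the `type` literal:

```typescript
// Hypothetical, simplified action shapes for illustration only.
interface ClickAction {
  type: "click";
  x: number;
  y: number;
}

interface TypeAction {
  type: "type";
  text: string;
}

interface WaitAction {
  type: "wait";
}

// The union is the main type; no base interface is needed.
type ComputerUseAction = ClickAction | TypeAction | WaitAction;

function describeAction(action: ComputerUseAction): string {
  switch (action.type) {
    case "click":
      // Narrowed to ClickAction: x and y are available here.
      return `click at (${action.x}, ${action.y})`;
    case "type":
      // Narrowed to TypeAction: text is available here.
      return `type "${action.text}"`;
    case "wait":
      return "wait";
    default: {
      // Exhaustiveness check: this fails to compile if a member is unhandled.
      const unreachable: never = action;
      return unreachable;
    }
  }
}
```

One trade-off of this design: a closed union gives exhaustiveness checking, whereas keeping a base interface with `type: string` in the union lets unknown action types from the service pass through without a compile-time match.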
// @public
export interface DoubleClickAction extends ComputerUseAction {
    type: "double_click";
    x: number;
Why not `coordinates: CoordinatePoint`?
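For comparison, a sketch of the two shapes under discussion. It assumes `CoordinatePoint` is a plain `{ x, y }` pair and that `DoubleClickAction` also has a `y` field; the excerpt shows neither, so both are assumptions. The nested interface and the `toNested` helper are hypothetical and not part of the generated API:

```typescript
// Assumed shape: the thread names CoordinatePoint but does not show its definition.
interface CoordinatePoint {
  x: number;
  y: number;
}

// Flat shape, as in the generated API under review (y is assumed).
interface DoubleClickActionFlat {
  type: "double_click";
  x: number;
  y: number;
}

// Nested shape, as the reviewer proposes (hypothetical).
interface DoubleClickActionNested {
  type: "double_click";
  coordinates: CoordinatePoint;
}

// Hypothetical adapter showing that the two shapes carry the same information.
function toNested(action: DoubleClickActionFlat): DoubleClickActionNested {
  return { type: action.type, coordinates: { x: action.x, y: action.y } };
}
```

The nested form lets every pointer action share one point type, while the flat form mirrors the wire format that the generator emits.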
// @public
export interface MoveAction extends ComputerUseAction {
    type: "move";
    x: number;
Why not use `coordinates: CoordinatePoint`?
Chatted offline: the public API is all auto-generated, so I'm going to go ahead and approve, then check in with codegen to see if there's something we can do upstream.
Packages impacted by this PR
Issues associated with this PR
Describe the problem that is addressed by this PR
What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?
Are there test cases added in this PR? (If not, why?)
Provide a list of related PRs (if any)
Command used to generate this PR: (Applicable only to SDK release request PRs)
Checklists