feat(ai): Add prefer_in_cloud option for inference mode #9236
base: main
Conversation
This change introduces a new InferenceMode option, prefer_in_cloud. When this mode is selected, the SDK will attempt to use the cloud backend first. If the cloud call fails with a network-related error, it will fall back to the on-device model if available.
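For illustration, here is a minimal sketch of that fallback behavior; the names (`generateWithPreferInCloud`, `isNetworkError`, the callback shapes) are hypothetical and not the SDK's actual internals:

```ts
// Hypothetical sketch, not the SDK's internal implementation: try the cloud
// backend first, and fall back to the on-device model only when the failure
// is network-related and an on-device model is available.
async function generateWithPreferInCloud<T>(
  callCloud: () => Promise<T>,
  callOnDevice: (() => Promise<T>) | undefined,
  isNetworkError: (e: unknown) => boolean
): Promise<T> {
  try {
    return await callCloud();
  } catch (e) {
    if (callOnDevice && isNetworkError(e)) {
      return callOnDevice();
    }
    // Non-network errors (or no available on-device model) are surfaced as-is.
    throw e;
  }
}
```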
🦋 Changeset detected. Latest commit: d4e843e. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
Size Report 1: Affected Products | Test Logs
Size Analysis Report 1: Affected Products | Test Logs
This commit adds a new test suite to verify that the GenerativeModel's methods correctly dispatch requests to the on-device or cloud backend based on the selected InferenceMode. It covers generateContent, generateContentStream, and countTokens.
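As an illustration of that kind of dispatch test, here is a small chai-style sketch against the hypothetical helper from the earlier snippet (not the PR's actual test code):

```ts
import { expect } from 'chai';

describe('prefer_in_cloud dispatch (sketch)', () => {
  // For this sketch, treat any error mentioning 'fetch' as network-related.
  const isNetworkError = (e: unknown) =>
    e instanceof Error && e.message.includes('fetch');

  it('uses the cloud backend when it succeeds', async () => {
    const result = await generateWithPreferInCloud(
      async () => 'cloud',
      async () => 'on-device',
      isNetworkError
    );
    expect(result).to.equal('cloud');
  });

  it('falls back to on-device on a network error', async () => {
    const result = await generateWithPreferInCloud(
      async () => {
        throw new Error('fetch failed');
      },
      async () => 'on-device',
      isNetworkError
    );
    expect(result).to.equal('on-device');
  });
});
```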
```diff
@@ -352,7 +352,8 @@ export type ResponseModality =
 export const InferenceMode = {
   'PREFER_ON_DEVICE': 'prefer_on_device',
   'ONLY_ON_DEVICE': 'only_on_device',
-  'ONLY_IN_CLOUD': 'only_in_cloud'
+  'ONLY_IN_CLOUD': 'only_in_cloud',
+  'PREFER_IN_CLOUD': 'prefer_in_cloud'
 } as const;
```
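For context, a hedged usage sketch of selecting the new mode when creating a model. The option names (`mode`, `inCloudParams`) and the model name are assumptions based on the SDK's existing hybrid-inference params and may not match the final API:

```ts
import { initializeApp } from 'firebase/app';
import { getAI, getGenerativeModel, InferenceMode } from 'firebase/ai';

const app = initializeApp({ /* your Firebase config */ });
const ai = getAI(app);

// Prefer the cloud backend, falling back to the on-device model when the
// cloud call fails with a network-related error.
const model = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_IN_CLOUD,
  inCloudParams: { model: 'gemini-2.0-flash' } // illustrative model name
});
```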
Do you think we should add reference docs explaining what 'prefer' means here?
Probably; I'll do that in the next push, after committing the suggestions.
So it turns out the problem with turning enums into a combination of const variables and types is that any comment you put on the const will not make it into the documentation. Instead you have to put all the documentation on top of the string literal type.
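A minimal illustration of that point (not the SDK's actual source): with the const-object plus type-alias pattern, the reference docs have to live on the type alias, because comments on the const are not picked up by the docs tooling:

```ts
// Comments placed here on the const do NOT surface in generated reference docs.
export const InferenceMode = {
  'PREFER_ON_DEVICE': 'prefer_on_device',
  'ONLY_ON_DEVICE': 'only_on_device',
  'ONLY_IN_CLOUD': 'only_in_cloud',
  'PREFER_IN_CLOUD': 'prefer_in_cloud'
} as const;

/**
 * Determines whether inference happens on-device or in-cloud.
 * `prefer_in_cloud` tries the cloud backend first and falls back to the
 * on-device model on network-related failures.
 */
export type InferenceMode =
  (typeof InferenceMode)[keyof typeof InferenceMode];
```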
Co-authored-by: Daniel La Rocque <[email protected]>
…rebase-js-sdk into feat/prefer-in-cloud
Add prefer_in_cloud option - opposite of prefer_on_device.
Note: Used gemini-cli
Had it add the tests to generative-model.test.ts, rather than to generateContent/generateContentStream/countTokens individually, covering all 4 InferenceMode cases, because I think the more of the pipeline a test exercises, the better: a lot of errors come from omitting or incorrectly passing params through multiple functions.

Had it extract the on-device vs. cloud dispatch logic into a helper function in helpers.ts, even though it is only used in generate-content.ts for now, because we'll need to reuse the same logic for countTokens when we implement that (the Chrome API should be ready now).
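A rough sketch of what such a helper's shape could look like, generalizing the earlier snippet to all four modes; the names and error handling are illustrative, not necessarily what landed in helpers.ts:

```ts
type InferenceMode =
  | 'prefer_on_device'
  | 'only_on_device'
  | 'only_in_cloud'
  | 'prefer_in_cloud';

// Decide which backend handles a request, so generateContent,
// generateContentStream, and later countTokens can share the same logic.
async function callCloudOrDevice<T>(
  mode: InferenceMode,
  onDeviceAvailable: boolean,
  callOnDevice: () => Promise<T>,
  callCloud: () => Promise<T>,
  isNetworkError: (e: unknown) => boolean
): Promise<T> {
  switch (mode) {
    case 'only_on_device':
      if (!onDeviceAvailable) {
        throw new Error('On-device model is not available');
      }
      return callOnDevice();
    case 'only_in_cloud':
      return callCloud();
    case 'prefer_on_device':
      return onDeviceAvailable ? callOnDevice() : callCloud();
    case 'prefer_in_cloud':
      try {
        return await callCloud();
      } catch (e) {
        if (onDeviceAvailable && isNetworkError(e)) {
          return callOnDevice();
        }
        throw e;
      }
  }
}
```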