hsubox76
Contributor

@hsubox76 hsubox76 commented Sep 2, 2025

Add prefer_in_cloud option - opposite of prefer_on_device.

Note: Used gemini-cli

Had it add tests for all 4 InferenceMode cases to generative-model.test.ts rather than to the individual generateContent/generateContentStream/countTokens tests, because I think the more of the pipeline a test covers, the better. A lot of errors come from omitting params or passing them incorrectly through multiple functions.

Had it extract the on-device vs. cloud logic into a helper function in helpers.ts even though it is only used in generate-content.ts for now, because we'll need to reuse the same logic for countTokens when we implement that (the Chrome API should be ready now).

This change introduces a new InferenceMode option, prefer_in_cloud. When this mode is selected, the SDK will attempt to use the cloud backend first. If the cloud call fails with a network-related error, it will fall back to the on-device model if available.
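
The cloud-first fallback described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's helper: `preferInCloud`, `isNetworkError`, the `Call` type, and the `'NetworkError'` name check are all hypothetical, and the real set of errors that trigger a fallback is not shown in this thread.

```typescript
// Hypothetical sketch: none of these names or signatures come from the SDK.
type Call<T> = () => Promise<T>;

// Assumed stand-in for whatever counts as a "network-related" failure.
const isNetworkError = (e: unknown): boolean =>
  e instanceof Error && e.name === 'NetworkError';

async function preferInCloud<T>(
  inCloudCall: Call<T>,
  onDeviceCall: Call<T>,
  isOnDeviceAvailable: () => Promise<boolean>
): Promise<T> {
  try {
    // The cloud backend is attempted first in this mode.
    return await inCloudCall();
  } catch (e) {
    // Fall back only for network-related failures, and only when an
    // on-device model is available. Everything else is rethrown.
    if (isNetworkError(e) && (await isOnDeviceAvailable())) {
      return onDeviceCall();
    }
    throw e;
  }
}
```

Under these assumptions, a non-network failure (for example a rejected request) still surfaces to the caller instead of being masked by an on-device response.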

changeset-bot bot commented Sep 2, 2025

🦋 Changeset detected

Latest commit: d4e843e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@firebase/ai Minor
firebase Minor


Contributor

google-oss-bot commented Sep 2, 2025

Size Report 1

Affected Products

  • @firebase/ai

    Type     Base (a4848b4)  Merge (d5d4fa2)  Diff
    browser  61.3 kB         62.2 kB          +921 B (+1.5%)
    main     64.8 kB         65.7 kB          +921 B (+1.4%)
    module   61.3 kB         62.2 kB          +921 B (+1.5%)
  • firebase

    Type            Base (a4848b4)  Merge (d5d4fa2)  Diff
    firebase-ai.js  48.4 kB         49.1 kB          +674 B (+1.4%)

Test Logs

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/0YhuvBOJfa.html

Contributor

google-oss-bot commented Sep 2, 2025

Size Analysis Report 1

Affected Products

  • @firebase/ai

    • AIError

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.54 kB         6.57 kB          +34 B (+0.5%)
      size-with-ext-deps  24.1 kB         24.1 kB          +34 B (+0.1%)
    • AIErrorCode

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.54 kB         6.62 kB          +76 B (+1.2%)
      size-with-ext-deps  24.1 kB         24.2 kB          +79 B (+0.3%)
    • AIModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.20 kB         8.24 kB          +34 B (+0.4%)
      size-with-ext-deps  25.8 kB         25.9 kB          +34 B (+0.1%)
    • AnyOfSchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.86 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • ArraySchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.86 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • Backend

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.54 kB         6.57 kB          +34 B (+0.5%)
      size-with-ext-deps  24.1 kB         24.1 kB          +34 B (+0.1%)
    • BackendType

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.54 kB         6.58 kB          +34 B (+0.5%)
      size-with-ext-deps  24.1 kB         24.1 kB          +34 B (+0.1%)
    • BlockReason

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.65 kB         6.68 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.3 kB          +34 B (+0.1%)
    • BooleanSchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.87 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • ChatSession

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                21.0 kB         21.6 kB          +551 B (+2.6%)
      size-with-ext-deps  38.7 kB         39.3 kB          +563 B (+1.5%)

      Dependency

      Type  Base (a4848b4)  Merge (d5d4fa2)  Diff

      functions

      Base (a4848b4): 33 dependencies

      addHelpers
      aggregateResponses
      assignRoleToPartsAndValidateSendMessageRequest
      chromeAdapterFactory
      constructRequest
      createEnhancedContentResponse
      decodeInstanceIdentifier
      factory
      formatBlockErrorMessage
      formatNewContent
      generateContent
      generateContentOnCloud
      generateContentStream
      generateContentStreamOnCloud
      generateResponseSequence
      getClientHeaders
      getFunctionCalls
      getHeaders
      getInlineDataParts
      getResponsePromise
      getResponseStream
      getText
      hadBadFinishReason
      hasValidCandidates
      makeRequest
      mapGenerateContentCandidates
      mapGenerateContentRequest
      mapGenerateContentResponse
      mapPromptFeedback
      processGenerateContentResponse
      processStream
      registerAI
      validateChatHistory

      Merge (d5d4fa2): 34 dependencies

      addHelpers
      aggregateResponses
      assignRoleToPartsAndValidateSendMessageRequest
      callCloudOrDevice
      chromeAdapterFactory
      constructRequest
      createEnhancedContentResponse
      decodeInstanceIdentifier
      factory
      formatBlockErrorMessage
      formatNewContent
      generateContent
      generateContentOnCloud
      generateContentStream
      generateContentStreamOnCloud
      generateResponseSequence
      getClientHeaders
      getFunctionCalls
      getHeaders
      getInlineDataParts
      getResponsePromise
      getResponseStream
      getText
      hadBadFinishReason
      hasValidCandidates
      makeRequest
      mapGenerateContentCandidates
      mapGenerateContentRequest
      mapGenerateContentResponse
      mapPromptFeedback
      processGenerateContentResponse
      processStream
      registerAI
      validateChatHistory

      Diff: + callCloudOrDevice

      variables

      Base (a4848b4): 24 dependencies

      AIErrorCode
      AI_TYPE
      Availability
      BackendType
      DEFAULT_API_VERSION
      DEFAULT_DOMAIN
      DEFAULT_FETCH_TIMEOUT_MS
      DEFAULT_LOCATION
      FinishReason
      HarmSeverity
      InferenceMode
      LANGUAGE_TAG
      PACKAGE_VERSION
      POSSIBLE_ROLES
      SILENT_ERROR
      Task
      VALID_PARTS_PER_ROLE
      VALID_PART_FIELDS
      VALID_PREVIOUS_CONTENT_ROLES
      badFinishReasons
      logger
      name
      responseLineRE
      version

      Merge (d5d4fa2): 25 dependencies

      AIErrorCode
      AI_TYPE
      Availability
      BackendType
      DEFAULT_API_VERSION
      DEFAULT_DOMAIN
      DEFAULT_FETCH_TIMEOUT_MS
      DEFAULT_LOCATION
      FinishReason
      HarmSeverity
      InferenceMode
      LANGUAGE_TAG
      PACKAGE_VERSION
      POSSIBLE_ROLES
      SILENT_ERROR
      Task
      VALID_PARTS_PER_ROLE
      VALID_PART_FIELDS
      VALID_PREVIOUS_CONTENT_ROLES
      badFinishReasons
      errorsCausingFallback
      logger
      name
      responseLineRE
      version

      Diff: + errorsCausingFallback

    • FinishReason

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.79 kB         6.83 kB          +34 B (+0.5%)
      size-with-ext-deps  24.4 kB         24.4 kB          +34 B (+0.1%)
    • FunctionCallingMode

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.60 kB         6.63 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.2 kB          +34 B (+0.1%)
    • GenerativeModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                24.7 kB         25.3 kB          +596 B (+2.4%)
      size-with-ext-deps  42.5 kB         43.1 kB          +611 B (+1.4%)

      Dependency

      Type  Base (a4848b4)  Merge (d5d4fa2)  Diff

      functions

      Base (a4848b4): 38 dependencies

      addHelpers
      aggregateResponses
      assignRoleToPartsAndValidateSendMessageRequest
      chromeAdapterFactory
      constructRequest
      countTokens
      countTokensOnCloud
      createEnhancedContentResponse
      decodeInstanceIdentifier
      factory
      formatBlockErrorMessage
      formatGenerateContentInput
      formatNewContent
      formatSystemInstruction
      generateContent
      generateContentOnCloud
      generateContentStream
      generateContentStreamOnCloud
      generateResponseSequence
      getClientHeaders
      getFunctionCalls
      getHeaders
      getInlineDataParts
      getResponsePromise
      getResponseStream
      getText
      hadBadFinishReason
      hasValidCandidates
      makeRequest
      mapCountTokensRequest
      mapGenerateContentCandidates
      mapGenerateContentRequest
      mapGenerateContentResponse
      mapPromptFeedback
      processGenerateContentResponse
      processStream
      registerAI
      validateChatHistory

      Merge (d5d4fa2): 39 dependencies

      addHelpers
      aggregateResponses
      assignRoleToPartsAndValidateSendMessageRequest
      callCloudOrDevice
      chromeAdapterFactory
      constructRequest
      countTokens
      countTokensOnCloud
      createEnhancedContentResponse
      decodeInstanceIdentifier
      factory
      formatBlockErrorMessage
      formatGenerateContentInput
      formatNewContent
      formatSystemInstruction
      generateContent
      generateContentOnCloud
      generateContentStream
      generateContentStreamOnCloud
      generateResponseSequence
      getClientHeaders
      getFunctionCalls
      getHeaders
      getInlineDataParts
      getResponsePromise
      getResponseStream
      getText
      hadBadFinishReason
      hasValidCandidates
      makeRequest
      mapCountTokensRequest
      mapGenerateContentCandidates
      mapGenerateContentRequest
      mapGenerateContentResponse
      mapPromptFeedback
      processGenerateContentResponse
      processStream
      registerAI
      validateChatHistory

      Diff: + callCloudOrDevice

      variables

      Base (a4848b4): 24 dependencies

      AIErrorCode
      AI_TYPE
      Availability
      BackendType
      DEFAULT_API_VERSION
      DEFAULT_DOMAIN
      DEFAULT_FETCH_TIMEOUT_MS
      DEFAULT_LOCATION
      FinishReason
      HarmSeverity
      InferenceMode
      LANGUAGE_TAG
      PACKAGE_VERSION
      POSSIBLE_ROLES
      SILENT_ERROR
      Task
      VALID_PARTS_PER_ROLE
      VALID_PART_FIELDS
      VALID_PREVIOUS_CONTENT_ROLES
      badFinishReasons
      logger
      name
      responseLineRE
      version

      Merge (d5d4fa2): 25 dependencies

      AIErrorCode
      AI_TYPE
      Availability
      BackendType
      DEFAULT_API_VERSION
      DEFAULT_DOMAIN
      DEFAULT_FETCH_TIMEOUT_MS
      DEFAULT_LOCATION
      FinishReason
      HarmSeverity
      InferenceMode
      LANGUAGE_TAG
      PACKAGE_VERSION
      POSSIBLE_ROLES
      SILENT_ERROR
      Task
      VALID_PARTS_PER_ROLE
      VALID_PART_FIELDS
      VALID_PREVIOUS_CONTENT_ROLES
      badFinishReasons
      errorsCausingFallback
      logger
      name
      responseLineRE
      version

      Diff: + errorsCausingFallback

    • GoogleAIBackend

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.55 kB         6.58 kB          +34 B (+0.5%)
      size-with-ext-deps  24.1 kB         24.2 kB          +34 B (+0.1%)
    • HarmBlockMethod

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.60 kB         6.64 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.2 kB          +34 B (+0.1%)
    • HarmBlockThreshold

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.72 kB         6.75 kB          +34 B (+0.5%)
      size-with-ext-deps  24.3 kB         24.3 kB          +34 B (+0.1%)
    • HarmCategory

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.79 kB         6.83 kB          +34 B (+0.5%)
      size-with-ext-deps  24.4 kB         24.4 kB          +34 B (+0.1%)
    • HarmProbability

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.62 kB         6.65 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.2 kB          +34 B (+0.1%)
    • HarmSeverity

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.78 kB         6.82 kB          +34 B (+0.5%)
      size-with-ext-deps  24.4 kB         24.4 kB          +34 B (+0.1%)
    • ImagenAspectRatio

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.66 kB         6.69 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.3 kB          +34 B (+0.1%)
    • ImagenImageFormat

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.82 kB         6.86 kB          +34 B (+0.5%)
      size-with-ext-deps  24.4 kB         24.4 kB          +34 B (+0.1%)
    • ImagenModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                13.2 kB         13.2 kB          +34 B (+0.3%)
      size-with-ext-deps  30.8 kB         30.9 kB          +34 B (+0.1%)
    • ImagenPersonFilterLevel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.64 kB         6.67 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.2 kB          +34 B (+0.1%)
    • ImagenSafetyFilterLevel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.71 kB         6.75 kB          +34 B (+0.5%)
      size-with-ext-deps  24.3 kB         24.3 kB          +34 B (+0.1%)
    • InferenceMode

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.55 kB         6.58 kB          +34 B (+0.5%)
      size-with-ext-deps  24.1 kB         24.1 kB          +34 B (+0.1%)
    • IntegerSchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.87 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • LiveGenerativeModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                12.6 kB         12.7 kB          +34 B (+0.3%)
      size-with-ext-deps  30.3 kB         30.3 kB          +34 B (+0.1%)
    • LiveResponseType

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.66 kB         6.69 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.3 kB          +34 B (+0.1%)
    • LiveSession

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                9.05 kB         9.08 kB          +34 B (+0.4%)
      size-with-ext-deps  26.6 kB         26.7 kB          +34 B (+0.1%)
    • Modality

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.67 kB         6.70 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.3 kB          +34 B (+0.1%)
    • NumberSchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.87 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • ObjectSchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.87 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • POSSIBLE_ROLES

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.59 kB         6.63 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.2 kB          +34 B (+0.1%)
    • ResponseModality

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.60 kB         6.63 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.2 kB          +34 B (+0.1%)
    • Schema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.86 kB         8.89 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • SchemaType

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.65 kB         6.69 kB          +34 B (+0.5%)
      size-with-ext-deps  24.2 kB         24.3 kB          +34 B (+0.1%)
    • StringSchema

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                8.87 kB         8.90 kB          +34 B (+0.4%)
      size-with-ext-deps  26.5 kB         26.5 kB          +34 B (+0.1%)
    • VertexAIBackend

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                6.55 kB         6.58 kB          +34 B (+0.5%)
      size-with-ext-deps  24.1 kB         24.2 kB          +34 B (+0.1%)
    • getAI

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                7.00 kB         7.04 kB          +34 B (+0.5%)
      size-with-ext-deps  31.7 kB         31.7 kB          +34 B (+0.1%)
    • getGenerativeModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                25.1 kB         25.7 kB          +596 B (+2.4%)
      size-with-ext-deps  42.8 kB         43.5 kB          +611 B (+1.4%)

      Dependency

      Type  Base (a4848b4)  Merge (d5d4fa2)  Diff

      functions

      Base (a4848b4): 39 dependencies

      addHelpers
      aggregateResponses
      assignRoleToPartsAndValidateSendMessageRequest
      chromeAdapterFactory
      constructRequest
      countTokens
      countTokensOnCloud
      createEnhancedContentResponse
      decodeInstanceIdentifier
      factory
      formatBlockErrorMessage
      formatGenerateContentInput
      formatNewContent
      formatSystemInstruction
      generateContent
      generateContentOnCloud
      generateContentStream
      generateContentStreamOnCloud
      generateResponseSequence
      getClientHeaders
      getFunctionCalls
      getGenerativeModel
      getHeaders
      getInlineDataParts
      getResponsePromise
      getResponseStream
      getText
      hadBadFinishReason
      hasValidCandidates
      makeRequest
      mapCountTokensRequest
      mapGenerateContentCandidates
      mapGenerateContentRequest
      mapGenerateContentResponse
      mapPromptFeedback
      processGenerateContentResponse
      processStream
      registerAI
      validateChatHistory

      Merge (d5d4fa2): 40 dependencies

      addHelpers
      aggregateResponses
      assignRoleToPartsAndValidateSendMessageRequest
      callCloudOrDevice
      chromeAdapterFactory
      constructRequest
      countTokens
      countTokensOnCloud
      createEnhancedContentResponse
      decodeInstanceIdentifier
      factory
      formatBlockErrorMessage
      formatGenerateContentInput
      formatNewContent
      formatSystemInstruction
      generateContent
      generateContentOnCloud
      generateContentStream
      generateContentStreamOnCloud
      generateResponseSequence
      getClientHeaders
      getFunctionCalls
      getGenerativeModel
      getHeaders
      getInlineDataParts
      getResponsePromise
      getResponseStream
      getText
      hadBadFinishReason
      hasValidCandidates
      makeRequest
      mapCountTokensRequest
      mapGenerateContentCandidates
      mapGenerateContentRequest
      mapGenerateContentResponse
      mapPromptFeedback
      processGenerateContentResponse
      processStream
      registerAI
      validateChatHistory

      Diff: + callCloudOrDevice

      variables

      Base (a4848b4): 25 dependencies

      AIErrorCode
      AI_TYPE
      Availability
      BackendType
      DEFAULT_API_VERSION
      DEFAULT_DOMAIN
      DEFAULT_FETCH_TIMEOUT_MS
      DEFAULT_HYBRID_IN_CLOUD_MODEL
      DEFAULT_LOCATION
      FinishReason
      HarmSeverity
      InferenceMode
      LANGUAGE_TAG
      PACKAGE_VERSION
      POSSIBLE_ROLES
      SILENT_ERROR
      Task
      VALID_PARTS_PER_ROLE
      VALID_PART_FIELDS
      VALID_PREVIOUS_CONTENT_ROLES
      badFinishReasons
      logger
      name
      responseLineRE
      version

      Merge (d5d4fa2): 26 dependencies

      AIErrorCode
      AI_TYPE
      Availability
      BackendType
      DEFAULT_API_VERSION
      DEFAULT_DOMAIN
      DEFAULT_FETCH_TIMEOUT_MS
      DEFAULT_HYBRID_IN_CLOUD_MODEL
      DEFAULT_LOCATION
      FinishReason
      HarmSeverity
      InferenceMode
      LANGUAGE_TAG
      PACKAGE_VERSION
      POSSIBLE_ROLES
      SILENT_ERROR
      Task
      VALID_PARTS_PER_ROLE
      VALID_PART_FIELDS
      VALID_PREVIOUS_CONTENT_ROLES
      badFinishReasons
      errorsCausingFallback
      logger
      name
      responseLineRE
      version

      Diff: + errorsCausingFallback

    • getImagenModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                13.3 kB         13.3 kB          +34 B (+0.3%)
      size-with-ext-deps  31.0 kB         31.0 kB          +34 B (+0.1%)
    • getLiveGenerativeModel

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                15.1 kB         15.1 kB          +34 B (+0.2%)
      size-with-ext-deps  32.8 kB         32.8 kB          +34 B (+0.1%)
    • startAudioConversation

      Size

      Type                Base (a4848b4)  Merge (d5d4fa2)  Diff
      size                12.6 kB         12.7 kB          +34 B (+0.3%)
      size-with-ext-deps  30.5 kB         30.5 kB          +34 B (+0.1%)

Test Logs

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/dGn62c6Iy7.html

This commit adds a new test suite to verify that the GenerativeModel's methods correctly dispatch requests to either the on-device or cloud backends based on the selected InferenceMode. It covers generateContent, generateContentStream, and countTokens.
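
As a rough picture of what such a dispatch suite checks, the four modes can be modeled as an ordered list of backends to try. This is only an illustrative sketch: the `attemptOrder` table, the `resolveBackend` function, and the backend labels are hypothetical, and the SDK's actual conditions (availability checks, which errors trigger fallback) are simplified here to a list of usable backends.

```typescript
// Hypothetical model of the four modes; not SDK code.
type Mode =
  | 'prefer_on_device'
  | 'only_on_device'
  | 'only_in_cloud'
  | 'prefer_in_cloud';
type Backend = 'on_device' | 'cloud';

// First entry is tried first; a second entry, if present, is the fallback.
const attemptOrder: Record<Mode, Backend[]> = {
  prefer_on_device: ['on_device', 'cloud'],
  only_on_device: ['on_device'],
  only_in_cloud: ['cloud'],
  prefer_in_cloud: ['cloud', 'on_device']
};

// Returns the backend a request would land on, given which backends are
// currently usable, or undefined if the mode allows none of them.
function resolveBackend(mode: Mode, usable: Backend[]): Backend | undefined {
  for (const backend of attemptOrder[mode]) {
    if (usable.indexOf(backend) !== -1) {
      return backend;
    }
  }
  return undefined;
}
```

A table like this makes it easy to loop one set of expectations over generateContent, generateContentStream, and countTokens instead of writing each case by hand.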
@hsubox76 hsubox76 marked this pull request as ready for review September 3, 2025 16:36
@hsubox76 hsubox76 requested review from a team as code owners September 3, 2025 16:36
@@ -352,7 +352,8 @@ export type ResponseModality =
 export const InferenceMode = {
   'PREFER_ON_DEVICE': 'prefer_on_device',
   'ONLY_ON_DEVICE': 'only_on_device',
-  'ONLY_IN_CLOUD': 'only_in_cloud'
+  'ONLY_IN_CLOUD': 'only_in_cloud',
+  'PREFER_IN_CLOUD': 'prefer_in_cloud'
 } as const;
Contributor

Do you think we should add reference docs explaining what 'prefer' means here?

Contributor Author

Probably. Will do that in the next push, after committing the suggestions.

Contributor Author

So it turns out the problem with turning enums into a combination of const variables and types is that any comment you put on the const will not make it into the documentation. Instead you have to put all the documentation on top of the string literal type.
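
The const-plus-type pattern in question looks roughly like the sketch below. The const mirrors the diff above (with `export` omitted so the snippet runs standalone); the type alias and the doc comment wording are assumptions for illustration, since the SDK's actual type declaration and documentation text are not shown in this thread.

```typescript
// Per the comment above, a doc comment placed here would not reach the
// generated reference docs.
const InferenceMode = {
  'PREFER_ON_DEVICE': 'prefer_on_device',
  'ONLY_ON_DEVICE': 'only_on_device',
  'ONLY_IN_CLOUD': 'only_in_cloud',
  'PREFER_IN_CLOUD': 'prefer_in_cloud'
} as const;

/**
 * Illustrative wording only, not the SDK's published docs.
 *
 * PREFER_IN_CLOUD: attempt the cloud backend first and fall back to the
 * on-device model, if available, when the cloud call fails with a
 * network-related error.
 */
type InferenceMode = (typeof InferenceMode)[keyof typeof InferenceMode];

// The same identifier now works as both a value and a string literal type.
const mode: InferenceMode = InferenceMode.PREFER_IN_CLOUD;
```

Because the documentation has to live on the type, each mode's meaning ends up described in one block comment rather than next to its own key.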
