Merged
Changes from 2 commits
6 changes: 6 additions & 0 deletions .changeset/dull-ligers-bow.md
@@ -0,0 +1,6 @@
---
'firebase': minor
'@firebase/ai': minor
---

Add `sendTextRealtime()`, `sendAudioRealtime()`, and `sendVideoRealtime()` to the `LiveSession` class, and deprecate `sendMediaChunks()` and `sendMediaStream()`.
5 changes: 5 additions & 0 deletions common/api-review/ai.api.md
@@ -994,9 +994,14 @@ export class LiveSession {
isClosed: boolean;
receive(): AsyncGenerator<LiveServerContent | LiveServerToolCall | LiveServerToolCallCancellation>;
send(request: string | Array<string | Part>, turnComplete?: boolean): Promise<void>;
sendAudioRealtime(blob: GenerativeContentBlob): Promise<void>;
sendFunctionResponses(functionResponses: FunctionResponse[]): Promise<void>;
// @deprecated (undocumented)
sendMediaChunks(mediaChunks: GenerativeContentBlob[]): Promise<void>;
// @deprecated (undocumented)
sendMediaStream(mediaChunkStream: ReadableStream<GenerativeContentBlob>): Promise<void>;
sendTextRealtime(text: string): Promise<void>;
sendVideoRealtime(blob: GenerativeContentBlob): Promise<void>;
}

// @public
134 changes: 130 additions & 4 deletions docs-devsite/ai.livesession.md
@@ -39,9 +39,12 @@ export declare class LiveSession
| [close()](./ai.livesession.md#livesessionclose) | | <b><i>(Public Preview)</i></b> Closes this session. All methods on this session will throw an error once this resolves. |
| [receive()](./ai.livesession.md#livesessionreceive) | | <b><i>(Public Preview)</i></b> Yields messages received from the server. This can only be used by one consumer at a time. |
| [send(request, turnComplete)](./ai.livesession.md#livesessionsend) | | <b><i>(Public Preview)</i></b> Sends content to the server. |
| [sendAudioRealtime(blob)](./ai.livesession.md#livesessionsendaudiorealtime) | | <b><i>(Public Preview)</i></b> Sends audio data to the server in realtime. |
| [sendFunctionResponses(functionResponses)](./ai.livesession.md#livesessionsendfunctionresponses) | | <b><i>(Public Preview)</i></b> Sends function responses to the server. |
| [sendMediaChunks(mediaChunks)](./ai.livesession.md#livesessionsendmediachunks) | | <b><i>(Public Preview)</i></b> Sends realtime input to the server. |
| [sendMediaStream(mediaChunkStream)](./ai.livesession.md#livesessionsendmediastream) | | <b><i>(Public Preview)</i></b> Sends a stream of [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface)<!-- -->. |
| [sendMediaChunks(mediaChunks)](./ai.livesession.md#livesessionsendmediachunks) | | <b><i>(Public Preview)</i></b> |
| [sendMediaStream(mediaChunkStream)](./ai.livesession.md#livesessionsendmediastream) | | <b><i>(Public Preview)</i></b> |
| [sendTextRealtime(text)](./ai.livesession.md#livesessionsendtextrealtime) | | <b><i>(Public Preview)</i></b> Sends text to the server in realtime. |
| [sendVideoRealtime(blob)](./ai.livesession.md#livesessionsendvideorealtime) | | <b><i>(Public Preview)</i></b> Sends video data to the server in realtime. |

## LiveSession.inConversation

@@ -135,6 +138,45 @@ Promise&lt;void&gt;

If this session has been closed.

## LiveSession.sendAudioRealtime()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends audio data to the server in realtime.

The server requires the audio data to be base64-encoded, 16-bit, little-endian PCM sampled at 16 kHz.

<b>Signature:</b>

```typescript
sendAudioRealtime(blob: GenerativeContentBlob): Promise<void>;
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| blob | [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface) | The base64-encoded PCM data to send to the server in realtime. |

<b>Returns:</b>

Promise&lt;void&gt;

#### Exceptions

If this session has been closed.

### Example


```javascript
// const pcmData = ... base64-encoded 16-bit PCM at 16kHz little-endian.
const blob = { mimeType: "audio/pcm", data: pcmData };
liveSession.sendAudioRealtime(blob);

```
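
The encoding requirement above can be met from raw floating-point samples roughly as follows. This is a minimal sketch, not part of the SDK: `PcmBlob`, `toBase64`, `floatTo16BitPcm`, and `toPcmBlob` are hypothetical names, and the sketch assumes the samples are already mono, sampled at 16 kHz, and in the range [-1, 1].

```typescript
// Blob shape as used in the example above (hypothetical local type).
interface PcmBlob {
  mimeType: string;
  data: string; // base64-encoded bytes
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal base64 encoder so the sketch has no runtime dependencies.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64[b0 >> 2];
    out += B64[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 0x3f] : "=";
  }
  return out;
}

// Converts float samples in [-1, 1] to 16-bit little-endian PCM bytes.
function floatTo16BitPcm(samples: Float32Array): Uint8Array {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  samples.forEach((sample, i) => {
    const clamped = Math.max(-1, Math.min(1, sample));
    const int16 = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, int16, true); // true = little-endian
  });
  return new Uint8Array(view.buffer);
}

// Builds a blob in the shape that the example above passes to the session.
function toPcmBlob(samples: Float32Array): PcmBlob {
  return { mimeType: "audio/pcm", data: toBase64(floatTo16BitPcm(samples)) };
}
```

The result of `toPcmBlob(samples)` has the same shape as the `blob` in the example above. In a browser, the built-in `btoa` could replace the hand-rolled encoder.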

## LiveSession.sendFunctionResponses()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
@@ -167,7 +209,12 @@ If this session has been closed.
> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends realtime input to the server.
> Warning: This API is now obsolete.
>
> Use `sendTextRealtime()`<!-- -->, `sendAudioRealtime()`<!-- -->, and `sendVideoRealtime()` instead.
>
> Sends realtime input to the server.
>

<b>Signature:</b>

@@ -194,7 +241,12 @@ If this session has been closed.
> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends a stream of [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface)<!-- -->.
> Warning: This API is now obsolete.
>
> Use `sendTextRealtime()`<!-- -->, `sendAudioRealtime()`<!-- -->, and `sendVideoRealtime()` instead.
>
> Sends a stream of [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface)<!-- -->.
>

<b>Signature:</b>

@@ -216,3 +268,77 @@ Promise&lt;void&gt;

If this session has been closed.
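
Callers of the deprecated `sendMediaChunks()` and `sendMediaStream()` might migrate by routing each blob to one of the replacement methods. The following is a sketch under stated assumptions rather than the SDK's own migration path: `MediaBlob`, `RealtimeSender`, and `sendChunksRealtime` are hypothetical names, and routing by `mimeType` prefix (`audio/` to audio, `image/` to video frames) is an assumption made for illustration.

```typescript
// Blob shape as used in the examples in this reference (hypothetical local type).
interface MediaBlob {
  mimeType: string;
  data: string;
}

// Hypothetical minimal view of a session: only the members used here.
interface RealtimeSender {
  sendAudioRealtime(blob: MediaBlob): Promise<void>;
  sendVideoRealtime(blob: MediaBlob): Promise<void>;
}

// Assumption: audio/* blobs are audio and image/* blobs are video frames.
async function sendChunksRealtime(
  session: RealtimeSender,
  chunks: MediaBlob[]
): Promise<void> {
  for (const chunk of chunks) {
    if (chunk.mimeType.startsWith("audio/")) {
      await session.sendAudioRealtime(chunk);
    } else if (chunk.mimeType.startsWith("image/")) {
      await session.sendVideoRealtime(chunk);
    } else {
      throw new Error(`Unsupported mimeType for realtime input: ${chunk.mimeType}`);
    }
  }
}
```

Sending the blobs one at a time keeps them in their original order.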

## LiveSession.sendTextRealtime()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends text to the server in realtime.

<b>Signature:</b>

```typescript
sendTextRealtime(text: string): Promise<void>;
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| text | string | The text data to send. |

<b>Returns:</b>

Promise&lt;void&gt;

#### Exceptions

If this session has been closed.

### Example


```javascript
liveSession.sendTextRealtime("Hello, how are you?");

```
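
Because the method returns a promise and throws once the session has been closed, callers may want to await it and guard against a closed session. A minimal sketch, assuming only the `isClosed` property and the `sendTextRealtime()` signature shown in this reference; `TextSender` and `trySendText` are hypothetical names:

```typescript
// Hypothetical minimal view of a session: only the members used here.
interface TextSender {
  isClosed: boolean;
  sendTextRealtime(text: string): Promise<void>;
}

// Returns true if the text was sent, false if the session was already
// closed or the send rejected.
async function trySendText(session: TextSender, text: string): Promise<boolean> {
  if (session.isClosed) {
    return false;
  }
  try {
    await session.sendTextRealtime(text);
    return true;
  } catch {
    // The session may have closed between the check and the send.
    return false;
  }
}
```

Returning a boolean instead of rethrowing is a design choice for this sketch. Code that needs the underlying error should let the rejection propagate instead.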

## LiveSession.sendVideoRealtime()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends video data to the server in realtime.

The server requires video to be sent as individual frames at 1 FPS. Setting `mimeType` to `image/jpeg`<!-- --> is recommended.

<b>Signature:</b>

```typescript
sendVideoRealtime(blob: GenerativeContentBlob): Promise<void>;
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| blob | [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface) | The base64-encoded video data to send to the server in realtime. |

<b>Returns:</b>

Promise&lt;void&gt;

#### Exceptions

If this session has been closed.

### Example


```javascript
// const videoFrame = ... JPEG data
const blob = { mimeType: "image/jpeg", data: videoFrame };
liveSession.sendVideoRealtime(blob);

```
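
When frames are captured faster than the 1 FPS rate described above, the caller needs some way to drop the extra frames before sending. A minimal sketch of one approach, with no SDK calls involved; `createFrameGate` is a hypothetical helper and the capture timestamps are made up for illustration:

```typescript
// Hypothetical helper: returns a predicate that admits at most one frame
// per `minIntervalMs` (1000 ms corresponds to 1 FPS).
function createFrameGate(minIntervalMs = 1000): (nowMs: number) => boolean {
  let lastSentMs = Number.NEGATIVE_INFINITY;
  return (nowMs: number): boolean => {
    if (nowMs - lastSentMs < minIntervalMs) {
      return false; // too soon since the last admitted frame
    }
    lastSentMs = nowMs;
    return true;
  };
}

// Usage sketch: timestamps (ms) of frames captured at roughly 4 FPS.
const shouldSend = createFrameGate();
const captureTimes = [0, 250, 500, 750, 1000, 1250, 2100];
const sentTimes = captureTimes.filter(t => shouldSend(t));
// sentTimes is [0, 1000, 2100]
```

Only frames for which the gate returns `true` would be encoded as JPEG and passed to `sendVideoRealtime()`.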

39 changes: 39 additions & 0 deletions packages/ai/integration/live.test.ts
@@ -154,6 +154,45 @@ describe('Live', function () {
});
});

describe('sendTextRealtime()', () => {
  it('should send a single text chunk and receive a response', async () => {
    const model = getLiveGenerativeModel(testConfig.ai, {
      model: testConfig.model,
      generationConfig: textLiveGenerationConfig
    });
    const session = await model.connect();
    const responsePromise = nextTurnText(session.receive());

    await session.sendTextRealtime('Are you an AI? Yes or No.');

    const responseText = await responsePromise;
    expect(responseText).to.include('Yes');

    await session.close();
  });
});

describe('sendAudioRealtime()', () => {
  it('should send a single audio chunk and receive a response', async () => {
    const model = getLiveGenerativeModel(testConfig.ai, {
      model: testConfig.model,
      generationConfig: textLiveGenerationConfig
    });
    const session = await model.connect();
    const responsePromise = nextTurnText(session.receive());

    await session.sendAudioRealtime({
      data: HELLO_AUDIO_PCM_BASE64, // "Hey, can you hear me?"
      mimeType: 'audio/pcm'
    });

    const responseText = await responsePromise;
    expect(responseText).to.include('Yes');

    await session.close();
  });
});

describe('sendMediaChunks()', () => {
it('should send a single audio chunk and receive a response', async () => {
const model = getLiveGenerativeModel(testConfig.ai, {
6 changes: 3 additions & 3 deletions packages/ai/src/methods/live-session-helpers.test.ts
@@ -65,7 +65,7 @@ class MockLiveSession {
isClosed = false;
inConversation = false;
send = sinon.stub();
sendMediaChunks = sinon.stub();
sendAudioRealtime = sinon.stub();
sendFunctionResponses = sinon.stub();
messageGenerator = new MockMessageGenerator();
receive = (): MockMessageGenerator => this.messageGenerator;
@@ -226,8 +226,8 @@ describe('Audio Conversation Helpers', () => {

await clock.tickAsync(1);

expect(liveSession.sendMediaChunks).to.have.been.calledOnce;
const [sentChunk] = liveSession.sendMediaChunks.getCall(0).args[0];
expect(liveSession.sendAudioRealtime).to.have.been.calledOnce;
const sentChunk = liveSession.sendAudioRealtime.getCall(0).args[0];
expect(sentChunk.mimeType).to.equal('audio/pcm');
expect(sentChunk.data).to.be.a('string');
await controller.stop();
2 changes: 1 addition & 1 deletion packages/ai/src/methods/live-session-helpers.ts
@@ -184,7 +184,7 @@ export class AudioConversationRunner {
mimeType: 'audio/pcm',
data: base64
};
void this.liveSession.sendMediaChunks([chunk]);
void this.liveSession.sendAudioRealtime(chunk);
};
}

36 changes: 36 additions & 0 deletions packages/ai/src/methods/live-session.test.ts
@@ -110,6 +110,42 @@ describe('LiveSession', () => {
});
});

describe('sendTextRealtime()', () => {
  it('should send a correctly formatted realtimeInput message', async () => {
    const text = 'foo';
    await session.sendTextRealtime(text);
    expect(mockHandler.send).to.have.been.calledOnce;
    const sentData = JSON.parse(mockHandler.send.getCall(0).args[0]);
    expect(sentData).to.deep.equal({
      realtimeInput: { text }
    });
  });
});

describe('sendAudioRealtime()', () => {
  it('should send a correctly formatted realtimeInput message', async () => {
    const blob = { data: 'abcdef', mimeType: 'audio/pcm' };
    await session.sendAudioRealtime(blob);
    expect(mockHandler.send).to.have.been.calledOnce;
    const sentData = JSON.parse(mockHandler.send.getCall(0).args[0]);
    expect(sentData).to.deep.equal({
      realtimeInput: { audio: blob }
    });
  });
});

describe('sendVideoRealtime()', () => {
  it('should send a correctly formatted realtimeInput message', async () => {
    const blob = { data: 'abcdef', mimeType: 'image/jpeg' };
    await session.sendVideoRealtime(blob);
    expect(mockHandler.send).to.have.been.calledOnce;
    const sentData = JSON.parse(mockHandler.send.getCall(0).args[0]);
    expect(sentData).to.deep.equal({
      realtimeInput: { video: blob }
    });
  });
});

describe('sendMediaChunks()', () => {
it('should send a correctly formatted realtimeInput message', async () => {
const chunks = [{ data: 'base64', mimeType: 'audio/webm' }];