6 changes: 6 additions & 0 deletions .changeset/dull-ligers-bow.md
@@ -0,0 +1,6 @@
---
'firebase': minor
'@firebase/ai': minor
---

Add `sendTextRealtime()`, `sendAudioRealtime()`, and `sendVideoRealtime()` to the `LiveSession` class, and deprecate `sendMediaChunks()` and `sendMediaStream()`.
5 changes: 5 additions & 0 deletions common/api-review/ai.api.md
@@ -994,9 +994,14 @@ export class LiveSession {
isClosed: boolean;
receive(): AsyncGenerator<LiveServerContent | LiveServerToolCall | LiveServerToolCallCancellation>;
send(request: string | Array<string | Part>, turnComplete?: boolean): Promise<void>;
sendAudioRealtime(blob: GenerativeContentBlob): Promise<void>;
sendFunctionResponses(functionResponses: FunctionResponse[]): Promise<void>;
// @deprecated (undocumented)
sendMediaChunks(mediaChunks: GenerativeContentBlob[]): Promise<void>;
// @deprecated (undocumented)
sendMediaStream(mediaChunkStream: ReadableStream<GenerativeContentBlob>): Promise<void>;
sendTextRealtime(text: string): Promise<void>;
sendVideoRealtime(blob: GenerativeContentBlob): Promise<void>;
}

// @public
134 changes: 130 additions & 4 deletions docs-devsite/ai.livesession.md
@@ -39,9 +39,12 @@ export declare class LiveSession
| [close()](./ai.livesession.md#livesessionclose) | | <b><i>(Public Preview)</i></b> Closes this session. All methods on this session will throw an error once this resolves. |
| [receive()](./ai.livesession.md#livesessionreceive) | | <b><i>(Public Preview)</i></b> Yields messages received from the server. This can only be used by one consumer at a time. |
| [send(request, turnComplete)](./ai.livesession.md#livesessionsend) | | <b><i>(Public Preview)</i></b> Sends content to the server. |
| [sendAudioRealtime(blob)](./ai.livesession.md#livesessionsendaudiorealtime) | | <b><i>(Public Preview)</i></b> Sends audio data to the server in realtime. |
| [sendFunctionResponses(functionResponses)](./ai.livesession.md#livesessionsendfunctionresponses) | | <b><i>(Public Preview)</i></b> Sends function responses to the server. |
| [sendMediaChunks(mediaChunks)](./ai.livesession.md#livesessionsendmediachunks) | | <b><i>(Public Preview)</i></b> Sends realtime input to the server. |
| [sendMediaStream(mediaChunkStream)](./ai.livesession.md#livesessionsendmediastream) | | <b><i>(Public Preview)</i></b> Sends a stream of [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface)<!-- -->. |
| [sendMediaChunks(mediaChunks)](./ai.livesession.md#livesessionsendmediachunks) | | <b><i>(Public Preview)</i></b> |
| [sendMediaStream(mediaChunkStream)](./ai.livesession.md#livesessionsendmediastream) | | <b><i>(Public Preview)</i></b> |
| [sendTextRealtime(text)](./ai.livesession.md#livesessionsendtextrealtime) | | <b><i>(Public Preview)</i></b> Sends text to the server in realtime. |
| [sendVideoRealtime(blob)](./ai.livesession.md#livesessionsendvideorealtime) | | <b><i>(Public Preview)</i></b> Sends video data to the server in realtime. |

## LiveSession.inConversation

@@ -135,6 +138,45 @@ Promise&lt;void&gt;

If this session has been closed.

## LiveSession.sendAudioRealtime()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends audio data to the server in realtime.

The server requires the audio data to be base64-encoded 16-bit little-endian PCM sampled at 16 kHz.

<b>Signature:</b>

```typescript
sendAudioRealtime(blob: GenerativeContentBlob): Promise<void>;
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| blob | [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface) | The base64-encoded PCM data to send to the server in realtime. |

<b>Returns:</b>

Promise&lt;void&gt;

#### Exceptions

If this session has been closed.

### Example


```javascript
// const pcmData = ... base64-encoded 16-bit PCM at 16kHz little-endian.
const blob = { mimeType: "audio/pcm", data: pcmData };
liveSession.sendAudioRealtime(blob);

```
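
The example above assumes `pcmData` is already encoded. A minimal sketch of one way to produce it from floating-point samples follows. `floatTo16BitPcmBase64` is a hypothetical helper, not part of the SDK, and it assumes the samples are already at 16 kHz and that `btoa` is available (browsers and recent Node.js versions).

```typescript
// Hypothetical helper (not part of the SDK): converts Float32 samples in
// [-1, 1] into base64-encoded 16-bit little-endian PCM.
// Assumption: the samples were captured or resampled at 16 kHz beforehand.
function floatTo16BitPcmBase64(samples: Float32Array): string {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    const int16 = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
    view.setInt16(i * 2, int16, true); // true = little-endian
  }
  // Build a binary string one byte at a time, then base64-encode it.
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Usage (assumes `liveSession` is a connected LiveSession):
// const pcmData = floatTo16BitPcmBase64(samples);
// await liveSession.sendAudioRealtime({ mimeType: "audio/pcm", data: pcmData });
```

Building the string byte by byte keeps the sketch simple; for large buffers, chunked encoding may be preferable.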

## LiveSession.sendFunctionResponses()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
@@ -167,7 +209,12 @@ If this session has been closed.
> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends realtime input to the server.
> Warning: This API is now obsolete.
>
> Use `sendTextRealtime()`<!-- -->, `sendAudioRealtime()`<!-- -->, and `sendVideoRealtime()` instead.
>
> Sends realtime input to the server.
>

<b>Signature:</b>

@@ -194,7 +241,12 @@ If this session has been closed.
> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends a stream of [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface)<!-- -->.
> Warning: This API is now obsolete.
>
> Use `sendTextRealtime()`<!-- -->, `sendAudioRealtime()`<!-- -->, and `sendVideoRealtime()` instead.
>
> Sends a stream of [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface)<!-- -->.
>

<b>Signature:</b>

@@ -216,3 +268,77 @@ Promise&lt;void&gt;

If this session has been closed.
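
For callers migrating off the deprecated methods, each chunk has to be routed to one of the replacement methods. The sketch below shows one possible routing by MIME type. `pickRealtimeMethod` is a hypothetical helper, and the mapping (`audio/*` to `sendAudioRealtime()`, `image/*` to `sendVideoRealtime()`) is an assumption based on the MIME types used in the examples in this document, not a documented rule.

```typescript
// Names of the replacement methods on LiveSession.
type RealtimeMethod = "sendAudioRealtime" | "sendVideoRealtime";

// Hypothetical helper (not part of the SDK): picks a replacement method
// for a chunk based on its MIME type. The mapping is an assumption.
function pickRealtimeMethod(mimeType: string): RealtimeMethod {
  if (mimeType.startsWith("audio/")) {
    return "sendAudioRealtime";
  }
  if (mimeType.startsWith("image/")) {
    // Video is sent as individual frames, for example image/jpeg.
    return "sendVideoRealtime";
  }
  throw new Error(`No realtime method known for MIME type: ${mimeType}`);
}

// Usage (assumes `liveSession` is a connected LiveSession and
// `mediaChunks` is the array previously passed to sendMediaChunks()):
// for (const chunk of mediaChunks) {
//   await liveSession[pickRealtimeMethod(chunk.mimeType)](chunk);
// }
```

Throwing on unknown types makes unsupported input visible at the call site instead of sending it to the wrong channel.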

## LiveSession.sendTextRealtime()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends text to the server in realtime.

<b>Signature:</b>

```typescript
sendTextRealtime(text: string): Promise<void>;
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| text | string | The text data to send. |

<b>Returns:</b>

Promise&lt;void&gt;

#### Exceptions

If this session has been closed.

### Example


```javascript
liveSession.sendTextRealtime("Hello, how are you?");

```
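
Because the method returns a promise and throws once the session is closed, callers may want to await it and handle failure. A minimal sketch, assuming only the `isClosed` property and the `sendTextRealtime()` signature shown in this change; `TextSession` and `trySendText` are hypothetical names introduced for illustration.

```typescript
// Structural subset of LiveSession used by this sketch.
// Assumption: only `isClosed` and `sendTextRealtime()` are needed.
interface TextSession {
  isClosed: boolean;
  sendTextRealtime(text: string): Promise<void>;
}

// Hypothetical helper (not part of the SDK): resolves to true if the text
// was sent, and false if the session was closed or the send rejected.
async function trySendText(
  session: TextSession,
  text: string
): Promise<boolean> {
  if (session.isClosed) {
    return false;
  }
  try {
    await session.sendTextRealtime(text);
    return true;
  } catch (e) {
    // The session may have closed between the check and the send.
    console.warn("sendTextRealtime failed:", e);
    return false;
  }
}

// Usage (assumes `liveSession` is a connected LiveSession):
// const ok = await trySendText(liveSession, "Hello, how are you?");
```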

## LiveSession.sendVideoRealtime()

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Sends video data to the server in realtime.

The server requires that the video is sent as individual video frames at 1 FPS. It is recommended to set `mimeType` to `image/jpeg`<!-- -->.

<b>Signature:</b>

```typescript
sendVideoRealtime(blob: GenerativeContentBlob): Promise<void>;
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| blob | [GenerativeContentBlob](./ai.generativecontentblob.md#generativecontentblob_interface) | The base64-encoded video data to send to the server in realtime. |

<b>Returns:</b>

Promise&lt;void&gt;

#### Exceptions

If this session has been closed.

### Example


```javascript
// const videoFrame = ... JPEG data
const blob = { mimeType: "image/jpeg", data: videoFrame };
liveSession.sendVideoRealtime(blob);

```
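
Since frames are expected at 1 FPS, a capture loop that runs faster needs to drop frames before sending. The sketch below shows one way to gate sends by elapsed time. `FrameThrottle` is a hypothetical helper, not part of the SDK, and the 1000 ms default is derived from the 1 FPS requirement stated above.

```typescript
// Hypothetical helper (not part of the SDK): allows at most one frame
// through per `minIntervalMs` window.
class FrameThrottle {
  private lastSentAt: number = Number.NEGATIVE_INFINITY;
  private minIntervalMs: number;

  constructor(minIntervalMs: number = 1000) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true if a frame captured at `nowMs` should be sent,
  // and records that time as the last send.
  shouldSend(nowMs: number): boolean {
    if (nowMs - this.lastSentAt < this.minIntervalMs) {
      return false;
    }
    this.lastSentAt = nowMs;
    return true;
  }
}

// Usage (assumes `liveSession` is a connected LiveSession and
// `videoFrame` is base64-encoded JPEG data):
// const throttle = new FrameThrottle(1000);
// if (throttle.shouldSend(Date.now())) {
//   await liveSession.sendVideoRealtime({ mimeType: "image/jpeg", data: videoFrame });
// }
```

Passing the timestamp in, rather than reading the clock inside the class, keeps the gate deterministic and easy to test.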

39 changes: 39 additions & 0 deletions packages/ai/integration/live.test.ts
@@ -154,6 +154,45 @@ describe('Live', function () {
});
});

describe('sendTextRealtime()', () => {
it('should send a single text chunk and receive a response', async () => {
const model = getLiveGenerativeModel(testConfig.ai, {
model: testConfig.model,
generationConfig: textLiveGenerationConfig
});
const session = await model.connect();
const responsePromise = nextTurnText(session.receive());

await session.sendTextRealtime('Are you an AI? Yes or No.');

const responseText = await responsePromise;
expect(responseText).to.include('Yes');

await session.close();
});
});

describe('sendAudioRealtime()', () => {
it('should send a single audio chunk and receive a response', async () => {
const model = getLiveGenerativeModel(testConfig.ai, {
model: testConfig.model,
generationConfig: textLiveGenerationConfig
});
const session = await model.connect();
const responsePromise = nextTurnText(session.receive());

await session.sendAudioRealtime({
data: HELLO_AUDIO_PCM_BASE64, // "Hey, can you hear me?"
mimeType: 'audio/pcm'
});

const responseText = await responsePromise;
expect(responseText).to.include('Yes');

await session.close();
});
});

describe('sendMediaChunks()', () => {
it('should send a single audio chunk and receive a response', async () => {
const model = getLiveGenerativeModel(testConfig.ai, {
6 changes: 3 additions & 3 deletions packages/ai/src/methods/live-session-helpers.test.ts
@@ -65,7 +65,7 @@ class MockLiveSession {
isClosed = false;
inConversation = false;
send = sinon.stub();
sendMediaChunks = sinon.stub();
sendAudioRealtime = sinon.stub();
sendFunctionResponses = sinon.stub();
messageGenerator = new MockMessageGenerator();
receive = (): MockMessageGenerator => this.messageGenerator;
@@ -226,8 +226,8 @@ describe('Audio Conversation Helpers', () => {

await clock.tickAsync(1);

expect(liveSession.sendMediaChunks).to.have.been.calledOnce;
const [sentChunk] = liveSession.sendMediaChunks.getCall(0).args[0];
expect(liveSession.sendAudioRealtime).to.have.been.calledOnce;
const sentChunk = liveSession.sendAudioRealtime.getCall(0).args[0];
expect(sentChunk.mimeType).to.equal('audio/pcm');
expect(sentChunk.data).to.be.a('string');
await controller.stop();
2 changes: 1 addition & 1 deletion packages/ai/src/methods/live-session-helpers.ts
@@ -184,7 +184,7 @@ export class AudioConversationRunner {
mimeType: 'audio/pcm',
data: base64
};
void this.liveSession.sendMediaChunks([chunk]);
void this.liveSession.sendAudioRealtime(chunk);
};
}

36 changes: 36 additions & 0 deletions packages/ai/src/methods/live-session.test.ts
@@ -110,6 +110,42 @@ describe('LiveSession', () => {
});
});

describe('sendTextRealtime()', () => {
it('should send a correctly formatted realtimeInput message', async () => {
const text = 'foo';
await session.sendTextRealtime(text);
expect(mockHandler.send).to.have.been.calledOnce;
const sentData = JSON.parse(mockHandler.send.getCall(0).args[0]);
expect(sentData).to.deep.equal({
realtimeInput: { text }
});
});
});

describe('sendAudioRealtime()', () => {
it('should send a correctly formatted realtimeInput message', async () => {
const blob = { data: 'abcdef', mimeType: 'audio/pcm' };
await session.sendAudioRealtime(blob);
expect(mockHandler.send).to.have.been.calledOnce;
const sentData = JSON.parse(mockHandler.send.getCall(0).args[0]);
expect(sentData).to.deep.equal({
realtimeInput: { audio: blob }
});
});
});

describe('sendVideoRealtime()', () => {
it('should send a correctly formatted realtimeInput message', async () => {
const blob = { data: 'abcdef', mimeType: 'image/jpeg' };
await session.sendVideoRealtime(blob);
expect(mockHandler.send).to.have.been.calledOnce;
const sentData = JSON.parse(mockHandler.send.getCall(0).args[0]);
expect(sentData).to.deep.equal({
realtimeInput: { video: blob }
});
});
});

describe('sendMediaChunks()', () => {
it('should send a correctly formatted realtimeInput message', async () => {
const chunks = [{ data: 'base64', mimeType: 'audio/webm' }];