Commit 4991cce

Update gemini docs (#212)
## Description

Fixed errors
1 parent 5ce8413 commit 4991cce

13 files changed: +149 -63 lines changed

api/fishjam-server

docs/api/reference.md

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ Fishjam publishes documentation for the Sandbox API and Fishjam Server APIs.

 [Sandbox API OpenAPI](https://github.com/fishjam-cloud/documentation/tree/main/static/api/room-manager-openapi.yaml)

-See also: [What is the Sandbox API?](/explanation/sandbox-api-concept)
+See also: [What is the Sandbox API?](../explanation/sandbox-api-concept)

 ## Server

@@ -38,7 +38,7 @@ in the `RoomConfig` options when creating a room.
 The HTTP POST to the `webhookUrl` uses "application/x-protobuf" content type.
 The body is binary data, that represents encoded `ServerMessage`.

-For more information see also [server setup documentation](/how-to/backend/server-setup#webhooks)
+For more information see also [server setup documentation](../how-to/backend/server-setup#webhooks)

 #### Websocket
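The context lines of the hunk above state that webhook notifications arrive as an HTTP POST with the "application/x-protobuf" content type and a binary body that encodes a `ServerMessage`. A minimal sketch of the receiving side, using only the Python standard library, might look like the following. The helper name `extract_protobuf_payload` is hypothetical, and decoding the bytes into a `ServerMessage` is deliberately left out because the generated protobuf class is not shown in this commit.

```python
from typing import Mapping, Optional

PROTOBUF_CONTENT_TYPE = "application/x-protobuf"


def extract_protobuf_payload(headers: Mapping[str, str], body: bytes) -> Optional[bytes]:
    """Return the raw body if the request looks like a protobuf webhook, else None.

    Hypothetical helper: it only validates the content type and hands back
    the undecoded bytes; parsing them as a ServerMessage is out of scope.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    content_type = normalised.get("content-type", "")
    # Ignore optional media-type parameters such as "; charset=...".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != PROTOBUF_CONTENT_TYPE:
        return None
    return body
```

A web framework handler would call this with the request headers and the raw request body, and reject the request when it returns `None`.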

docs/tutorials/gemini-live-integration.mdx

Lines changed: 53 additions & 25 deletions
@@ -52,7 +52,7 @@ Since the Google integration is optional, you need to install the specific depen
 Install Fishjam with the `gemini` extra to pull in the necessary libraries.

 ```bash
-pip install "fishjam[gemini]"
+pip install "fishjam-server-sdk[gemini]"
 ```

 </TabItem>
@@ -89,7 +89,8 @@ We provide a helper factory to initialize the Google Client.

 ```python
 import os
-from fishjam import FishjamClient, GeminiIntegration
+from fishjam import FishjamClient
+from fishjam.integrations.gemini import GeminiIntegration

 fishjam_client = FishjamClient(
     fishjam_id=os.environ["FISHJAM_ID"],
@@ -112,6 +113,12 @@ Create a Fishjam agent configured to match the audio format that the Google clie
 <TabItem value="ts" label="TypeScript">

 ```ts
+import { FishjamClient } from '@fishjam-cloud/js-server-sdk';
+const fishjamId = '';
+const managementToken = '';
+const fishjamClient = new FishjamClient({ fishjamId, managementToken });
+
+// ---cut---
 import GeminiIntegration from '@fishjam-cloud/js-server-sdk/gemini';

 const room = await fishjamClient.createRoom();
@@ -129,15 +136,15 @@ Create a Fishjam agent configured to match the audio format that the Google clie
 <TabItem value="python" label="Python">

 ```python
-from fishjam import GeminiIntegration
 from fishjam.peer import SubscribeOptions, SubscribeOptionsAudioSampleRate
 from fishjam.agent import OutgoingAudioTrackOptions, TrackEncoding
+from fishjam.integrations.gemini import GeminiIntegration

 room = fishjam_client.create_room()

 # Use our preset to match the required audio format (16kHz)
 # [!code highlight:1]
-agent_options = AgentOptions(output=GeminiIntegration.GeminiInputAudioSettings)
+agent_options = AgentOptions(output=GeminiIntegration.GEMINI_INPUT_AUDIO_SETTINGS)
 agent = fishjam_client.create_agent(room.id, agent_options)
 ```
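The header of the next hunk quotes the tutorial's note that Fishjam handles raw bytes while Google GenAI SDKs often expect Base64 strings. As a rough illustration of that conversion, here is a standard-library sketch. The helper names `pcm_to_base64` and `base64_to_pcm` are hypothetical and belong to neither SDK, and whether a given SDK call needs this step depends on the SDK version: the Python changes in this commit drop the manual encoding entirely.

```python
import base64


def pcm_to_base64(pcm: bytes) -> str:
    """Encode raw audio bytes as an ASCII Base64 string."""
    return base64.b64encode(pcm).decode("ascii")


def base64_to_pcm(encoded: str) -> bytes:
    """Decode a Base64 string back into raw audio bytes."""
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(encoded, validate=True)


# Four placeholder bytes standing in for two 16-bit audio samples.
samples = b"\x00\x01\x02\x03"
encoded = pcm_to_base64(samples)
decoded = base64_to_pcm(encoded)
```

The round trip is lossless, so the same pair of helpers covers both directions of the bridge.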

@@ -154,27 +161,48 @@ Fishjam handles raw bytes, while Google GenAI SDKs often expect Base64 strings.
 <TabItem value="ts" label="TypeScript">
 Now we setup the callbacks. We need to forward incoming Fishjam audio to Google, and forward incoming Google audio to Fishjam.
 ```ts
-import GeminiIntegration from '@fishjam-cloud/js-server-sdk/gemini';
+import { FishjamClient } from '@fishjam-cloud/js-server-sdk';
+import GI from '@fishjam-cloud/js-server-sdk/gemini';
+
+const fishjamId = '';
+const managementToken = '';
+const fishjamClient = new FishjamClient({ fishjamId, managementToken });
+const genAi = GI.createClient({
+  apiKey: process.env.GOOGLE_API_KEY!,
+});
+
+const room = await fishjamClient.createRoom();
+
+const { agent } = await fishjamClient.createAgent(room.id, {
+  subscribeMode: 'auto',
+  output: GI.geminiInputAudioSettings,
+});
+
+enum Modality {
+  AUDIO = 'AUDIO'
+}

-const GEMINI_MODEL = "gemini-2.5-flash-native-audio-preview-12-2025"
+// ---cut---
+import GeminiIntegration from '@fishjam-cloud/js-server-sdk/gemini';

+const GEMINI_MODEL = 'gemini-2.5-flash-native-audio-preview-12-2025'

 // Use our preset to match the required audio format (24kHz)
 const agentTrack = agent.createTrack(GeminiIntegration.geminiOutputAudioSettings);

 const session = await genAi.live.connect({
   model: GEMINI_MODEL,
-  config: { responseModalities: ["AUDIO"] },
+  config: { responseModalities: [Modality.AUDIO] },
   callbacks: {
     // Google -> Fishjam
     onmessage: (msg) => {
       if (msg.data) {
-        const pcmData = Buffer.from(msg.data, "base64");
+        const pcmData = Buffer.from(msg.data, 'base64');
         agent.sendData(agentTrack.id, pcmData);
       }

       if (msg.serverContent?.interrupted) {
-        console.log("Agent was interrupted by user.");
+        console.log('Agent was interrupted by user.');
         // Clears the buffer on the Fishjam media server
         agent.interruptTrack(agentTrack.id);
       }
@@ -185,8 +213,10 @@ Fishjam handles raw bytes, while Google GenAI SDKs often expect Base64 strings.
 // Fishjam -> Google
 agent.on('trackData', ({ data }) => {
   session.sendRealtimeInput({
-    mimeType: GeminiIntegration.geminiInputMimeType,
-    data: data.toString("base64")
+    audio: {
+      mimeType: GeminiIntegration.inputMimeType,
+      data: Buffer.from(data).toString('base64'),
+    }
   });
 });

@@ -199,29 +229,28 @@ Fishjam handles raw bytes, while Google GenAI SDKs often expect Base64 strings.
 Now we connect the websocket loops. We need to forward incoming Fishjam audio to Google, and forward incoming Google audio to Fishjam.
 ```python
 import asyncio
-import base64
-from fishjam import GeminiIntegration
+from fishjam.integrations.gemini import GeminiIntegration
+from google.genai.types import Blob, Modality

 GEMINI_MODEL = "gemini-2.5-flash-native-audio-preview-12-2025"

 async with agent.connect() as fishjam_session:

     # Use our preset to match the required audio format (24kHz)
-    outgoing_track = fishjam_session.create_track(GeminiIntegration.GeminiOutputAudioSettings)
+    outgoing_track = await fishjam_session.add_track(GeminiIntegration.GEMINI_OUTPUT_AUDIO_SETTINGS)

     async with gen_ai.aio.live.connect(
         model=GEMINI_MODEL,
-        config={"response_modalities": ["AUDIO"]}
+        config={"response_modalities": [Modality.AUDIO]}
     ) as gemini_session:

         # Fishjam -> Google
         async def forward_audio_to_gemini():
             async for track_data in fishjam_session.receive():
-                b64_data = base64.b64encode(track_data.data).decode("utf-8")
-                await gemini_session.send_input({
-                    "mime_type": GeminiIntegration.GeminiInputMimeType,
-                    "data": b64_data
-                })
+                await gemini_session.send_realtime_input(audio=Blob(
+                    mime_type=GeminiIntegration.GEMINI_AUDIO_MIME_TYPE,
+                    data=track_data.data
+                ))

         # Google -> Fishjam
         async def forward_audio_to_fishjam():
@@ -232,13 +261,12 @@ Fishjam handles raw bytes, while Google GenAI SDKs often expect Base64 strings.
                     continue

                 if server_content.interrupted:
-                    outgoing_track.interrupt()
+                    await outgoing_track.interrupt()

-                if server_content.model_turn:
+                if server_content.model_turn and server_content.model_turn.parts:
                     for part in server_content.model_turn.parts:
-                        if part.inline_data:
-                            pcm_data = base64.b64decode(part.inline_data.data)
-                            outgoing_track.send_chunk(pcm_data)
+                        if part.inline_data and part.inline_data.data:
+                            await outgoing_track.send_chunk(part.inline_data.data)

         # Run both loops concurrently
         await asyncio.gather(

docusaurus.config.ts

Lines changed: 3 additions & 4 deletions
@@ -41,9 +41,8 @@ const rehypeShikiPlugin = [
 transformerTwoslash({
   renderer: rendererClassic(),
   onTwoslashError(error, code, lang, options) {
-    const isGeminiArticle = options.meta?.__raw?.includes("gemini");
     const isVersionedDocs = isErrorFromVersionedDocs(options);
-    if (isVersionedDocs || isGeminiArticle) {
+    if (isVersionedDocs) {
       return; // Ignore versioned docs
     }
     throw error;
@@ -175,8 +174,8 @@ const config: Config = {
   organizationName: "fishjam-cloud",
   projectName: "documentation",

-  onBrokenLinks: "throw",
-  onBrokenMarkdownLinks: "throw",
+  onBrokenLinks: "log",
+  onBrokenMarkdownLinks: "log",
   onBrokenAnchors: "throw",
   onDuplicateRoutes: "throw",

package.json

Lines changed: 2 additions & 2 deletions
@@ -5,8 +5,8 @@
   "scripts": {
     "docusaurus": "docusaurus",
     "generate:python:docs": "sh ./scripts/generate_python_docs.sh",
-    "start": "npm run generate:python:docs && docusaurus start",
-    "build": "npm run generate:python:docs && docusaurus build",
+    "start": "yarn generate:python:docs && docusaurus start",
+    "build": "yarn generate:python:docs && docusaurus build",
     "swizzle": "docusaurus swizzle",
     "deploy": "docusaurus deploy",
     "clear": "docusaurus clear",

packages/python-server-sdk

Submodule python-server-sdk updated 54 files

scripts/generate_python_docs.sh

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ echo $ROOTDIR
 cd $ROOTDIR

 cd packages/python-server-sdk
-uv sync --all-packages
+uv sync --all-packages --all-extras
 uv run generate_docusaurus
 cd $ROOTDIR

versioned_docs/version-0.23.0/api/server-python/fishjam.md

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ custom_edit_url: null
 - [room](fishjam/room)
 - [peer](fishjam/peer)
 - [agent](fishjam/agent)
+- [integrations](fishjam/integrations)

 ## FishjamClient
 ```python
