
Commit 405b180

updated to realtime api
1 parent cfaa1da commit 405b180

1 file changed: +20 -18 lines changed

examples/Context_summarization_with_realtime_api.ipynb

Lines changed: 20 additions & 18 deletions
@@ -30,7 +30,7 @@
     "\n",
     "\n",
     "*Notes:*\n",
-    "> 1. GPT-4o-Realtime supports a 128k token context window, though in certain use cases, you may notice performance degrade as you stuff more tokens into the context window.\n",
+    "> 1. gpt-realtime supports a 32k token context window, though in certain use cases, you may notice performance degrade as you stuff more tokens into the context window.\n",
     "> 2. Token window = all tokens (words and audio tokens) the model currently keeps in memory for the session.\n",
     "\n",
     "### One‑liner install (run in a fresh cell)"
@@ -48,7 +48,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 1,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -74,7 +74,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 2,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -96,7 +96,7 @@
     "In practice you’ll often see **≈ 10 ×** more tokens for the *same* sentence in audio versus text.\n",
     "\n",
     "\n",
-    "* GPT-4o realtime accepts up to **128k tokens** and as the token size increases, instruction adherence can drift.\n",
+    "* gpt-realtime accepts up to **32k tokens** and as the token size increases, instruction adherence can drift.\n",
     "* Every user/assistant turn consumes tokens → the window **only grows**.\n",
     "* **Strategy**: Summarise older turns into a single assistant message, keep the last few verbatim turns, and continue.\n",
     "\n",
@@ -128,7 +128,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 3,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -159,7 +159,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -196,7 +196,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -248,7 +248,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -297,11 +297,11 @@
    "metadata": {},
    "source": [
     "### 3.3 Detect When to Summarise\n",
-    "The Realtime model keeps a **large 128 k‑token window**, but quality can drift long before that limit as you stuff more context into the model.\n",
+    "The Realtime model keeps a **large 32 k‑token window**, but quality can drift long before that limit as you stuff more context into the model.\n",
     "\n",
     "Our goal: **auto‑summarise** once the running window nears a safe threshold (default **2 000 tokens** for the notebook), then prune the superseded turns both locally *and* server‑side.\n",
     "\n",
-    "We monitor latest_tokens returned in `response.done`. When it exceeds SUMMARY_TRIGGER and we have more than KEEP_LAST_TURNS, we spin up a background summarisation coroutine.\n",
+    "We monitor latest_tokens returned in `response.done`. When it exceeds SUMMARY_TRIGGER and we have more than KEEP_LAST_TURNS, we spin up a background summarization coroutine.\n",
     "\n",
     "We compress everything except the last 2 turns into a single French paragraph, then:\n",
     "\n",
@@ -314,7 +314,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -343,7 +343,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -401,7 +401,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -451,7 +451,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 10,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -466,14 +466,14 @@
466466
},
467467
{
468468
"cell_type": "code",
469-
"execution_count": 14,
469+
"execution_count": 11,
470470
"metadata": {},
471471
"outputs": [],
472472
"source": [
473473
"# --------------------------------------------------------------------------- #\n",
474-
"# 🎤 Realtime session #\n",
474+
"# Realtime session #\n",
475475
"# --------------------------------------------------------------------------- #\n",
476-
"async def realtime_session(model=\"gpt-4o-realtime-preview\", voice=\"shimmer\", enable_playback=True):\n",
476+
"async def realtime_session(model=\"gpt-realtime\", voice=\"shimmer\", enable_playback=True):\n",
477477
" \"\"\"\n",
478478
" Main coroutine: connects to the Realtime endpoint, spawns helper tasks,\n",
479479
" and processes incoming events in a big async‑for loop.\n",
@@ -487,7 +487,7 @@
     "    # Open the WebSocket connection to the Realtime API #\n",
     "    # ----------------------------------------------------------------------- #\n",
     "    url = f\"wss://api.openai.com/v1/realtime?model={model}\"\n",
-    "    headers = {\"Authorization\": f\"Bearer {openai.api_key}\", \"OpenAI-Beta\": \"realtime=v1\"}\n",
+    "    headers = {\"Authorization\": f\"Bearer {openai.api_key}\"}\n",
     "\n",
     "    async with websockets.connect(url, extra_headers=headers, max_size=1 << 24) as ws:\n",
     "        # ------------------------------------------------------------------- #\n",
@@ -503,6 +503,8 @@
     "        await ws.send(json.dumps({\n",
     "            \"type\": \"session.update\",\n",
     "            \"session\": {\n",
+    "                \"type\": \"realtime\",\n",
+    "                \"model\": \"gpt-realtime\",\n",
     "                \"voice\": voice,\n",
     "                \"modalities\": [\"audio\", \"text\"],\n",
     "                \"input_audio_format\": \"pcm16\",\n",
