GSoC Interest Idea #11: Web Audio API! #19998
PranAD-dev
started this conversation in
Ideas
Replies: 1 comment
-
From my point of view, if the project goal is truly hands-free multimodal voice mode, native Live API-style streaming should be the center of the architecture, and Whisper-style transcription should stay in the fallback category rather than on the main path. The real product question is less about audio capture alone and more about interruption, latency, and how voice shares context with the existing agent loop. Your background sounds relevant, but a narrow proof of concept around push-to-talk and bidirectional session flow would probably make the strongest case.
-
Hey! I'm interested in contributing to the Hands-Free Multimodal Voice Mode idea for GSoC 2026.
I have past experience working with real-time audio and the Web Audio API. I built EchoEarth (it's on my GitHub; the sole reason I'm not linking it here is that it would seem like a shameless plug), a spatial-audio app where users have live voice conversations with AI-powered ecosystems. It uses the Web Audio API (HRTF PannerNode, ConvolverNode, dynamic crossfading) and Gemini for generation, and it won Best Use of ElevenLabs at SFHacks. I loved implementing the Web Audio HRTF work, and I have some ideas I would genuinely love to implement! (I just did the hackathon a week ago, and the overlap was surprising lol.)
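For the "dynamic crossfading" piece: the actual Web Audio graph (GainNodes, PannerNodes) needs a browser, but the curve that usually drives it is just math. A sketch of the standard equal-power crossfade, assuming two sources A and B blended by a parameter `t`:

```typescript
// Equal-power crossfade gains for blending source A into source B.
// t in [0, 1]: 0 = all A, 1 = all B. Because a^2 + b^2 = 1 at every
// point, perceived loudness stays roughly constant through the fade
// (a plain linear crossfade dips in the middle).
function equalPowerGains(t: number): { a: number; b: number } {
  const clamped = Math.min(1, Math.max(0, t));
  return {
    a: Math.cos((clamped * Math.PI) / 2),
    b: Math.sin((clamped * Math.PI) / 2),
  };
}
```

In a browser you would feed these values to two `GainNode.gain` params (e.g. via `setValueAtTime`) as `t` ramps; the function above is only the gain law.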
The overlap with this project is pretty direct. My past project addresses the same core problems: real-time audio streaming, voice activity detection, managing conversation state, keeping latency low enough for fluid interaction, and voice-to-text so Gemini can understand the user.
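On the voice-activity-detection point, the simplest baseline (an illustration of the idea, not necessarily what EchoEarth or the CLI would ship) is an RMS energy threshold per audio frame; real systems layer smoothing and hangover on top, or use a model like Silero VAD:

```typescript
// Naive energy-based VAD: mark a frame of PCM samples as speech when
// its RMS energy exceeds a threshold. The threshold here is arbitrary
// and would need calibration against real microphone noise floors.
function isSpeechFrame(samples: Float32Array, threshold = 0.02): boolean {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  return rms > threshold;
}
```

In a browser this would run over frames pulled from an `AudioWorklet` or `MediaStream`; the logic itself is environment-independent.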
I've started exploring the Gemini CLI codebase and I'm working on my first contribution. I also had a question about the voice mode: is the vision to use Gemini's native Live API for bidirectional audio streaming with Whisper alongside it, or is it just Whisper, as stated in the docs?
@bdmorgan