The real leap is streaming, not just speech
Most live-translation tools work turn by turn: you speak, you stop, and the system catches up and replies. Gemini 3.5 Live Translate breaks that rhythm by generating translated speech continuously while you are still talking. It balances waiting for more context, which improves quality, against translating immediately, which cuts delay, and it stays just a few seconds behind throughout a session [1]. It is built on Gemini 3 Pro, with a 128K-token audio input window and a 64K-token output window, and crucially it preserves the speaker's intonation, pacing, and pitch rather than flattening everything into a robotic monotone [2]. That combination of continuous output and prosody preservation is what makes the result feel like simultaneous interpretation rather than a voice memo on delay.
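
To make the wait-versus-translate trade-off concrete, here is a toy sketch of a "wait-k" schedule, a simple policy from the simultaneous-translation research literature. It is an illustration only: nothing in the sources says Gemini uses this policy, and the function name `wait_k_translate`, the word-level `lexicon`, and the word-for-word lookup are all hypothetical stand-ins for a real speech model.

```python
from typing import Dict, Iterable, Iterator, Tuple


def wait_k_translate(
    source_words: Iterable[str], k: int, lexicon: Dict[str, str]
) -> Iterator[Tuple[int, str]]:
    """Toy wait-k schedule (hypothetical, not Gemini's actual method).

    Yields (source_words_heard, target_word) pairs so the lag is visible:
    the output stays k - 1 words behind the speaker until the speaker stops.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    heard = []    # source words received so far
    emitted = 0   # number of target words already produced

    for word in source_words:
        heard.append(word)
        # Emit one target word once k source words separate input from output.
        if len(heard) - emitted >= k:
            src = heard[emitted]
            yield len(heard), lexicon.get(src, src)  # unknown words pass through
            emitted += 1

    # The speaker has stopped: flush whatever is still buffered.
    while emitted < len(heard):
        src = heard[emitted]
        yield len(heard), lexicon.get(src, src)
        emitted += 1


if __name__ == "__main__":
    lexicon = {"hola": "hello", "mundo": "world"}
    for k in (1, 2):
        print(k, list(wait_k_translate(["hola", "mundo"], k, lexicon)))
```

A small `k` keeps the output close behind the speaker but gives the translator little context to work with. A larger `k` buys context at the cost of delay. A turn-by-turn tool behaves like a `k` as long as the whole utterance, since nothing is emitted until the speaker stops.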


