What is Gemini 3.5 Live Translate?

Written by Riellvriany Indriawan

Reviewed by Katelin Teen

Last edited June 16, 2026

Two people speaking different languages with a live sound wave bridging them, illustrating Gemini 3.5 Live Translate

What is Gemini 3.5 Live Translate?

Gemini 3.5 Live Translate is a speech-to-speech translation model from Google. You speak in one language, and it speaks back in another, in near real time, without you tapping a button between turns. Google describes it as "our latest audio model, delivering near real-time speech-to-speech translation in over 70 languages".

The part that makes people sit up is how natural it sounds. The model "generates smooth, natural-sounding translated speech that preserves the speakers' intonation, pacing and pitch", so the translated voice still rises and falls like the original speaker instead of flattening into a robot read-out. It also detects the language on its own, so you don't have to tell it whether the person across the table is speaking Spanish or Tagalog.

One naming note worth getting straight, because it trips people up: the "Live translate" feature in the Google Translate app actually launched back in August 2025, with a headphone-based beta following in December 2025. What changed in June 2026 is the engine underneath: Google swapped in the new 3.5 Live Translate model. And despite the "3.5" badge, DeepMind's model card says it is a dedicated audio model based on Gemini 3 Pro, not the smaller Flash tier, with a 128K-token audio context window.

Google's official Gemini 3.5 Live Translate announcement page, as taken from the Keyword blog

How Gemini 3.5 Live Translate works

Most translation apps you've used run a relay race: they convert your speech to text, translate the text, then read the text back out in another voice. That works, but it's why older tools feel stop-start: you have to finish talking, then wait through three handoffs before anything comes out.

Gemini 3.5 Live Translate skips the relay. It uses native audio, meaning a single model takes the raw sound in and produces translated sound out. Because it never throws the audio away to convert it into text first, it can hold on to the acoustic detail (the tone, the pacing, the pitch) that a text pipeline would discard. Transcripts are an optional add-on, not the mechanism.

The second trick is that it translates continuously instead of turn by turn. Rather than waiting for a full sentence, it "generates speech continuously, balancing the trade-off between waiting for context to improve quality and translating immediately to stay in sync with the speaker". That's the difference between a conversation and a walkie-talkie.
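To make that trade-off concrete, here is a toy sketch of a "wait-k" policy, a textbook approach from simultaneous-translation research in which the translator stays k words behind the speaker. This is not Google's method and says nothing about how Gemini 3.5 Live Translate works internally; the word-for-word glossary and the function name `wait_k` are hypothetical, chosen only to show how a larger k buys more context at the cost of more lag.

```python
from typing import Callable, List, Tuple


def wait_k(source: List[str], k: int,
           translate: Callable[[str], str]) -> List[Tuple[int, str]]:
    """Toy wait-k schedule (illustrative, not Gemini's algorithm).

    Target word i is emitted once i + k source words have been heard,
    or once the speaker has finished, whichever comes first.
    Returns (source_words_heard, emitted_word) pairs.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    schedule = []
    for i, word in enumerate(source):
        heard = min(i + k, len(source))
        schedule.append((heard, translate(word)))
    return schedule


# Hypothetical glossary; real translation is not word-for-word.
GLOSSARY = {"where": "dónde", "is": "está", "the": "el", "hotel": "hotel"}
sentence = ["where", "is", "the", "hotel"]

# k=1: translate immediately, staying in sync but with minimal context.
eager = wait_k(sentence, 1, lambda w: GLOSSARY[w])
# k=3: wait for three words of context before saying anything.
patient = wait_k(sentence, 3, lambda w: GLOSSARY[w])

print(eager)    # [(1, 'dónde'), (2, 'está'), (3, 'el'), (4, 'hotel')]
print(patient)  # [(3, 'dónde'), (4, 'está'), (4, 'el'), (4, 'hotel')]
```

With k=1 the first translated word comes out after a single source word; with k=3 nothing is spoken until three words have been heard. A real system would presumably adapt that delay on the fly rather than fix it in advance.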

How Gemini 3.5 Live Translate replaces the old speech-to-text, translate, text-to-speech relay with one continuous native-audio model

Under the hood for developers, it runs over the Live API, a stateful WebSocket connection that streams audio both ways. You enable translation by sending a translationConfig with a target language code, then pipe in audio as 16 kHz mono PCM in 100 ms chunks. Audio-only sessions are capped at 15 minutes unless you extend them, and every clip of generated audio carries an imperceptible SynthID watermark so it can be identified as AI-made later. This is the same family of low-latency voice tech behind the broader Gemini assistant, just tuned purely for translation with no tools or chit-chat attached.
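As a rough illustration of the numbers above, the sketch below works out how big a 100 ms chunk of 16 kHz mono PCM is and slices a buffer accordingly. Several things here are assumptions rather than documented API: the 16-bit sample width (the article only says "PCM"), the shape of the setup message, and the `targetLanguageCode` field name inside `translationConfig`. Check the Live API reference before relying on any of them.

```python
import json
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # from the article: 16 kHz
CHANNELS = 1              # from the article: mono
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit samples; the article only says "PCM"
CHUNK_MS = 100            # from the article: 100 ms chunks

# 16,000 samples/s * 0.1 s * 2 bytes = 3,200 bytes per chunk
CHUNK_BYTES = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * CHUNK_MS // 1000


def build_setup_message(target_language: str) -> str:
    """Build a JSON setup message that enables translation.

    ASSUMPTION: the envelope and every field name other than
    translationConfig are hypothetical placeholders.
    """
    return json.dumps({
        "setup": {
            "model": "gemini-3.5-live-translate-preview",
            "translationConfig": {"targetLanguageCode": target_language},
        }
    })


def iter_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM; the last one may be shorter."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence stands in for microphone input.
one_second = bytes(SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE)
chunks = list(iter_chunks(one_second))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Under the same 16-bit assumption, a full 15-minute audio-only session works out to 9,000 chunks, or about 28.8 MB of raw audio sent upstream.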

Where you can actually use it

Google is shipping 3.5 Live Translate on three separate tracks, and which one matters to you depends entirely on whether you're a traveller, a team, or a builder.

The three ways to use Gemini 3.5 Live Translate: the Google Translate app for consumers, Google Meet for teams, and the Live API for developers

The scale signals behind these are real, too. Google says Grab is testing the model for driver-to-traveller communication across users making over 10 million voice calls a month, which tells you where this is headed: embedded inside other companies' apps, not just a standalone translator.

Gemini 3.5 Live Translate at a glance

| Dimension | Detail |
| --- | --- |
| Model | gemini-3.5-live-translate-preview, based on Gemini 3 Pro |
| What it does | Speech-to-speech, audio in / audio out |
| Languages | 70+ with auto-detection |
| Latency | A few seconds behind the speaker |
| Style | Preserves intonation, pacing, pitch |
| Where | Google Translate app, Google Meet, Live API |
| Availability | Consumer rollout; developer + Meet previews |
| Watermark | SynthID on all audio |

What it's actually like to use

This is where the marketing and the reality start to diverge, and it's worth being honest about both, because the gap is the whole story.

On the good side, when it works, it feels different from older translation tools. One enthusiast summed up the appeal after the launch:

Real time speech to speech translation. Over 70 languages. No waiting. No awkward pauses. No robotic stop and start conversations. Just speak naturally and hear the translation almost instantly.

But the same threads are full of people hitting walls. The most consistent complaint is turn-taking: because the model translates continuously, it sometimes doesn't know when you've stopped. A developer who builds real-time interpretation tooling put it bluntly:

first the understanding of what is spoken is not very good [...] Second it doesn't have and end sentence tag so you can talk and never hear the end because it doesn't know you finished speaking only after you start speaking again or finish the session. It could be a good AI but needs more work and refining from Google.

There's also a social-friction ceiling that's easy to overlook in a demo. A tech reviewer testing it in real conversations noted on LinkedIn that it works best when everyone in the room is using the same tool:

Live AI translation sounds perfect until you're actually in a conversation with other people [...] I think it's a bit hard to use in a social scenario unless all participants are using it [...] Multi-person conversations still feel like they're at the edge of progress.

How good is it, really?

Two things are true at once. Google's broader translation upgrades post state-of-the-art text quality on the WMT25 benchmark, and the natural-voice output is a clear step up. But live voice translation across the industry still makes mistakes that text translation wouldn't, and some of them are bad.

A telling example came from someone testing live voice translation in the same Google ecosystem (Google Meet), who A/B'd it against the plain Translate app on a simple travel sentence:

The voices sounded authentic but I was shocked at how inaccurate some of the translations were. Far worse than what even Google Translate is capable of. For example: English speaker: "Are you going to take care of the hotel reservations and flights?" Live translation: "Vas a cuidar de los pescadores y peleas?" ("Are you going to take care of the fishermen and fights?")

Google's own docs are refreshingly upfront about the rough edges, too. Voice replication "can be inconsistent", with voices shifting after long pauses or getting stuck during rapid multi-speaker exchanges, and language detection "struggles with heavy accents, similar languages (e.g., Spanish vs. Portuguese), or rapid language switches". So the honest read: brilliant for casual, forgiving conversations, risky for anything where a wrong word costs you. That distinction matters a lot once you start thinking about it for work.

Live voice translation vs multilingual customer support

Here's the reframe most coverage skips. Gemini 3.5 Live Translate is built for spoken, live conversations: two people talking, a meeting, a phone call. That's a real and useful problem to solve. But it's not the shape of most customer support.

Support is mostly written and asynchronous: tickets, emails, chat messages, help-center questions, often arriving overnight while your team sleeps. A live voice translator doesn't help with a German email sitting in your Zendesk queue, and you'd never want unsupervised, occasionally wrong voice output speaking on your brand's behalf to a paying customer. The skills barely overlap.

Live voice translation suits real-time spoken conversations, while multilingual support automation suits written tickets and chats across 80+ languages

If multilingual support is your actual goal, the better category is an AI agent for customer service that reads your help docs and past tickets, drafts replies, and resolves the easy stuff, in whatever language the customer wrote in. That's a conversational AI problem with a human in the loop, not a real-time audio one. It's also where the cost math tends to favour tier-1 deflection over hiring multilingual agents, and where an AI knowledge base chatbot earns its keep. If you're weighing the broader category, our guide to AI for customer service and the rundown of AI customer service software are good next stops.

Try eesel

Gemini 3.5 Live Translate is the right tool when the conversation is happening out loud, live, in the moment. When the conversation is your support inbox, eesel is built for that instead: an AI helpdesk agent that learns from your past tickets and help docs, drafts and resolves support across 80+ languages out of the box, and plugs straight into the helpdesk you already run.

The difference is oversight and scale on written work. One eesel customer, Smava, runs a fully automated agent handling over 100,000 German-language support tickets a month, the kind of always-on, multilingual volume a live voice translator was never meant to touch. You stay in control of what it can answer, and you can ramp autonomy up gradually.

eesel AI helpdesk dashboard overview, where an AI agent drafts and resolves support tickets across 80+ languages

If your "translation" problem is really a multilingual support problem, try eesel and see how much of your queue it can handle before a human ever steps in.

Frequently Asked Questions

What is Gemini 3.5 Live Translate?
Gemini 3.5 Live Translate is Google's audio model for near real-time, speech-to-speech translation across more than 70 languages. Announced on June 9, 2026, it listens to spoken audio and speaks back the translation continuously, while keeping the speaker's intonation and pace. It shows up in the Google Translate app, in Google Meet, and via the Gemini Live API. If your goal is written support rather than live speech, an AI agent for customer service is the closer fit.

Is Gemini 3.5 Live Translate free to use?
For consumers, the Live translate feature is rolling out inside the free Google Translate app on Android and iOS. For developers, it runs through the paid Gemini Live API, which is metered by token usage rather than a flat price. Teams comparing the running cost of voice features against text automation often start with our breakdown of AI customer support cost savings.

How many languages does Gemini 3.5 Live Translate support?
The model automatically detects and translates across 70+ languages. In Google Meet specifically, that's a jump from a previous limit of just five languages, unlocking over 2,000 language combinations in a single meeting. For written channels, tools like an AI knowledge base chatbot can answer in dozens of languages off your existing docs.

How accurate is Gemini 3.5 Live Translate?
It's strong on natural-sounding speech and conversational flow, but early testers report weaker handling of non-English source audio, shaky turn detection, and occasional mistranslations on simple sentences. For business-critical replies, many teams prefer a reviewable text workflow like an AI customer service chatbot over unsupervised live voice. See our take on conversational AI for where each fits.

Can I use Gemini 3.5 Live Translate for customer support?
It can help with live, spoken conversations such as phone calls or video meetings, but most support happens in written tickets and chats that need oversight and accuracy. For that, a dedicated AI for customer service that drafts and resolves tickets in 80+ languages, like eesel, is usually the better answer than live voice translation.
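
The "over 2,000 language combinations" figure quoted for Google Meet in the FAQ above can be sanity-checked with a few lines of arithmetic. This is a sketch under two assumptions: that exactly 70 languages are in play (the source says "70+"), and that a "combination" means a pair of distinct languages. How Google actually counts is not stated.

```python
from math import comb, perm


def language_pairs(n_languages: int) -> tuple[int, int]:
    """Return (unordered pairs, ordered source->target pairs)."""
    return comb(n_languages, 2), perm(n_languages, 2)


print(language_pairs(5))   # (10, 20)
print(language_pairs(70))  # (2415, 4830)
```

Either way of counting clears 2,000 once 70 languages are supported, compared with 10 or 20 pairs under the earlier five-language limit.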

Article by

Riellvriany Indriawan

Riell is a designer and writer at eesel AI with about two years of experience researching CX platforms, AI chatbots, and helpdesk software. She combines her design background with a sharp eye for how these tools actually look and feel in practice — making her comparisons unusually visual and user-focused.
