Grok Voice Agent Builder pricing: what it actually costs


Written by

Kurnia Kharisma Agung Samiadjie

Reviewed by

Katelin Teen

Last edited July 3, 2026


Illustration of a receipt and pricing dials for xAI's Grok Voice Agent Builder

TL;DR

Grok Voice Agent Builder's headline price is a flat $0.05 per minute of voice, with a provisioned phone number adding $0.01/min on top, so a plain phone call runs $0.06/min before anything else. That already beats ElevenLabs Agents, which works out to an effective $0.08/min across every tier, and it sidesteps OpenAI's gpt-realtime-2, which has no published per-minute rate at all, only per-token audio pricing that most teams have to estimate themselves.

The catch is that the $0.05 sticker never shows up alone on a real bill. Every web search, document lookup, or tool call bills separately on top of the voice minutes, and there's no free tier and no published volume discount. I ran the actual numbers below, first for a single call and then for 200, 1,000, and 5,000 calls a month, so you can see where the real cost lands rather than trusting the headline rate.

If what you're pricing out is phone automation, Grok has the lowest published flat per-minute rate of the three providers compared here. If your actual backlog is email and chat tickets, that's a different spend entirely, and it's the one I price out for a living at eesel.

How xAI actually charges you

Pricing pages for voice AI tend to bury the unit you're actually billed on. xAI doesn't: the Voice API section of its pricing page states the billable unit up front, and it's minutes of audio, not tokens, not characters, not "credits."

The xAI developer pricing page showing per-minute Voice API rates

Here's the full rate card, every line item xAI publishes, not just the headline number:

Line item	Price
Realtime voice agent	$0.05 / min ($3.00 / hr)
Phone number (telephony)	+$0.01 / min
Realtime text input	$0.004 / message
Text to speech	$15.00 / 1M characters
Speech to text	$0.10 / hr (REST), $0.20 / hr (streaming)
Web search	$5 / 1,000 calls
X search	$5 / 1,000 calls
Document / collections search (RAG)	$2.50 / 1,000 calls
File attachment search	$10 / 1,000 calls
File storage	$0.025 / GiB / day
Collection storage	$0.10 / GiB / day
Batch API	20-50% off standard token rates (text/language models only)
Priority processing	2x premium over standard rates

Two things jump out once you see the whole table. First, the voice rate and the telephony rate are the only two costs you can budget with total confidence; everything else scales with how the agent actually behaves on a call. Second, there's no line for a monthly platform fee or seat license anywhere on it. Grok's billing is pure consumption, which is unusual for a voice-agent builder and worth noting if you've priced out subscription-tier competitors.

What a real call actually costs

The rate card only tells you the price per unit. What you actually want to know is what a call costs, so I built it out line by line for a realistic 5-minute support call on a provisioned number, one that fires a couple of tool calls to look something up.

Itemized receipt-style breakdown of a five-minute Grok voice agent call, from voice minutes to tool calls to the total

Voice: 5 min × $0.05 = $0.25
Phone number: 5 min × $0.01 = $0.05
Two web searches: 2 × $0.005 = $0.01
One document/collections search: 1 × $0.0025 = $0.0025

That's $0.3125, or roughly $0.31, for a 5-minute call that actually looks something up, versus $0.30 in voice and telephony alone. In other words, the tool-call gotcha that xAI's fine print warns about is real, but on a single call it's a few cents, not a multiplier. The underlying Grok model also bills its own token usage for the reasoning behind those tool calls ($1.25-$2.50 per 1M tokens depending on the model), which is typically a rounding error next to the per-minute charges unless the agent is reasoning through something unusually long.
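The receipt above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not anything xAI ships: the `call_cost` function and its parameter names are my own, the rates are copied from the rate card, and model token usage is left out because it depends on the model and the conversation.

```python
from decimal import Decimal

# Rates copied from the rate card above (USD).
VOICE_PER_MIN = Decimal("0.05")
PHONE_PER_MIN = Decimal("0.01")
WEB_SEARCH_EACH = Decimal("5") / 1000      # $5 per 1,000 calls
DOC_SEARCH_EACH = Decimal("2.50") / 1000   # $2.50 per 1,000 calls


def call_cost(minutes, web_searches=0, doc_searches=0, provisioned_number=True):
    """Return (voice_and_telephony, tool_fees, total) in USD for one call.

    `minutes` and the search counts should be ints or Decimals.
    Model token usage is deliberately excluded; it bills separately.
    """
    per_min = VOICE_PER_MIN
    if provisioned_number:
        per_min += PHONE_PER_MIN
    base = per_min * minutes
    tools = web_searches * WEB_SEARCH_EACH + doc_searches * DOC_SEARCH_EACH
    return base, tools, base + tools


# The 5-minute support call from the worked example.
base, tools, total = call_cost(minutes=5, web_searches=2, doc_searches=1)
print(f"voice + telephony: ${base}")
print(f"tool fees:         ${tools}")
print(f"total:             ${total}")
```

For the worked example this gives $0.30 in voice and telephony, $0.0125 in tool fees, and $0.3125 in total. Using `Decimal` rather than floats keeps fractions of a cent exact, which matters once these per-call figures are multiplied by thousands of calls.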

Where this compounds is volume, not any single call.

What it costs once you're actually running it

A single call's math is reassuring. A month of calls is where the number starts to matter for a budget conversation.

Step chart showing the monthly Grok voice bill rising across 200, 1,000, and 5,000 calls a month

Assuming a 4-minute average call on a provisioned number ($0.06/min for voice plus telephony), before any tool-call spend:

Monthly call volume	Voice + telephony only
200 calls/month	~$48
1,000 calls/month	~$240
5,000 calls/month	~$1,200

Add a light layer of tool calls (the two web searches and one document lookup from the worked example above, about $0.0125 per call) and each figure moves up by roughly 5%, so the 5,000-call tier lands closer to $1,260/month than $1,200. There's no volume discount published anywhere on the pricing page at any of these tiers; the per-minute rate stays flat whether you're doing 200 calls or 50,000. That's a fair trade for predictability, but it does mean scale doesn't earn you a better rate the way some enterprise-tier competitors quietly offer.
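To rerun the monthly table with your own volumes, something like the following works. It's a sketch under the same assumptions as the table: a 4-minute average call on a provisioned number, plus an optional per-call tool fee taken from the worked example (two web searches and one document search). The `monthly_bill` name and its parameters are illustrative.

```python
from decimal import Decimal

PER_MIN = Decimal("0.06")               # voice $0.05 + telephony $0.01
AVG_MINUTES = 4                         # assumed average call length
TOOL_FEES_PER_CALL = Decimal("0.0125")  # 2 web searches + 1 document search


def monthly_bill(calls, with_tools=False):
    """Estimated monthly spend in USD, excluding model token usage."""
    per_call = PER_MIN * AVG_MINUTES
    if with_tools:
        per_call += TOOL_FEES_PER_CALL
    return per_call * calls


for calls in (200, 1000, 5000):
    base = monthly_bill(calls)
    loaded = monthly_bill(calls, with_tools=True)
    print(f"{calls:>5} calls: ${base:.2f} voice + telephony, ${loaded:.2f} with tools")
```

Under that tool mix, 5,000 calls come to $1,200.00 for voice and telephony and $1,262.50 with tool fees. Swap in your own average call length and lookup pattern before taking either number to a budget meeting.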

How the price compares to OpenAI and ElevenLabs

The reason xAI leads with "$0.05/min, industry-leading" is that most competing voice stacks don't publish a comparably clean number.

Bar chart comparing per-minute voice AI cost across Grok, ElevenLabs Agents, and OpenAI Realtime

Provider	Billing unit	Effective rate
Grok Voice Agent	Per minute (flat)	$0.05/min
ElevenLabs Agents	Per minute (subscription tiers)	~$0.08/min, consistent from the $6 Starter plan (75 min) up to the $990 Business plan (12,375 min)
ElevenLabs API (pay-per-use)	Per minute	$0.08/min speech engine rate
OpenAI gpt-realtime-2	Per token	$32/1M input, $64/1M output audio tokens, no blended per-minute rate published

ElevenLabs is the closer of the two comparisons because it also publishes a real per-minute number, and every one of its subscription tiers, from the $6 Starter plan's 75 minutes to the $990 Business plan's 12,375 minutes, works out to almost exactly $0.08 per minute. Grok undercuts that by 37.5% on the base voice rate alone.
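The effective-rate and undercut figures are simple divisions, and it's worth being able to redo them when either provider changes a plan. A quick sketch, using only the plan prices and minute allotments quoted above (the function names are mine):

```python
from decimal import Decimal

# (plan, monthly price in USD, included minutes), as quoted in the comparison.
ELEVENLABS_TIERS = [
    ("Starter", Decimal("6"), 75),
    ("Business", Decimal("990"), 12375),
]
GROK_PER_MIN = Decimal("0.05")


def effective_rate(price, minutes):
    """Subscription price divided by included minutes."""
    return price / minutes


def undercut_pct(cheaper, pricier):
    """How far `cheaper` sits below `pricier`, as a percentage of `pricier`."""
    return (pricier - cheaper) / pricier * 100


for plan, price, minutes in ELEVENLABS_TIERS:
    rate = effective_rate(price, minutes)
    gap = undercut_pct(GROK_PER_MIN, rate)
    print(f"{plan}: ${rate}/min, Grok is {gap}% lower")
```

Both tiers divide out to exactly $0.08 per minute, and $0.05 against $0.08 is the 37.5% gap cited above. This compares base voice rates only; Grok's telephony surcharge and tool fees are not part of the division.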

OpenAI is the harder one to pin down. Its current pricing page bills gpt-realtime-2 by audio tokens, not minutes, and doesn't publish a blended per-minute estimate for the model at all. Only two narrower models (gpt-realtime-translate at $0.034/min and gpt-realtime-whisper at $0.017/min) get a per-minute figure. That's exactly why xAI's own comparison leans on an estimate rather than a quote: xAI states that "$0.10/min is a highly conservative blended estimate. In production, pricing typically exceeds $0.10/min." That's xAI's number about a competitor, not OpenAI's own published rate, so treat it as directional, but the fact that OpenAI doesn't publish a comparable per-minute figure at all is itself telling.
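If you want your own per-minute figure for a token-billed model, the conversion is easy once you know how many audio tokens a minute of conversation consumes in each direction. The sketch below uses the $32 / $64 per 1M token prices quoted above. The token counts are arbitrary placeholders, not measurements and not OpenAI figures; replace them with numbers pulled from your own usage logs.

```python
from decimal import Decimal

# Audio token prices quoted above for gpt-realtime-2 (USD per token).
INPUT_PER_TOKEN = Decimal("32") / 1_000_000
OUTPUT_PER_TOKEN = Decimal("64") / 1_000_000


def per_minute_cost(input_tokens_per_min, output_tokens_per_min):
    """Blended USD cost per minute for given audio token throughput."""
    return (
        input_tokens_per_min * INPUT_PER_TOKEN
        + output_tokens_per_min * OUTPUT_PER_TOKEN
    )


# PLACEHOLDER inputs for illustration only: measure your own traffic.
estimate = per_minute_cost(input_tokens_per_min=1000, output_tokens_per_min=1000)
print(f"estimated cost: ${estimate:.3f}/min")
```

The result moves linearly with both inputs, and output tokens cost twice as much as input tokens, so an agent that talks more than it listens gets expensive faster. Measuring a sample of real calls is the only way to know where your workload lands.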

Developers who've actually run the comparison aren't just talking about latency. One of the more measured takes from the community, after Grok topped the Big Bench Audio leaderboard, was a reminder that price and speed are still moving targets:

"Good job on getting it to #1 in the benchmark but cost and speed needs work. Though I expect xAI to deliver on cost and speed soon enough."
vasilenko93, r/singularity

The gotchas to budget for

Three things the headline rate doesn't tell you, worth checking before you commit a budget line to this:

No free tier. xAI's pricing page lists paid rates only. You get one free phone number to test with and browser-based testing, but there's no usage allotment to prototype against before you're billed.
No published volume discount. The per-minute rate is identical at 200 calls and 50,000 calls a month, per the worked table above. If you're negotiating enterprise volume, that conversation happens off the public page.
Tool calls scale with agent behavior, not call count. Tool fees are charged per tool invocation ($5 per 1,000 web or X searches, $10 per 1,000 file attachment searches), not per phone call, so an agent configured to search aggressively on every turn racks up those fees regardless of how short the calls are. The real lever on your bill is how the agent is prompted and configured, not just how many minutes it talks.
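One way to see the third point is to compute what share of a single call's bill goes to tool fees as the agent searches more. This is a rough sketch using the $0.06/min voice-plus-telephony rate and the $5 per 1,000 web search price; the search counts passed in are hypothetical, chosen only to show the shape of the curve.

```python
from decimal import Decimal

PER_MIN = Decimal("0.06")                  # voice + telephony
WEB_SEARCH_EACH = Decimal("5") / 1000      # $5 per 1,000 searches


def tool_share(minutes, searches_per_call):
    """Fraction of one call's bill spent on web-search fees."""
    base = PER_MIN * minutes
    tools = WEB_SEARCH_EACH * searches_per_call
    return tools / (base + tools)


# Hypothetical agent behaviors: (call minutes, web searches per call).
for minutes, searches in ((5, 2), (1, 4), (1, 12)):
    share = tool_share(minutes, searches) * 100
    print(f"{minutes} min, {searches} searches: {share:.1f}% of the bill is tool fees")
```

On the 5-minute call with two searches, tool fees are a few percent of the bill. On a 1-minute call with twelve searches, they are half of it, which is why prompt and tool configuration deserve as much attention as call length.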

If your backlog is text, not phone calls

Everything above is about phone and voice minutes, which is a genuinely different budget line from the one most support teams are actually trying to shrink: the pile of email and chat tickets sitting in Zendesk or Freshdesk. That's the queue I build automation for at eesel, and the pricing comparison there isn't per-minute at all; it's per resolved ticket.

eesel AI pricing page showing usage-based, per-ticket pricing

eesel's pricing is $0.40 per resolved conversation, no per-seat fee, no platform minimum, with the first $50 of usage free to test against your own tickets. It plugs into your existing helpdesk in minutes, learns from tickets you've already solved, and runs a simulation against your ticket history before it ever answers a live customer, which is the same "don't let it act on a guess" problem that voice-agent builders like Grok's are still working through. Smava runs a fully automated agent on 100,000+ tickets a month at that rate, and Gridwise resolved 73% of tier-1 requests in its first month.

Match the spend to the channel: Grok Voice for a phone line, an AI helpdesk agent for the ticket queue. You can try eesel free before deciding either way.

Frequently Asked Questions

How much does Grok Voice Agent Builder cost per minute?

The underlying Grok Voice Agent API is billed at a flat $0.05 per minute of audio, with voices included. Add $0.01 per minute if you use a provisioned phone number, so a call over a provisioned number runs $0.06 per minute before any tool calls.

Is there a free tier for Grok Voice Agent Builder?

No. The xAI pricing page lists paid per-minute, per-token, and per-call rates only, with no free usage allotment. Each account does get one free phone number to test with, and you can test an agent in the browser before deploying it.

How does Grok's pricing compare to OpenAI Realtime API and ElevenLabs?

Grok's flat $0.05/min beats ElevenLabs Agents' effective $0.08/min across every subscription tier. OpenAI's gpt-realtime-2 has no published per-minute rate at all; it's billed per audio token ($32 input / $64 output per 1M tokens), which is why xAI itself estimates OpenAI's real-world cost at $0.10/min or more.

What else gets billed besides the per-minute voice rate?

Server-side tools bill separately: web and X search at $5 per 1,000 calls, document/RAG search at $2.50 per 1,000 calls, plus the underlying token cost of whichever Grok model reasons through the call. None of these show up in the $0.05 headline, so a chatty agent that searches constantly costs more than the sticker price.

Is Grok Voice Agent Builder cheaper than automating support with eesel?

They're not really the same purchase. Grok prices phone and voice minutes; eesel prices resolved email and chat tickets at $0.40 each with no per-seat fee. If your backlog is text-based rather than a phone queue, an AI helpdesk agent is the more directly comparable spend.