
What xAI actually sells
xAI is Elon Musk's AI research company, founded in 2023 and best known for the Grok model family. The company handles over 1 million API calls per day with sub-200ms latency on its self-described Colossus infrastructure. Community estimates put its standalone revenue at $500 million in 2025, growing toward $2 billion in 2026, though these are unaudited projections rather than reported figures.
The product has two modes: a consumer chatbot (Grok) and a developer API. The pricing for each is completely different, which is the first thing that trips people up.
Consumer plans: Free and SuperGrok
The consumer product lives at grok.com and on iOS and Android. Both free and paid tiers include the same core features: Grok chat, real-time web and X/Twitter integration, image generation, video generation (up to 15 seconds at 720p), voice conversations, and file analysis. The list is surprisingly long for a free tier.
What SuperGrok adds is headroom and capability. At approximately $30/month (also included with X Premium+), you get:
- Higher daily limits across all feature categories
- Priority access during peak hours when the free tier queues
- Multi-agent mode - the biggest differentiator. Multiple agents tackle sub-problems in parallel, each shows its reasoning chain, and results are merged into one cited answer
- Grok Build Beta for coding automations and plan-mode workflows
The free plan is generous enough for occasional use. The question is whether you hit the limits. If you're a developer testing the models, the consumer free tier runs dry quickly and the API is almost always the better path.

| Feature | Free | SuperGrok (~$30/mo) |
|---|---|---|
| Chat | Yes | Yes |
| Real-time web + X search | Yes | Yes |
| Image generation | Yes | Yes |
| Video generation (up to 15s, 720p) | Yes | Yes |
| Voice conversations | Yes | Yes |
| File and PDF analysis | Yes | Yes |
| Multi-agent mode | No | Yes |
| Higher daily limits | No | Yes |
| Priority access | No | Yes |
| Grok Build Beta | No | Yes |
xAI API pricing: the full breakdown
The xAI API is billed across five distinct categories: text/reasoning, images, video, voice, and tools. Each is metered separately. Understanding the full picture matters because a realistic workload that combines text responses with web search and file retrieval can cost two to three times more than the token price alone suggests.
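As a rough illustration of how tool charges change the picture, the sketch below prices a single hypothetical agent turn using the grok-4.3 token rates and the web_search rate quoted in the tables in this article. The token counts (2,000 input, 500 output) and the one-search-per-turn pattern are assumptions picked for the example, not measurements.

```python
# Rates quoted in this article (USD).
INPUT_PER_M = 1.25        # grok-4.3 input, per 1M tokens
OUTPUT_PER_M = 2.50       # grok-4.3 output, per 1M tokens
WEB_SEARCH_PER_1K = 5.00  # web_search, per 1,000 calls


def turn_cost(input_tokens: int, output_tokens: int, web_searches: int) -> tuple[float, float]:
    """Return (token_only_cost, total_cost) in USD for one request."""
    tokens = (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000
    tools = web_searches * WEB_SEARCH_PER_1K / 1_000
    return tokens, tokens + tools


# Hypothetical turn: 2,000 input tokens, 500 output tokens, one web search.
token_only, total = turn_cost(2_000, 500, 1)
print(f"tokens only: ${token_only:.5f}")
print(f"with search: ${total:.5f}")
print(f"ratio: {total / token_only:.2f}x")
```

Under these assumptions the single search costs more than the tokens do, and the total lands at a bit over twice the token-only figure. Longer prompts shrink that ratio; more tool calls per turn grow it.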

Chat and reasoning: grok-4.3 and grok-build-0.1
The two headline text models are grok-4.3 (the flagship, with reasoning) and grok-build-0.1 (the coding model, launched May 29, 2026). The table also lists three dated grok-4.20 variants, which are priced identically to grok-4.3.
| Model | Context | Input | Cached input | Output |
|---|---|---|---|---|
| grok-4.3 | 1M tokens | $1.25/1M | $0.20/1M | $2.50/1M |
| grok-4.20-multi-agent-0309 | 1M tokens | $1.25/1M | $0.20/1M | $2.50/1M |
| grok-4.20-0309-reasoning | 1M tokens | $1.25/1M | $0.20/1M | $2.50/1M |
| grok-4.20-0309-non-reasoning | 1M tokens | $1.25/1M | $0.20/1M | $2.50/1M |
| grok-build-0.1 | 256k tokens | $1.00/1M | $0.20/1M | $2.00/1M |
grok-4.3 supports a configurable reasoning_effort parameter so you can dial reasoning up or down without switching models. This matters for cost control - a community developer noted on r/singularity that even with reasoning_effort set to low, the model could still spike from 1,500 to 10,000 thinking tokens unexpectedly. Budget with some headroom.
grok-build-0.1 is specifically trained for agentic coding workflows and has a 256k context window - shorter than the flagship's 1M but still substantial. It's currently in early access.
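To turn the table into a per-workload number, a small calculator along these lines may help. It is a sketch using only the rates listed above; the function and class names are made up for illustration, and it assumes cached tokens are a subset of the input tokens you send, which is worth checking against your own usage reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """USD per 1M tokens, as listed in the pricing table."""
    input_per_m: float
    cached_per_m: float
    output_per_m: float


RATES = {
    "grok-4.3": TokenRates(1.25, 0.20, 2.50),
    "grok-build-0.1": TokenRates(1.00, 0.20, 2.00),
}


def token_cost(model: str, input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate token charges in USD.

    Assumption: cached_tokens is the part of input_tokens billed at the cached rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    r = RATES[model]
    uncached = input_tokens - cached_tokens
    total = (
        uncached * r.input_per_m
        + cached_tokens * r.cached_per_m
        + output_tokens * r.output_per_m
    )
    return total / 1_000_000


# 1M input tokens (400k of them cached) plus 200k output tokens on the flagship.
print(round(token_cost("grok-4.3", 1_000_000, 200_000, cached_tokens=400_000), 2))
```

The three grok-4.20 variants share the grok-4.3 rates, so they can be added to `RATES` with the same values if you use them.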

One thing the API docs mention clearly: requests that violate xAI usage guidelines are still charged. If a Responses API call gets caught for a violation before generation, you're billed $0.05 per request regardless. It's a small number but worth knowing if you're running high-volume automated workflows.
Batch API: the 20-50% discount
The xAI Batch API reduces token costs by 20% to 50% on text and reasoning models. The trade-off: responses arrive within 24 hours rather than in real time. OpenAI's Batch API offers a flat 50% discount by comparison; xAI's discount varies between 20% and 50% depending on load, so test it on your specific workload before budgeting for the top of the range.
If you're running batch inference, classification, or document processing pipelines where latency doesn't matter, this is the most direct way to cut costs. Image and video generation doesn't qualify for the batch discount and remains at standard rates.
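Because the discount is a range rather than a fixed number, it can be safer to budget with both ends of it. The helper below is a minimal sketch with a hypothetical function name; the 20% and 50% bounds come from the figures above.

```python
def batch_cost_range(standard_cost: float,
                     min_discount: float = 0.20,
                     max_discount: float = 0.50) -> tuple[float, float]:
    """Return (best_case, worst_case) batch cost in USD for a given standard token cost."""
    best_case = standard_cost * (1 - max_discount)
    worst_case = standard_cost * (1 - min_discount)
    return best_case, worst_case


# A job that would cost $1,000 in token charges at standard rates.
best, worst = batch_cost_range(1_000.00)
print(f"batch cost between ${best:,.2f} and ${worst:,.2f}")
```

Planning around the worst case (the 20% discount) avoids surprises if the job runs during a busy period.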
Grok Imagine: images and video
The Imagine API is metered by output size and quality tier.
| Model | Description | Input | Output |
|---|---|---|---|
| grok-imagine-image | Text/image → image, standard | $0.002/img | $0.02/img (1K or 2K) |
| grok-imagine-image-quality | Text/image → image, high quality | $0.01/img | $0.05/img (1K), $0.07/img (2K) |
| grok-imagine-video | Text/image/video → video | $0.01/sec + $0.002/img | $0.05/sec (480p), $0.07/sec (720p) |
| grok-imagine-video-1.5-preview | Image → video (preview) | $0.01/img | $0.08/sec (480p), $0.14/sec (720p) |
A standard 1K image via the basic Imagine model costs $0.02. A 15-second 720p video via the 1.5 preview model costs $2.10 in output charges plus the input. These are competitive rates compared to Midjourney and similar services, but they add up fast in production workflows where you generate at scale.
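The video arithmetic above can be reproduced with a short sketch. It covers only grok-imagine-video-1.5-preview, uses the rates from the table, and assumes one input image per clip unless told otherwise; the function name is hypothetical.

```python
# grok-imagine-video-1.5-preview rates from the table (USD).
PREVIEW_OUTPUT_PER_SEC = {"480p": 0.08, "720p": 0.14}
PREVIEW_INPUT_PER_IMAGE = 0.01


def preview_video_cost(seconds: float, resolution: str, input_images: int = 1) -> float:
    """Estimate the cost in USD of one image-to-video clip on the preview model."""
    output = seconds * PREVIEW_OUTPUT_PER_SEC[resolution]
    inputs = input_images * PREVIEW_INPUT_PER_IMAGE
    return output + inputs


one_clip = preview_video_cost(15, "720p")
print(f"one 15s 720p clip: ${one_clip:.2f}")
print(f"1,000 such clips:  ${one_clip * 1_000:,.2f}")
```

At a thousand clips the bill is already in the low thousands of dollars, which is the "adds up fast" effect in concrete terms.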

Grok Voice API
The Voice API is where xAI's pricing gets genuinely unusual - and where the product has a real claim. Artificial Analysis named the Grok Voice Agent the leading speech reasoning model as of early 2026, ahead of Google and Amazon's native audio models.
| Mode | Cost |
|---|---|
| Realtime agent | $3.00/hour |
| Realtime text input | $0.004/message |
| Text to speech (TTS) | $15.00/1M characters |
| Speech to text, REST | $0.10/hour |
| Speech to text, streaming | $0.20/hour |
The $3/hour realtime rate is how most applications will hit this line item. At that rate, 1,000 hours of voice conversations costs $3,000 - factor that in before building voice-first features. Compare it against OpenAI's Realtime API when finalizing your architecture choice.
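For a quick projection, the voice rates can be wrapped in a couple of helpers. This is a sketch with hypothetical names; it assumes realtime usage is billed in proportion to session length, which the table does not spell out.

```python
# Voice API rates from the table (USD).
REALTIME_PER_HOUR = 3.00
REALTIME_TEXT_PER_MESSAGE = 0.004
TTS_PER_M_CHARS = 15.00


def realtime_cost(hours: float, text_messages: int = 0) -> float:
    """Realtime agent cost: hourly charge plus any text-input messages."""
    return hours * REALTIME_PER_HOUR + text_messages * REALTIME_TEXT_PER_MESSAGE


def tts_cost(characters: int) -> float:
    """Text-to-speech cost for a given number of characters."""
    return characters / 1_000_000 * TTS_PER_M_CHARS


# Hypothetical month: 5,000 sessions averaging 6 minutes each.
monthly_hours = 5_000 * 6 / 60
print(f"{monthly_hours:.0f} hours of realtime voice: ${realtime_cost(monthly_hours):,.2f}")
print(f"1,000 hours of realtime voice: ${realtime_cost(1_000):,.2f}")
```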

Tool calls: the biggest hidden cost
This is where realistic xAI API bills diverge sharply from the token-only estimate. Server-side tools are billed per call, on top of token costs.
| Tool | API name | Cost |
|---|---|---|
| Web search | web_search | $5.00/1k calls |
| X search | x_search | $5.00/1k calls |
| Code execution | code_execution / code_interpreter | $5.00/1k calls |
| File attachments | attachment_search | $10.00/1k calls |
| Collections search (RAG) | collections_search / file_search | $2.50/1k calls |
| Image understanding | view_image | Token-based only |
If your agent runs a web search on every turn, that's $5 per 1,000 requests on top of whatever the tokens cost. A 10,000-request workload with one web search per request adds $50 to the bill before any token or storage charges. One community comment put the appeal this way: "Grok is magnitudes cheaper and bypasses X API data pulling limits." The X search integration is genuinely differentiated, but it's not free.
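A per-tool tally makes this line item easy to project. The sketch below uses the per-1k rates from the table and a hypothetical `tool_cost` helper; the call counts in the example are invented for illustration.

```python
# Server-side tool rates from the table, USD per 1,000 calls.
TOOL_RATE_PER_1K = {
    "web_search": 5.00,
    "x_search": 5.00,
    "code_execution": 5.00,
    "attachment_search": 10.00,
    "collections_search": 2.50,
}


def tool_cost(calls: dict[str, int]) -> float:
    """Sum tool charges in USD for a mapping of tool name -> number of calls."""
    return sum(TOOL_RATE_PER_1K[name] * count / 1_000 for name, count in calls.items())


# The example from the text: 10,000 requests, one web search each.
print(f"${tool_cost({'web_search': 10_000}):.2f}")

# A hypothetical agent mix: searches plus RAG lookups.
print(f"${tool_cost({'web_search': 10_000, 'x_search': 4_000, 'collections_search': 20_000}):.2f}")
```

An unknown tool name raises a `KeyError`, which is deliberate here: a silent zero would understate the bill.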
Storage
Files and collections (RAG) are billed per GiB per day.
| Resource | Rate |
|---|---|
| File storage | $0.025/GiB/day |
| Collection storage | $0.10/GiB/day |
| File downloads | $0.20/GiB |
| Collection downloads | $0.20/GiB |
Collections (vector search) cost four times more than raw file storage to maintain. If you're building a RAG pipeline on xAI, this is worth projecting out. 100 GiB of collection storage runs $10/day or $300/month in storage charges alone - before any query costs.
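The storage projection is simple enough to script. This sketch uses the per-GiB-per-day rates from the table and assumes a 30-day month, matching the $300 figure above; the function name is hypothetical.

```python
# Storage rates from the table, USD per GiB per day.
FILE_PER_GIB_DAY = 0.025
COLLECTION_PER_GIB_DAY = 0.10


def storage_cost(file_gib: float, collection_gib: float, days: int = 30) -> float:
    """Estimate storage charges in USD over a number of days (30-day month assumed)."""
    daily = file_gib * FILE_PER_GIB_DAY + collection_gib * COLLECTION_PER_GIB_DAY
    return daily * days


print(f"100 GiB collections, 1 day:   ${storage_cost(0, 100, days=1):.2f}")
print(f"100 GiB collections, 30 days: ${storage_cost(0, 100):.2f}")
print(f"100 GiB raw files, 30 days:   ${storage_cost(100, 0):.2f}")
```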
The pricing history: how we got here
xAI's pricing arc is worth knowing because it shapes how to read the current numbers.

Grok 4 launched in July 2025 at $3.00/1M input and $15.00/1M output - comparable to Claude Sonnet at the time, but pricier than commodity alternatives. xAI then introduced Grok 4 Fast at a dramatically cheaper $0.20/1M input and $0.50/1M output, which made the API genuinely competitive for price-sensitive workloads.
"Grok-4-1-fast-reasoning is $0.20 for input. grok-4.3 is $1.20. I'm only using Grok as a visual processor, so the cost increase won't hit that hard, but still - this feels like an economic decision as much as a 'force users onto the latest platform' decision."
-- u/slickriptide on r/MyBoyfriendIsAI
In May 2026, xAI retired eight models including Grok 3, Grok 4 Fast, grok-4-1-fast-reasoning, and grok-code-fast-1. The effect: users who had built on the budget fast models had to migrate to grok-4.3, which costs roughly 6x more per input token ($0.20 to $1.25). The current $1.25/$2.50 pricing is a 58% reduction on input, and about 83% on output, from the original Grok 4 launch price, but a sharp step up for anyone relying on the fast-tier alternatives that no longer exist.
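The percentages in this comparison can be checked directly from the prices quoted in this section. The snippet is only arithmetic on those numbers; `pct_change` is a helper written for this example.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means cheaper)."""
    return (new - old) / old * 100


# Grok 4 launch ($3.00 in / $15.00 out) vs grok-4.3 ($1.25 in / $2.50 out).
print(round(pct_change(3.00, 1.25), 1))   # input price change, in percent
print(round(pct_change(15.00, 2.50), 1))  # output price change, in percent

# Grok 4 Fast ($0.20 in / $0.50 out) vs grok-4.3, as a multiple.
print(round(1.25 / 0.20, 2))  # input multiple
print(round(2.50 / 0.50, 2))  # output multiple
```

The input side works out to a drop of about 58% against the launch price and a multiple of about 6x against the retired fast tier; the output side moves by different amounts, so which comparison matters depends on how output-heavy the workload is.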
The other significant change: xAI ended its "data-sharing program" in May 2025, which had provided $150/month in free API credits. There is no longer a free API tier for new developers.
Hidden costs to budget for
A few things that don't appear obviously in the headline token rates:
Reasoning token spikes. grok-4.3 uses a configurable reasoning effort parameter, but "low" effort doesn't guarantee low token counts. One developer on r/singularity reported the model jumping from an average of 1,500 thinking tokens to 10,000 on the same prompt without explanation. If you're setting cost caps, build in at least 3-4x headroom over your expected reasoning token baseline, and note that the reported spike was closer to 7x, so a hard per-request cap is still worth having.
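To see what a headroom multiplier buys, the sketch below prices the reasoning portion of one request at the baseline, at 4x headroom, and at the reported spike. It assumes thinking tokens are billed at the grok-4.3 output rate, which this article does not confirm, and the helper names are hypothetical.

```python
# Assumption: thinking tokens are billed at the grok-4.3 output rate (USD per 1M tokens).
OUTPUT_PER_M = 2.50


def reasoning_cost(thinking_tokens: int) -> float:
    """Cost in USD of the thinking tokens for one request."""
    return thinking_tokens * OUTPUT_PER_M / 1_000_000


def covers(baseline_tokens: int, headroom: float, observed_tokens: int) -> bool:
    """True if a budget of baseline * headroom tokens absorbs the observed count."""
    return observed_tokens <= baseline_tokens * headroom


baseline, spike = 1_500, 10_000
print(f"baseline:    ${reasoning_cost(baseline):.5f} per request")
print(f"4x headroom: ${reasoning_cost(baseline * 4):.5f} per request")
print(f"spike:       ${reasoning_cost(spike):.5f} per request")
print("4x covers the spike:", covers(baseline, 4, spike))
print("7x covers the spike:", covers(baseline, 7, spike))
```

With these numbers a 4x budget does not absorb a jump from 1,500 to 10,000 tokens, while 7x does, so the multiplier is best treated as a planning figure alongside a hard cap rather than a guarantee.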
Model retirement windows. Eight models were retired with short notice in May 2026. If you pin to a versioned model name (e.g. grok-4-0709) you get stability until the retirement date; if you use the alias (grok-4.3) you auto-migrate but may hit unexpected capability changes. Neither is obviously safer - one gives you certainty of sunset, the other gives you continuity until it doesn't.
Tool calls on every turn. The API is OpenAI-compatible, which makes migration easy, but OpenAI's tools aren't billed the same way. web_search at $5/1k calls is a new line item for teams switching from a tool-calling setup that didn't previously charge per call. Audit your average tool calls per session before projecting total cost.
Usage guideline violations. A $0.05 fee applies to any request caught violating usage guidelines before generation. For most legitimate workloads this is never triggered, but for content moderation pipelines or adversarial testing, it's worth noting.
Who should pay for what
Free Grok: Works well for individuals using Grok as a research tool, writing assistant, or casual question-answering. The multi-modal feature set is generous at zero cost. You'll hit the daily limits if you use it heavily.
SuperGrok (~$30/mo): Worth it if you rely on multi-agent mode for research tasks, hit the free tier's daily limits regularly, or want Grok Build for coding automations. At $30 flat it's comparable to Claude Pro and ChatGPT Plus pricing.
xAI API: The right choice for developers building applications. The token prices are competitive, the OpenAI SDK compatibility means low migration friction, and the X search integration is genuinely unique if you're building anything that needs real-time social data. Watch the tool and storage charges; they're where the bill actually lives for agent-based workloads.
Batch API: The obvious choice for any async workload - document processing, bulk classification, offline evaluation. The 20-50% discount is meaningful at volume.
Azure/Oracle/Google Cloud: If your infrastructure is already in one of these clouds and you want Grok without a separate vendor relationship, the cloud marketplace routes work. Pricing varies by provider and is worth comparing against xAI's direct rates before committing.
For a wider comparison of what different API providers charge for frontier-class models right now, the OpenAI models list and Qwen pricing pages are worth reading alongside this.
Try eesel
If you're evaluating AI APIs to power your support or knowledge workflows, eesel is worth looking at alongside the raw model pricing. eesel deploys autonomous AI agents directly inside your existing tools - Zendesk, Slack, Freshdesk, Shopify, and 100+ others - without requiring you to wire together LLM APIs, tool calls, and storage billing yourself. Pricing is task-based: $0.40 per regular task (ticket, chat reply) with a free $50 credit to start. There's no platform fee on self-serve, no seat costs, and agents pause at your spend cap. For teams that want AI resolution without the per-token accounting, eesel's pricing is a cleaner comparison to SuperGrok than the raw API rates are.

Article by
Kira
A Computer Science student deeply passionate about UI/UX Design and Web Development, with a knack for writing. Fusing technical expertise with a creative flair, I'm driven to craft innovative and user-centric solutions, leveraging both coding proficiency and design sensibilities to create seamless, impactful experiences.








