GPT-4o mini: token counter & pricing
OpenAI · exact (uses official tokenizer) · pricing as of 2026-05-31.
- Provider: OpenAI
- API model ID: gpt-4o-mini
- Context window: 128,000 tokens
- Input price: $0.15 per 1M tokens
- Output price: $0.60 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-05-31
Open the counter to count tokens for GPT-4o mini in real time.
What is GPT-4o mini?
GPT-4o mini is OpenAI's small, fast, cheap model, designed for high-volume workloads where GPT-4o is overkill. It is roughly 17× cheaper than GPT-4o on both input ($0.15 vs $2.50 per 1M tokens) and output ($0.60 vs $10.00 per 1M tokens), while keeping the same o200k_base tokenizer and 128k context window.
How tokens are counted here
Token counts on this page come from o200k_base, the identical tokenizer full GPT-4o uses, computed locally with js-tiktoken. Your text stays in the browser and the count is exact, not an estimate. Because mini and GPT-4o tokenize identically, comparing the two is pure rate arithmetic: same count, different price per token, which is exactly what happens when you flip between them in the calculator above. The 17x price gap you see is the whole story; there is no hidden tokenization difference adding or removing cost.
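A minimal TypeScript sketch of that rate arithmetic. The rates are the ones listed on this page; the token counts are hypothetical stand-ins for whatever an o200k_base tokenizer reports for your text, and `requestCost` is an illustrative helper, not part of any library.

```typescript
// USD per 1M tokens, as listed on this page.
const RATES = {
  "gpt-4o": { inputPerM: 2.5, outputPerM: 10.0 },
  "gpt-4o-mini": { inputPerM: 0.15, outputPerM: 0.6 },
};
type Model = keyof typeof RATES;

// Cost in USD of one request for a given model and token counts.
function requestCost(model: Model, inputTokens: number, outputTokens: number): number {
  const rate = RATES[model];
  return (inputTokens * rate.inputPerM + outputTokens * rate.outputPerM) / 1e6;
}

// Hypothetical counts. Both models share o200k_base, so the same
// counts apply to both and only the rate changes.
const inputTokens = 1200;
const outputTokens = 300;

const fullCost = requestCost("gpt-4o", inputTokens, outputTokens);
const miniCost = requestCost("gpt-4o-mini", inputTokens, outputTokens);
const priceRatio = fullCost / miniCost;

console.log(
  `GPT-4o: $${fullCost.toFixed(6)}, mini: $${miniCost.toFixed(6)}, ratio: ${priceRatio.toFixed(1)}x`
);
```

Because the input and output ratios are identical (2.50/0.15 and 10.00/0.60), the ratio stays near 16.7 whatever mix of input and output tokens you plug in.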
When to use GPT-4o mini
- High-volume classification and extraction. Accuracy is usually close to GPT-4o's on most labeling tasks.
- Short chat replies where the model isn't doing heavy reasoning.
- First-pass filtering in pipelines that escalate harder cases to GPT-4o or Claude Opus.
- RAG over routine documents where the retrieval did the heavy lifting.
When not to use it: anything requiring careful multi-step reasoning, structured planning, or nuanced instruction-following on subtle constraints. The quality gap to GPT-4o is real on these.
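One way to turn these guidelines into a routing rule, sketched in TypeScript. Everything here is an assumption for illustration: the `Task` shape, the task kinds, the `firstPassConfidence` field, and the 0.8 threshold are hypothetical and not part of any OpenAI API.

```typescript
// Hypothetical task descriptor; field names are illustrative only.
interface Task {
  kind:
    | "classification"
    | "extraction"
    | "short_chat"
    | "rag"
    | "planning"
    | "multi_step_reasoning";
  // Optional 0..1 score from a hypothetical first-pass check on mini's answer.
  firstPassConfidence?: number;
}

// Assumed threshold; tune it against your own evaluation data.
const ESCALATE_BELOW = 0.8;

function pickModel(task: Task): "gpt-4o-mini" | "gpt-4o" {
  // Reasoning-heavy work goes straight to the larger model.
  if (task.kind === "planning" || task.kind === "multi_step_reasoning") {
    return "gpt-4o";
  }
  // Routine work stays on mini unless the first pass reported low confidence.
  if (task.firstPassConfidence !== undefined && task.firstPassConfidence < ESCALATE_BELOW) {
    return "gpt-4o";
  }
  return "gpt-4o-mini";
}

console.log(pickModel({ kind: "classification" }));
console.log(pickModel({ kind: "rag", firstPassConfidence: 0.55 }));
console.log(pickModel({ kind: "planning" }));
```

The escalation target is GPT-4o here only to keep the sketch short; the same structure applies if the harder cases go to a different provider.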
Pricing notes
At $0.15/$0.60 per million, GPT-4o mini is one of the cheapest frontier-tier models. Direct competitors:
- Claude Haiku 4.5 ($1.00/$5.00), better instruction-following, much more expensive.
- Gemini 2.5 Flash ($0.30/$2.50), pricier per token than GPT-4o mini but with a 1M-token context.
- Llama 3.1 8B (~$0.18/$0.18), comparable price, hosted via Together/Groq, lower quality on most benchmarks.
For most price-sensitive workloads, the choice is GPT-4o mini vs Gemini Flash. Test both; the winner depends entirely on your prompt distribution.
What GPT-4o mini costs in production
The textbook mini deployment is a customer support chatbot. Each message carries about 2,000 input tokens (system prompt, conversation history, a retrieved help-center snippet) and gets a 300-token reply. At 500,000 messages per month, that is 500,000 × 2,000 = 1,000M (1 billion) input tokens and 500,000 × 300 = 150M output tokens.
- Input: 1,000M tokens at $0.15/M = $150
- Output: 150M tokens at $0.60/M = $90
- Monthly total: $240
Squeezing further, Gemini 2.5 Flash-Lite ($0.10 / $0.40) handles the same traffic for $100 + $60 = $160 per month, a third less. Going upmarket, Claude Haiku 4.5 ($1.00 / $5.00) costs $1,000 + $750 = $1,750 per month for identical volume, about 7x. For a support bot where most questions are routine, that spread usually funds a human escalation path instead.
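The same arithmetic as a small TypeScript sketch, so you can substitute your own traffic profile. The prices and the 500,000-message workload are the ones quoted above; `monthlyCost` and the variable names are illustrative.

```typescript
// USD per 1M tokens, as quoted in this section.
type Pricing = { input: number; output: number };

const PRICING = {
  "GPT-4o mini": { input: 0.15, output: 0.6 },
  "Gemini 2.5 Flash-Lite": { input: 0.1, output: 0.4 },
  "Claude Haiku 4.5": { input: 1.0, output: 5.0 },
};

// Monthly bill in USD for a fixed per-message token profile.
function monthlyCost(
  p: Pricing,
  messages: number,
  inputPerMessage: number,
  outputPerMessage: number
): number {
  const inputMillions = (messages * inputPerMessage) / 1e6; // input volume in millions of tokens
  const outputMillions = (messages * outputPerMessage) / 1e6; // output volume in millions of tokens
  return inputMillions * p.input + outputMillions * p.output;
}

// Support-bot profile from the text: 500,000 messages, 2,000 tokens in, 300 tokens out.
const MESSAGES = 500000;
const miniUsd = monthlyCost(PRICING["GPT-4o mini"], MESSAGES, 2000, 300);
const flashLiteUsd = monthlyCost(PRICING["Gemini 2.5 Flash-Lite"], MESSAGES, 2000, 300);
const haikuUsd = monthlyCost(PRICING["Claude Haiku 4.5"], MESSAGES, 2000, 300);

console.log(`GPT-4o mini: $${miniUsd.toFixed(2)}`);
console.log(`Gemini 2.5 Flash-Lite: $${flashLiteUsd.toFixed(2)}`);
console.log(`Claude Haiku 4.5: $${haikuUsd.toFixed(2)}`);
console.log(`Haiku vs mini: ${(haikuUsd / miniUsd).toFixed(1)}x`);
```

Changing the message count scales every bill linearly, but changing the input-to-output mix shifts the ratios between models, since each provider prices output at a different multiple of input.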
Migrating from GPT-3.5 Turbo
OpenAI deprecated the gpt-3.5-turbo line in favor of mini, so this migration is often forced rather than chosen. Update the API model ID from gpt-3.5-turbo to gpt-4o-mini. The tokenizer changes from cl100k_base to o200k_base, which yields roughly 5-10% fewer tokens for the same English text, so re-measure prompt budgets instead of porting old counts. The good news: o200k_base is the tokenizer for everything OpenAI ships now, from GPT-4o through the GPT-5 family, so counts measured here carry forward to any future upgrade. Capability rises across the board; spot-check your edge cases anyway.
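A minimal TypeScript sketch of the two migration steps: swapping the model ID and re-checking a prompt budget against the 128,000-token window. The `ChatRequest` type models only the fields used here, and `migrateRequest` and `fitsContext` are hypothetical helpers. The token count passed to `fitsContext` is assumed to be freshly measured with o200k_base.

```typescript
// Minimal request shape; only the fields this sketch touches.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

const CONTEXT_WINDOW = 128000; // GPT-4o mini context window, per this page

// Returns a copy of the request with the legacy model ID replaced.
// Only the exact ID "gpt-3.5-turbo" is mapped; extend this if you pin other variants.
function migrateRequest(req: ChatRequest): ChatRequest {
  const model = req.model === "gpt-3.5-turbo" ? "gpt-4o-mini" : req.model;
  return { ...req, model };
}

// promptTokens must be re-measured with o200k_base, not carried over
// from an old cl100k_base count.
function fitsContext(promptTokens: number, maxOutputTokens: number): boolean {
  return promptTokens + maxOutputTokens <= CONTEXT_WINDOW;
}

const legacy: ChatRequest = {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: "Where is my order?" }],
};

const migrated = migrateRequest(legacy);
console.log(migrated.model);
console.log(fitsContext(2000, 300));
```

The helper returns a new object rather than mutating the input, which keeps the original request available for side-by-side comparison while you spot-check edge cases.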
GPT-4o mini vs the obvious alternative
DeepSeek V3 ($0.27 / $1.10 via DeepSeek's own API) is the open-weight model most often benchmarked against mini: pricier per token than mini's $0.15 / $0.60, but strong on reasoning for the tier. The numbers and tradeoffs are laid out at DeepSeek V3 vs GPT-4o mini. Inside OpenAI's lineup, GPT-4o vs GPT-4o mini covers when the 17x premium is worth paying, and Claude Sonnet vs GPT-4o mini covers the cross-provider quality jump.
Common questions
Is GPT-4o mini just GPT-3.5 with new branding?
No. It's a distinct model with substantially better benchmark scores than the old gpt-3.5-turbo line, at roughly 30% of the price. OpenAI deprecated gpt-3.5-turbo in favor of mini.
Does mini support function calling and structured outputs?
Yes. It supports the same OpenAI features as GPT-4o: function calling, JSON mode, and structured outputs with a schema. The capability surface is the same; the only difference is reasoning quality on hard prompts.
What's a typical cost for a chat exchange?
A 500-token prompt with a 100-token reply: $0.000075 input + $0.00006 output = $0.000135 per call, or $135 per million calls. Use the calculator above with your real prompt for an accurate number.
Compare GPT-4o mini to other models
- GPT-5.5 (OpenAI, $5.00/$30.00)
- GPT-5.5 Pro (OpenAI, $30.00/$180.00)
- GPT-5.4 (OpenAI, $2.50/$15.00)
- GPT-5.4 Mini (OpenAI, $0.75/$4.50)
- GPT-5.4 Nano (OpenAI, $0.20/$1.25)
- GPT-5.4 Pro (OpenAI, $30.00/$180.00)
- GPT-5.3 (OpenAI, $1.75/$14.00)
- GPT-5.2 (OpenAI, $1.75/$14.00)
- GPT-5.2 Pro (OpenAI, $21.00/$168.00)
- GPT-5.1 (OpenAI, $1.25/$10.00)
- GPT-5 (OpenAI, $1.25/$10.00)
- GPT-5 Mini (OpenAI, $0.25/$2.00)
- GPT-5 Nano (OpenAI, $0.05/$0.40)
- GPT-5 Pro (OpenAI, $15.00/$120.00)
- GPT-4.1 (OpenAI, $2.00/$8.00)
- GPT-4.1 Mini (OpenAI, $0.40/$1.60)
- GPT-4.1 Nano (OpenAI, $0.10/$0.40)
- o3 (OpenAI, $2.00/$8.00)
- o3-mini (OpenAI, $1.10/$4.40)
- o3-pro (OpenAI, $20.00/$80.00)
- o4-mini (OpenAI, $1.10/$4.40)
- GPT-4o (OpenAI, $2.50/$10.00)
- GPT-4 Turbo (OpenAI, $10.00/$30.00)
- Llama 3.1 8B (Meta, $0.18/$0.18)
- Gemini 2.5 Flash-Lite (Google, $0.10/$0.40)
- Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)