GPT-4o: token counter & pricing
OpenAI · exact (uses official tokenizer) · pricing as of 2026-05-31.
- Provider: OpenAI
- API model ID: gpt-4o-2024-08-06
- Context window: 128,000 tokens
- Input price: $2.50 per 1M tokens
- Output price: $10.00 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-05-31
Open the counter to count tokens for GPT-4o in real time.
What is GPT-4o?
GPT-4o ("o" for omni) is OpenAI's flagship general-purpose model, the workhorse for chat, coding, RAG, and most production AI workloads. It is faster and cheaper than the older GPT-4 Turbo, with comparable or better quality on most benchmarks.
How tokens are counted here
GPT-4o introduced the o200k_base tokenizer that the entire GPT-5 family later inherited. We run that exact encoding in your browser with js-tiktoken: your prompt is never uploaded anywhere, and the result is exact rather than estimated. One caveat specific to GPT-4o: this counter measures text tokens only. Images you send to GPT-4o are billed separately by the API based on resolution and detail level, so a vision-heavy workload will cost more than the text count alone suggests.
Pricing notes
OpenAI charges separately for input and output tokens, with output ~4× the input rate. The "Per call" column above assumes the input/output split you set with the slider. The default 80/20 split reflects typical chat workloads where the prompt and history are large but the model's reply is short.
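As a rough illustration of how an input/output split turns into a per-call price, here is a minimal sketch. It assumes the slider divides a total token count into input and output shares; the function name and that interpretation are assumptions made for the example, not a description of how the calculator is implemented.

```python
# Prices in USD per 1M tokens, as listed on this page for GPT-4o.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def per_call_cost(total_tokens: int, input_share: float = 0.80) -> float:
    """Cost in USD of one call, given total tokens and the input share (0 to 1).

    Hypothetical helper: assumes the split applies to the total token count.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# 10,000 tokens at the default 80/20 split: 8,000 input + 2,000 output.
print(f"${per_call_cost(10_000):.4f}")
# The same total at 50/50 costs more, because output is billed at the higher rate.
print(f"${per_call_cost(10_000, input_share=0.50):.4f}")
```

Under these assumptions the 80/20 default works out to a blended rate of $4.00 per 1M total tokens, so shifting the slider toward output raises the per-call figure quickly.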
When to use GPT-4o over GPT-4o mini
- Tasks requiring strong reasoning (multi-step, multi-constraint).
- Code generation where structure matters.
- Anywhere quality has measurably mattered in your A/B tests.
For high-volume classification, extraction, or short Q&A, GPT-4o mini is ~17× cheaper ($0.15 / $0.60 per 1M tokens), with a quality gap small enough to be invisible for most of those workloads.
What GPT-4o costs in production
A workload GPT-4o still handles well is multimodal document processing: invoices, scanned forms, contracts with embedded figures. Say each document averages 3,000 input tokens of extracted text and instructions, returns a 500-token structured result, and you process 200,000 documents per month. That is 600M input tokens and 100M output tokens.
- Input: 600M tokens at $2.50/M = $1,500
- Output: 100M tokens at $10.00/M = $1,000
- Monthly total: $2,500 (plus per-image vision charges, billed separately)
The same volume on GPT-4o mini ($0.15 / $0.60) is $90 + $60 = $150 per month, a 94% cut for documents simple enough that retrieval and OCR did the hard part. Stepping up to Claude Sonnet 4.6 ($3.00 / $15.00) costs $1,800 + $1,500 = $3,300 per month. Route by document difficulty and the blended bill lands well under either extreme.
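The monthly figures above, and the effect of routing by difficulty, can be reproduced with a short sketch. The prices are the ones quoted on this page; the dictionary keys are display labels rather than API model IDs, and the 70% "easy" share is a purely hypothetical number chosen for illustration.

```python
# (input, output) prices in USD per 1M tokens, as quoted on this page.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def monthly_cost(model: str, docs: int, in_tokens: int, out_tokens: int) -> float:
    """Monthly text-token cost in USD (per-image vision charges not included)."""
    in_price, out_price = PRICES[model]
    input_cost = docs * in_tokens * in_price / 1_000_000
    output_cost = docs * out_tokens * out_price / 1_000_000
    return input_cost + output_cost


def routed_cost(docs: int, easy_share: float, in_tokens: int, out_tokens: int) -> float:
    """Blended bill: easy documents go to GPT-4o mini, the rest to GPT-4o."""
    easy_docs = round(docs * easy_share)
    hard_docs = docs - easy_docs
    return monthly_cost("GPT-4o mini", easy_docs, in_tokens, out_tokens) + monthly_cost(
        "GPT-4o", hard_docs, in_tokens, out_tokens
    )


for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 200_000, 3_000, 500):,.2f}")

# Hypothetical routing: 70% of documents are simple enough for GPT-4o mini.
print(f"Routed 70/30: ${routed_cost(200_000, 0.70, 3_000, 500):,.2f}")
```

With that assumed 70/30 routing the blended bill comes to $855 per month, between the $150 and $2,500 single-model figures; the real share depends on how many of your documents the smaller model handles acceptably.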
Migrating from GPT-4 Turbo
Change the API model ID from gpt-4-turbo-2024-04-09 to gpt-4o-2024-08-06. Unlike most upgrades in this family, the tokenizer changes too: GPT-4 Turbo used cl100k_base, GPT-4o uses o200k_base, which produces roughly 5-10% fewer tokens for the same English text. Re-measure your prompt budgets rather than carrying old counts forward. The pricing move is large: $10.00 / $30.00 drops to $2.50 / $10.00 per 1M tokens, a 75% cut on input and about 67% on output. And once you are on o200k_base, counts do carry over to every GPT-5-era model, so this migration is the last re-measurement you should need.
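Before re-measuring, a planning range for the new token count and input cost can be sketched from the 5-10% figure above. This is an estimate only, not a substitute for counting with the actual tokenizer; the 10,000-token prompt and the helper names are hypothetical.

```python
# Input prices in USD per 1M tokens, as quoted on this page.
OLD_INPUT_PRICE_PER_M = 10.00  # GPT-4 Turbo
NEW_INPUT_PRICE_PER_M = 2.50   # GPT-4o


def estimated_o200k_range(cl100k_tokens: int) -> tuple[int, int]:
    """Planning range for the o200k_base count, assuming 5-10% fewer tokens."""
    low = round(cl100k_tokens * 0.90)   # 10% fewer
    high = round(cl100k_tokens * 0.95)  # 5% fewer
    return low, high


def input_cost(tokens: int, price_per_m: float) -> float:
    """Input cost in USD for a given token count and per-1M price."""
    return tokens * price_per_m / 1_000_000


old_tokens = 10_000  # hypothetical prompt, as measured with cl100k_base
low, high = estimated_o200k_range(old_tokens)
old_cost = input_cost(old_tokens, OLD_INPUT_PRICE_PER_M)
new_low = input_cost(low, NEW_INPUT_PRICE_PER_M)
new_high = input_cost(high, NEW_INPUT_PRICE_PER_M)

print(f"estimated o200k_base tokens: {low} to {high}")
print(f"input cost per call: ${old_cost:.4f} -> ${new_low:.4f} to ${new_high:.4f}")
```

For English text in that range, the price cut and the smaller token count compound, so the effective input saving lands slightly above the 75% headline figure.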
GPT-4o vs the obvious alternative
Claude Sonnet 4.6 is the perennial head-to-head: $3.00 / $15.00 against GPT-4o's $2.50 / $10.00, so GPT-4o is cheaper per token on both sides, while Sonnet tends to win on instruction-following nuance. The full breakdown lives at GPT-4o vs Claude Sonnet. If your real question is whether you need this tier at all, see GPT-4o vs GPT-4o mini; for the premium direction, Claude Opus vs GPT-4o.
Common questions
Is the o200k tokenizer the same as cl100k (used by GPT-4 Turbo)?
No. o200k_base has a larger vocabulary (~200,000 tokens vs ~100,000) and produces fewer tokens for the same English text, usually 5-10% fewer. That's a real cost difference. We use the right tokenizer per model automatically.
How much does a typical chat exchange cost?
A 1,000-token prompt with a 200-token reply on GPT-4o: $0.0025 input + $0.002 output = $0.0045 per call, or $4,500 per million calls. Use the calculator above with your actual prompt to get the real number.
Why are some prompts cheaper on Claude than GPT-4o?
Different per-token prices and different tokenizers. Claude Sonnet is $3/$15 per million; GPT-4o is $2.50/$10. At equal token counts GPT-4o is cheaper on both input and output, and the gap widens as the reply grows. A prompt only costs less on Claude when Claude's tokenizer produces enough fewer tokens for that particular text to offset the price difference. The calculator above shows the exact split for your prompt.
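To make the price-versus-tokenizer trade-off concrete, the sketch below computes how much smaller the other model's token count would have to be before it matched GPT-4o on cost. It uses only the per-1M prices quoted on this page; the function names are made up for the example, and it says nothing about how the two tokenizers actually compare on your text.

```python
# (input, output) prices in USD per 1M tokens, as quoted on this page.
GPT_4O = (2.50, 10.00)
CLAUDE_SONNET = (3.00, 15.00)


def break_even_token_ratio(cheaper_price: float, pricier_price: float) -> float:
    """Token-count ratio (pricier model / cheaper model) at which costs are equal."""
    return cheaper_price / pricier_price


def call_cost(in_tokens: int, out_tokens: int, prices: tuple[float, float]) -> float:
    """Cost in USD of one call at the given (input, output) per-1M prices."""
    in_price, out_price = prices
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000


input_ratio = break_even_token_ratio(GPT_4O[0], CLAUDE_SONNET[0])
output_ratio = break_even_token_ratio(GPT_4O[1], CLAUDE_SONNET[1])
print(f"input break-even: {input_ratio:.1%} of GPT-4o's token count")
print(f"output break-even: {output_ratio:.1%} of GPT-4o's token count")

# Same token counts on both models: 1,000 in, 200 out.
print(f"GPT-4o: ${call_cost(1_000, 200, GPT_4O):.4f}")
print(f"Claude Sonnet: ${call_cost(1_000, 200, CLAUDE_SONNET):.4f}")
```

Read the ratios as thresholds: at these prices, Sonnet's input side only matches GPT-4o if the same text comes out at roughly five-sixths of the token count, and its output side at roughly two-thirds.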
Compare GPT-4o to other models
- GPT-5.5 (OpenAI, $5.00/$30.00)
- GPT-5.5 Pro (OpenAI, $30.00/$180.00)
- GPT-5.4 (OpenAI, $2.50/$15.00)
- GPT-5.4 Mini (OpenAI, $0.75/$4.50)
- GPT-5.4 Nano (OpenAI, $0.20/$1.25)
- GPT-5.4 Pro (OpenAI, $30.00/$180.00)
- GPT-5.3 (OpenAI, $1.75/$14.00)
- GPT-5.2 (OpenAI, $1.75/$14.00)
- GPT-5.2 Pro (OpenAI, $21.00/$168.00)
- GPT-5.1 (OpenAI, $1.25/$10.00)
- GPT-5 (OpenAI, $1.25/$10.00)
- GPT-5 Mini (OpenAI, $0.25/$2.00)
- GPT-5 Nano (OpenAI, $0.05/$0.40)
- GPT-5 Pro (OpenAI, $15.00/$120.00)
- GPT-4.1 (OpenAI, $2.00/$8.00)
- GPT-4.1 Mini (OpenAI, $0.40/$1.60)
- GPT-4.1 Nano (OpenAI, $0.10/$0.40)
- o3 (OpenAI, $2.00/$8.00)
- o3-mini (OpenAI, $1.10/$4.40)
- o3-pro (OpenAI, $20.00/$80.00)
- o4-mini (OpenAI, $1.10/$4.40)
- GPT-4o mini (OpenAI, $0.15/$0.60)
- GPT-4 Turbo (OpenAI, $10.00/$30.00)
- Claude Sonnet 4.6 (Anthropic, $3.00/$15.00)
- Gemini 3.1 Pro (Google, $2.00/$12.00)
- Mistral Large (Mistral, $2.00/$6.00)