Gemini 2.5 Pro: token counter & pricing
Google · exact (uses official tokenizer) · pricing as of 2026-05-31.
- Provider: Google
- API model ID: gemini-2.5-pro
- Context window: 2,000,000 tokens
- Input price: $1.25 per 1M tokens
- Output price: $10.00 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-05-31
Open the counter to count tokens for Gemini 2.5 Pro in real time.
What is Gemini 2.5 Pro?
Gemini 2.5 Pro is Google's flagship multimodal model. Its standout feature is the 2-million-token context window, by far the largest in the frontier-model class.
How tokens are counted here
For Gemini 2.5 Pro we call Google's official models.countTokens endpoint through a serverless proxy, so the number you see is the number Google bills against. Counts are exact. That precision matters more here than on most models: Gemini 2.5 Pro's pricing steps up at exactly 200,000 input tokens ($1.25 to $2.50 per million), so an estimate that runs a couple of percent hot or cold can put a borderline prompt on the wrong side of a real billing threshold.
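To make the threshold point concrete, here is a minimal Python sketch of how a small estimation error can flip the billing tier. The function name and the 198,000-token example are made up for illustration, and the sketch assumes a prompt of exactly 200,000 tokens still bills at the lower rate; the two rates are the ones quoted on this page.

```python
TIER_THRESHOLD = 200_000  # input tokens; boundary taken from this page


def input_rate_per_million(input_tokens: int) -> float:
    """Return the input price in USD per 1M tokens for a prompt of this size."""
    # Assumption: exactly 200,000 tokens is still in the lower tier.
    return 1.25 if input_tokens <= TIER_THRESHOLD else 2.50


exact_count = 198_000                     # hypothetical exact count for a prompt
hot_estimate = round(exact_count * 1.02)  # a local estimate that runs 2% high

print(exact_count, input_rate_per_million(exact_count))
print(hot_estimate, input_rate_per_million(hot_estimate))
```

The exact count stays under the threshold while the inflated estimate crosses it, so a budget built on the estimate would price the whole prompt at double the real input rate.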
When the 2M context window matters
Most prompts don't need it. The contexts where it does:
- Loading an entire codebase (or a substantial subset) into a single prompt for refactoring or audit.
- Long-document Q&A without chunking and retrieval.
- Multi-document synthesis where retrieval would lose cross-document relationships.
For everything else, Gemini 2.5 Flash is roughly 4× cheaper on input ($0.30 vs $1.25 per million) and matches Pro on most short-context tasks.
Pricing notes
Google publishes a single input rate for prompts of up to 200k tokens. Above 200k the rates rise to $2.50 per million input and $15 per million output. Verify on Google's pricing page if you regularly work with very long contexts; this calculator assumes the ≤200k tier.
What Gemini 2.5 Pro costs in production
Take a contract-review pipeline that leans on the 2M window: 50 filings a day, each loaded whole at roughly 400,000 input tokens, returning a 2,000-token structured summary. Past 200k input, Google bills $2.50 per million in and $15 per million out, so each document costs 0.4M × $2.50 = $1.00 on input plus 0.002M × $15 = $0.03 on output, about $1.03. Across 22 working days that is 1,100 documents and roughly $1,133 per month.
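The arithmetic above can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch assumes the whole request is billed at the higher tier once input exceeds 200,000 tokens, which is how the worked example treats it.

```python
THRESHOLD = 200_000  # input tokens


def document_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, using the two tiers quoted on this page."""
    if input_tokens > THRESHOLD:
        in_rate, out_rate = 2.50, 15.00   # USD per 1M tokens, >200k tier
    else:
        in_rate, out_rate = 1.25, 10.00   # USD per 1M tokens, <=200k tier
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate


per_doc = document_cost(400_000, 2_000)
docs_per_month = 50 * 22                  # 50 filings a day, 22 working days
monthly = per_doc * docs_per_month

print(f"${per_doc:.2f} per document, ${monthly:,.0f} per month")
```

Swapping in your own token counts and volume gives a first-pass budget; it does not model caching discounts or retries.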
Gemini 3 Flash would run the same job for about $0.21 per document ($227 per month) if its quality clears your bar, though its window caps at 1M. Claude Sonnet 4.6 at $3/$15 would cost about $1.23 per document ($1,353 per month) and cannot fit 400k tokens in its 200k window without chunking. Run a representative document through the counter above before committing either way.
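A rough way to line up these alternatives is to apply each model's per-million rates to the same 400,000-in / 2,000-out document. The Python sketch below uses only the rates quoted on this page and treats them as flat, which is an assumption; it also ignores any overhead from chunking a document that does not fit a model's window.

```python
INPUT_TOKENS = 400_000
OUTPUT_TOKENS = 2_000
DOCS_PER_MONTH = 50 * 22

# (input, output) in USD per 1M tokens, as quoted on this page.
RATES = {
    "Gemini 2.5 Pro (>200k tier)": (2.50, 15.00),
    "Gemini 3 Flash": (0.50, 3.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def per_document(in_rate: float, out_rate: float) -> float:
    """Cost in USD of one document at the given per-million rates."""
    return INPUT_TOKENS / 1e6 * in_rate + OUTPUT_TOKENS / 1e6 * out_rate


for name, (in_rate, out_rate) in RATES.items():
    cost = per_document(in_rate, out_rate)
    print(f"{name}: ${cost:.2f}/doc, ${cost * DOCS_PER_MONTH:,.0f}/month")
```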
Migrating from Gemini 1.5 Pro
Gemini 2.5 Pro is the intended landing spot for 1.5 Pro workloads. Point your code at the model ID gemini-2.5-pro and most calls work unchanged: same SDK, same multimodal inputs, same countTokens endpoint. Three things to re-check. The context window doubled from 1M to 2M, so chunking logic built around the old limit can relax. The >200k pricing tier means budget alerts tuned to a single flat rate will under-predict long-context jobs. And because this page queries the live API, the counts you see reflect 2.5 Pro's tokenizer, not numbers carried over from 1.5.
Gemini 2.5 Pro vs the obvious alternative
The in-family question is whether to move up to Gemini 3.1 Pro at $2 input / $12 output. You pay 60% more on input for newer reasoning, and you give up the 2M window (3.1 Pro tops out at 1M). If your workload exists because of that window, stay put; if it is ordinary Pro-tier reasoning, the newer model deserves an eval. Full comparison: Gemini 2.5 Pro vs Gemini 3.1 Pro. Weighing the cheap sibling instead? See Gemini 2.5 Flash vs Pro.
Common questions
How does Gemini's tokenizer compare to GPT or Claude?
Gemini tends to produce slightly fewer tokens for the same English text than GPT-4o. The difference is small (single-digit percent) for typical text but can be larger for code or non-English content. The calculator above shows the actual count for your input.
Is the countTokens endpoint free?
Yes. Google's countTokens endpoint is free to use, separate from generation costs. Our proxy adds caching so we don't burn quota on identical inputs.
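For readers who want to call the endpoint themselves, the following Python sketch shows one way to put a cache in front of countTokens. It is an illustration, not a description of how this page's proxy is built: the class name, the in-memory dictionary, and the GEMINI_API_KEY environment variable name are all assumptions. The request shape follows Google's public REST API for models.countTokens.

```python
import hashlib
import json
import os
import urllib.request
from typing import Callable, Dict

COUNT_TOKENS_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.5-pro:countTokens"
)


def fetch_count(text: str) -> int:
    """Ask Google's countTokens endpoint for the token count of `text`."""
    body = json.dumps({"contents": [{"parts": [{"text": text}]}]}).encode("utf-8")
    request = urllib.request.Request(
        COUNT_TOKENS_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed environment variable name for the API key.
            "x-goog-api-key": os.environ["GEMINI_API_KEY"],
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["totalTokens"]


class CachedCounter:
    """Hypothetical wrapper that only hits the endpoint for unseen inputs."""

    def __init__(self, fetch: Callable[[str], int] = fetch_count) -> None:
        self._fetch = fetch
        self._cache: Dict[str, int] = {}

    def count(self, text: str) -> int:
        # Key on a hash so large prompts are not stored twice in memory.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._fetch(text)
        return self._cache[key]
```

Passing the fetch function in as a parameter keeps the cache testable without network access. The dictionary here grows without bound, so a real deployment would likely want an eviction policy.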
How does Gemini 2.5 Pro compare to Claude Opus on price?
Gemini 2.5 Pro: $1.25 input / $10 output per million at the ≤200k tier. Claude Opus 4.8: $5 input / $25 output. Gemini is 4× cheaper on input and 2.5× cheaper on output, which makes it attractive for long-context workloads. Opus still wins on certain reasoning benchmarks; choose by your task, not by the marketing.
Compare Gemini 2.5 Pro to other models
- Gemini 3.1 Pro (Google, $2.00/$12.00)
- Gemini 3 Flash (Google, $0.50/$3.00)
- Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)
- Gemini 2.5 Flash (Google, $0.30/$2.50)
- Gemini 2.5 Flash-Lite (Google, $0.10/$0.40)
- GPT-5.1 (OpenAI, $1.25/$10.00)
- GPT-5 (OpenAI, $1.25/$10.00)
- o3-mini (OpenAI, $1.10/$4.40)