Gemini 3 Flash: token counter & pricing
Google · exact (uses official tokenizer) · pricing as of 2026-05-31.
- Provider
- API model ID
gemini-3-flash-preview- Context window
- 1,000,000 tokens
- Input price
- $0.50 per 1M tokens
- Output price
- $3.00 per 1M tokens
- Tokenizer accuracy
- exact (uses official tokenizer)
- Pricing as of
- 2026-05-31
Open the counter to count tokens for Gemini 3 Flash in real time.
What is Gemini 3 Flash?
Gemini 3 Flash is Google's current Flash-tier model, released April 22, 2026. Combines Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost profile. Per Google's own benchmarks, it outperforms Gemini 2.5 Pro at a fraction of the cost while running 3× faster.
Currently in Preview. Pricing and behavior may change before GA.
How tokens are counted here
Token counts for Gemini 3 Flash come from Google's models.countTokens endpoint, proxied through our serverless layer, and are exact. That is worth knowing for a Preview model specifically: countTokens answers for the live model build, so if Google adjusts tokenization between Preview and GA, this counter tracks the change automatically instead of serving numbers from a frozen local copy.
Pricing notes
$0.50 input / $3.00 output per 1M tokens. Cached input $0.05/M.
That's substantially more expensive than Gemini 2.5 Flash-Lite ($0.10/$0.40) and Gemini 2.5 Flash ($0.30/$2.50), but Google positions Gemini 3 Flash as a Pro-tier replacement at Flash-tier prices. The math: if Gemini 3 Flash genuinely matches Gemini 2.5 Pro quality, paying $0.50 input vs $1.25 input on Pro is a real win for workloads that needed that capability tier.
When to use Gemini 3 Flash
- Workloads currently on Gemini 2.5 Pro, substantial cost reduction with comparable quality.
- Multimodal Pro-tier reasoning at scale, image, video, audio understanding with strong reasoning.
- High-volume agentic workflows that need stronger reasoning than Flash-Lite can provide.
When not to use it:
- Cheap classification / extraction, Gemini 2.5 Flash-Lite at $0.10/$0.40 is 5× cheaper and still exact-tokenizer.
- Production-stable workloads. Preview status means pricing or behavior could shift.
What Gemini 3 Flash costs in production
Price a consumer chat product handling 30,000 messages a day, each carrying about 2,000 input tokens (system prompt plus conversation history) and producing a 350-token reply. That is 900,000 messages a month: 1.8B input tokens × $0.50 = $900, and 315M output tokens × $3.00 = $945, roughly $1,845 a month.
GPT-5 Mini at $0.25/$2.00 would run about $1,080 ($450 + $630) on the same traffic; Claude Haiku 4.5 at $1/$5 about $3,375 ($1,800 + $1,575). Chat history is also the textbook caching case: each turn resends the conversation so far, and Gemini 3 Flash's $0.05/M cached-input rate means that repeated prefix can cut the input line by close to 90%. If your product is chat-shaped, model the cached rate before comparing vendors on list price alone.
Migrating from Gemini 2.5 Flash
Switching from Gemini 2.5 Flash means changing the model string to apiId gemini-3-flash-preview and accepting two trade-offs. Price goes up: $0.50/$3.00 against 2.5 Flash's $0.30/$2.50, a 67% input increase you should justify with eval results, not benchmarks in a press release. Stability goes down: Preview models can change behavior or pricing before GA, so pin regression tests around anything load-bearing. Context stays at 1M and the SDK surface is identical. Counts on this page remain exact through the switch, since we call countTokens against whichever model you select.
Gemini 3 Flash vs the obvious alternative
Gemini 2.5 Flash is the alternative most teams should price first: $0.30/$2.50 against Gemini 3 Flash's $0.50/$3.00. If 2.5 Flash already clears your quality bar, the 67% input premium buys you nothing. If you were reaching for Gemini 2.5 Pro ($1.25/$10) for reasoning quality, Gemini 3 Flash is the cheaper route to that tier. For context on where the 2.5 line draws its quality boundaries, see Gemini 2.5 Flash vs Pro.
Common questions
How does Gemini 3 Flash compare to GPT-5.4 Mini?
Gemini 3 Flash: $0.50/$3.00. GPT-5.4 Mini: $0.75/$4.50. Flash is ~33% cheaper on input and 33% cheaper on output. Pick by your evals. Gemini's multimodal handling is broader; OpenAI's tool-use reliability is more mature.
What's the context window?
1,000,000 tokens (1M), same as Gemini 2.5 Flash. Larger than the GPT-5 family's 400K and Anthropic's 200K. Smaller than Gemini 2.5 Pro's 2M.
Will Gemini 3 Flash leave Preview?
Google hasn't published a GA date. Track the Gemini API release notes, once it leaves Preview, the pricing in this calculator will reflect any changes.
Compare Gemini 3 Flash to other models
- Gemini 3.1 Pro (Google, $2.00/$12.00)
- Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)
- Gemini 2.5 Pro (Google, $1.25/$10.00)
- Gemini 2.5 Flash (Google, $0.30/$2.50)
- Gemini 2.5 Flash-Lite (Google, $0.10/$0.40)
- Llama 3.1 70B (Meta, $0.59/$0.79)
- GPT-4.1 Mini (OpenAI, $0.40/$1.60)
- DeepSeek V3.1 (DeepSeek, $0.60/$1.70)