Gemini 3 Flash: token counter & pricing

Google · exact (uses official tokenizer) · pricing as of 2026-05-31.

Updated 2026-05-31 · By Clinton Patrick · Methodology

Provider: Google
API model ID: gemini-3-flash-preview
Context window: 1,000,000 tokens
Input price: $0.50 per 1M tokens
Output price: $3.00 per 1M tokens
Tokenizer accuracy: exact (uses official tokenizer)
Pricing as of: 2026-05-31

Open the counter to count tokens for Gemini 3 Flash in real time.

What is Gemini 3 Flash?

Gemini 3 Flash is Google's current Flash-tier model, released April 22, 2026. Combines Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost profile. Per Google's own benchmarks, it outperforms Gemini 2.5 Pro at a fraction of the cost while running 3× faster.

Currently in Preview. Pricing and behavior may change before GA.

How tokens are counted here

Token counts for Gemini 3 Flash come from Google's models.countTokens endpoint, proxied through our serverless layer, and are exact. That is worth knowing for a Preview model specifically: countTokens answers for the live model build, so if Google adjusts tokenization between Preview and GA, this counter tracks the change automatically instead of serving numbers from a frozen local copy.

Pricing notes

$0.50 input / $3.00 output per 1M tokens. Cached input $0.05/M.

That's substantially more expensive than Gemini 2.5 Flash-Lite ($0.10/$0.40) and Gemini 2.5 Flash ($0.30/$2.50), but Google positions Gemini 3 Flash as a Pro-tier replacement at Flash-tier prices. The math: if Gemini 3 Flash genuinely matches Gemini 2.5 Pro quality, paying $0.50 input vs $1.25 input on Pro is a real win for workloads that needed that capability tier.

When to use Gemini 3 Flash

Workloads currently on Gemini 2.5 Pro, substantial cost reduction with comparable quality.
Multimodal Pro-tier reasoning at scale, image, video, audio understanding with strong reasoning.
High-volume agentic workflows that need stronger reasoning than Flash-Lite can provide.

When not to use it:

Cheap classification / extraction, Gemini 2.5 Flash-Lite at $0.10/$0.40 is 5× cheaper and still exact-tokenizer.
Production-stable workloads. Preview status means pricing or behavior could shift.

What Gemini 3 Flash costs in production

Price a consumer chat product handling 30,000 messages a day, each carrying about 2,000 input tokens (system prompt plus conversation history) and producing a 350-token reply. That is 900,000 messages a month: 1.8B input tokens × $0.50 = $900, and 315M output tokens × $3.00 = $945, roughly $1,845 a month.

GPT-5 Mini at $0.25/$2.00 would run about $1,080 ($450 + $630) on the same traffic; Claude Haiku 4.5 at $1/$5 about $3,375 ($1,800 + $1,575). Chat history is also the textbook caching case: each turn resends the conversation so far, and Gemini 3 Flash's $0.05/M cached-input rate means that repeated prefix can cut the input line by close to 90%. If your product is chat-shaped, model the cached rate before comparing vendors on list price alone.

Migrating from Gemini 2.5 Flash

Switching from Gemini 2.5 Flash means changing the model string to apiId gemini-3-flash-preview and accepting two trade-offs. Price goes up: $0.50/$3.00 against 2.5 Flash's $0.30/$2.50, a 67% input increase you should justify with eval results, not benchmarks in a press release. Stability goes down: Preview models can change behavior or pricing before GA, so pin regression tests around anything load-bearing. Context stays at 1M and the SDK surface is identical. Counts on this page remain exact through the switch, since we call countTokens against whichever model you select.

Gemini 3 Flash vs the obvious alternative

Gemini 2.5 Flash is the alternative most teams should price first: $0.30/$2.50 against Gemini 3 Flash's $0.50/$3.00. If 2.5 Flash already clears your quality bar, the 67% input premium buys you nothing. If you were reaching for Gemini 2.5 Pro ($1.25/$10) for reasoning quality, Gemini 3 Flash is the cheaper route to that tier. For context on where the 2.5 line draws its quality boundaries, see Gemini 2.5 Flash vs Pro.

Common questions

How does Gemini 3 Flash compare to GPT-5.4 Mini?

Gemini 3 Flash: $0.50/$3.00. GPT-5.4 Mini: $0.75/$4.50. Flash is ~33% cheaper on input and 33% cheaper on output. Pick by your evals. Gemini's multimodal handling is broader; OpenAI's tool-use reliability is more mature.

What's the context window?

1,000,000 tokens (1M), same as Gemini 2.5 Flash. Larger than the GPT-5 family's 400K and Anthropic's 200K. Smaller than Gemini 2.5 Pro's 2M.

Will Gemini 3 Flash leave Preview?

Google hasn't published a GA date. Track the Gemini API release notes, once it leaves Preview, the pricing in this calculator will reflect any changes.

Compare Gemini 3 Flash to other models

Gemini 3.1 Pro (Google, $2.00/$12.00)
Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)
Gemini 2.5 Pro (Google, $1.25/$10.00)
Gemini 2.5 Flash (Google, $0.30/$2.50)
Gemini 2.5 Flash-Lite (Google, $0.10/$0.40)
Llama 3.1 70B (Meta, $0.59/$0.79)
GPT-4.1 Mini (OpenAI, $0.40/$1.60)
DeepSeek V3.1 (DeepSeek, $0.60/$1.70)