Claude Sonnet 4.6: token counter & pricing
Anthropic · exact (uses official tokenizer) · pricing as of 2026-05-31.
- Provider
- Anthropic
- API model ID
claude-sonnet-4-6- Context window
- 200,000 tokens
- Input price
- $3.00 per 1M tokens
- Output price
- $15.00 per 1M tokens
- Tokenizer accuracy
- exact (uses official tokenizer)
- Pricing as of
- 2026-05-31
Open the counter to count tokens for Claude Sonnet 4.6 in real time.
What is Claude Sonnet 4.6?
Claude Sonnet 4.6 is Anthropic's mid-tier model, the workhorse most production Claude workloads should default to. Strong reasoning, strong code, strong writing, at 5× lower input cost than Opus and 5× lower output cost than Opus.
How tokens are counted here
For Sonnet we call Anthropic's official /v1/messages/count_tokens endpoint through a serverless proxy and show you the raw result. That makes the count exact: it is the same number Anthropic's billing system uses, not a client-side approximation.
Privacy-wise, the proxy is a pass-through. Your prompt goes to Anthropic's tokenization endpoint and nowhere else; per Anthropic's count_tokens policy it is not logged, not stored, and not used for training.
One useful side effect of the shared Claude tokenizer: a Sonnet 4.6 count is also a valid Haiku 4.5 count for the same text, so you can price out a Sonnet-to-Haiku downgrade from a single measurement. The exception is Opus 4.8, whose newer tokenizer can run up to 35% higher on identical input.
What Claude Sonnet costs in production
A concrete case: a customer-support assistant that drafts replies for 2,000 tickets a day, with each ticket sending about 1,500 tokens of conversation history and knowledge-base context and producing a 300-token draft.
- Input: 2,000 × 1,500 = 3M tokens/day, or 90M tokens over a 30-day month
- Output: 2,000 × 300 = 600,000 tokens/day, or 18M tokens/month
- Cost: 90M × $3 = $270 input, plus 18M × $15 = $270 output. About $540/month.
The same volume on GPT-4o ($2.50/$10) bills $225 + $180 = $405/month, roughly 25% less. On Claude Opus 4.8 ($5/$25) it bills $450 + $450 = $900/month for a quality difference most support workloads will not notice. Sonnet's spot in the middle is the point: cheap enough to run at ticket volume, capable enough that you are not constantly escalating.
Prompt caching changes this math further if your knowledge-base context repeats across tickets; cached input runs about 10% of the standard rate.
Migrating from Sonnet 4.5
Swap the model string from claude-sonnet-4-5 to claude-sonnet-4-6 and the bill stays put: pricing held at $3/$15 across the generation, and Sonnet 4.6 keeps the shared Claude tokenizer, so token counts for existing prompts do not move. The 200,000-token context window is also unchanged. In practice the migration risk is behavioral, not financial: 4.6 follows instructions more tightly, which can surface places where your prompts relied on the model ignoring a sloppy constraint. Re-run your eval set before flipping production traffic, but skip the cost re-estimate, there is nothing to re-estimate.
Claude Sonnet vs the obvious alternative
The closest competitor is GPT-4o, compared in depth at GPT-4o vs Claude Sonnet: $2.50/$10 versus Sonnet's $3/$15, so GPT-4o wins raw price while Sonnet tends to win on instruction-following and long-context behavior. If you are tempted to go cheaper still, Claude Sonnet vs GPT-4o mini covers the 20x price gap ($0.15/$0.60) and what you give up to get it.
When to use Sonnet over Opus or Haiku
- Most production workloads. Sonnet is the default; reach for Opus only when you've measured Sonnet falling short on your task.
- Long-form writing where quality matters but Opus is overkill, most blog posts, emails, summaries.
- Code generation and review on routine diffs. Opus is worth the 5× premium only on architecture-class problems.
- RAG with substantial context, the 200,000-token window handles most documents in one shot.
If your workload is high-volume classification, extraction, or short Q&A, Claude Haiku is 4× cheaper with quality differences invisible to most users.
Common questions
How does Sonnet compare to GPT-4o on price?
Sonnet: $3/$15 per million. GPT-4o: $2.50/$10. GPT-4o is ~17% cheaper on input, 33% cheaper on output. For input-heavy workloads with short replies, GPT-4o usually wins on raw cost. Sonnet often wins on instruction-following nuance, measure with your prompts.
Does the 200,000-token context window cost more?
No. Input is billed per token regardless of context-window position. A 100k-token prompt costs the same per token as a 1k-token prompt, the total just scales with the number of tokens you send.
Is prompt caching available on Sonnet?
Yes. Anthropic's prompt caching reduces cost on repeated long-context prompts (e.g., the same RAG document across many queries). Cached input tokens cost ~10% of the standard rate. Not reflected in the calculator above; factor it in manually if your workload qualifies.
Compare Claude Sonnet 4.6 to other models
- Claude Opus 4.8 (Anthropic, $5.00/$25.00)
- Claude Opus 4.8 (Fast Mode) (Anthropic, $10.00/$50.00)
- Claude Haiku 4.5 (Anthropic, $1.00/$5.00)
- DeepSeek R1 (DeepSeek, $3.00/$7.00)
- GPT-5.4 (OpenAI, $2.50/$15.00)
- GPT-4o (OpenAI, $2.50/$10.00)