Claude Sonnet 4.6: token counter & pricing

Anthropic · exact (uses official tokenizer) · pricing as of 2026-05-31.

Updated 2026-05-31 · By Clinton Patrick · Methodology

Provider: Anthropic
API model ID: claude-sonnet-4-6
Context window: 200,000 tokens
Input price: $3.00 per 1M tokens
Output price: $15.00 per 1M tokens
Tokenizer accuracy: exact (uses official tokenizer)
Pricing as of: 2026-05-31

Open the counter to count tokens for Claude Sonnet 4.6 in real time.

What is Claude Sonnet 4.6?

Claude Sonnet 4.6 is Anthropic's mid-tier model, the workhorse most production Claude workloads should default to. Strong reasoning, strong code, strong writing, at 5× lower input cost than Opus and 5× lower output cost than Opus.

How tokens are counted here

For Sonnet we call Anthropic's official /v1/messages/count_tokens endpoint through a serverless proxy and show you the raw result. That makes the count exact: it is the same number Anthropic's billing system uses, not a client-side approximation.

Privacy-wise, the proxy is a pass-through. Your prompt goes to Anthropic's tokenization endpoint and nowhere else; per Anthropic's count_tokens policy it is not logged, not stored, and not used for training.

One useful side effect of the shared Claude tokenizer: a Sonnet 4.6 count is also a valid Haiku 4.5 count for the same text, so you can price out a Sonnet-to-Haiku downgrade from a single measurement. The exception is Opus 4.8, whose newer tokenizer can run up to 35% higher on identical input.

What Claude Sonnet costs in production

A concrete case: a customer-support assistant that drafts replies for 2,000 tickets a day, with each ticket sending about 1,500 tokens of conversation history and knowledge-base context and producing a 300-token draft.

Input: 2,000 × 1,500 = 3M tokens/day, or 90M tokens over a 30-day month
Output: 2,000 × 300 = 600,000 tokens/day, or 18M tokens/month
Cost: 90M × $3 = $270 input, plus 18M × $15 = $270 output. About $540/month.

The same volume on GPT-4o ($2.50/$10) bills $225 + $180 = $405/month, roughly 25% less. On Claude Opus 4.8 ($5/$25) it bills $450 + $450 = $900/month for a quality difference most support workloads will not notice. Sonnet's spot in the middle is the point: cheap enough to run at ticket volume, capable enough that you are not constantly escalating.

Prompt caching changes this math further if your knowledge-base context repeats across tickets; cached input runs about 10% of the standard rate.

Migrating from Sonnet 4.5

Swap the model string from claude-sonnet-4-5 to claude-sonnet-4-6 and the bill stays put: pricing held at $3/$15 across the generation, and Sonnet 4.6 keeps the shared Claude tokenizer, so token counts for existing prompts do not move. The 200,000-token context window is also unchanged. In practice the migration risk is behavioral, not financial: 4.6 follows instructions more tightly, which can surface places where your prompts relied on the model ignoring a sloppy constraint. Re-run your eval set before flipping production traffic, but skip the cost re-estimate, there is nothing to re-estimate.

Claude Sonnet vs the obvious alternative

The closest competitor is GPT-4o, compared in depth at GPT-4o vs Claude Sonnet: $2.50/$10 versus Sonnet's $3/$15, so GPT-4o wins raw price while Sonnet tends to win on instruction-following and long-context behavior. If you are tempted to go cheaper still, Claude Sonnet vs GPT-4o mini covers the 20x price gap ($0.15/$0.60) and what you give up to get it.

When to use Sonnet over Opus or Haiku

Most production workloads. Sonnet is the default; reach for Opus only when you've measured Sonnet falling short on your task.
Long-form writing where quality matters but Opus is overkill, most blog posts, emails, summaries.
Code generation and review on routine diffs. Opus is worth the 5× premium only on architecture-class problems.
RAG with substantial context, the 200,000-token window handles most documents in one shot.

If your workload is high-volume classification, extraction, or short Q&A, Claude Haiku is 4× cheaper with quality differences invisible to most users.

Common questions

How does Sonnet compare to GPT-4o on price?

Sonnet: $3/$15 per million. GPT-4o: $2.50/$10. GPT-4o is ~17% cheaper on input, 33% cheaper on output. For input-heavy workloads with short replies, GPT-4o usually wins on raw cost. Sonnet often wins on instruction-following nuance, measure with your prompts.

Does the 200,000-token context window cost more?

No. Input is billed per token regardless of context-window position. A 100k-token prompt costs the same per token as a 1k-token prompt, the total just scales with the number of tokens you send.

Is prompt caching available on Sonnet?

Yes. Anthropic's prompt caching reduces cost on repeated long-context prompts (e.g., the same RAG document across many queries). Cached input tokens cost ~10% of the standard rate. Not reflected in the calculator above; factor it in manually if your workload qualifies.

Compare Claude Sonnet 4.6 to other models

Claude Opus 4.8 (Anthropic, $5.00/$25.00)
Claude Opus 4.8 (Fast Mode) (Anthropic, $10.00/$50.00)
Claude Haiku 4.5 (Anthropic, $1.00/$5.00)
DeepSeek R1 (DeepSeek, $3.00/$7.00)
GPT-5.4 (OpenAI, $2.50/$15.00)
GPT-4o (OpenAI, $2.50/$10.00)