Claude Sonnet 4.6: Benchmark performance, how to try it

Anthropic has just released its latest large language model (LLM), Claude Sonnet 4.6. Tuesday’s release comes on the heels of the Feb. 5 launch of the company’s premium AI model, Claude Opus 4.6.

According to Anthropic, “Claude Sonnet 4.6 is our most capable Sonnet model to date.” The company says Sonnet 4.6 has a 1 million token context window in beta. Importantly, Anthropic reports that Sonnet 4.6 performed well on internal safety tests, showing a low tendency toward hallucinations and chatter.

“Sonnet 4.6 brings better coding skills to the majority of our users,” Anthropic said, referring to Claude’s popularity among developers who use AI to code.

If you’re looking to use Anthropic’s latest AI model, the company has made it easy. Here’s how to access Claude Sonnet 4.6.

How to use Claude Sonnet 4.6

For both free and Pro users, Claude Sonnet 4.6 is now available as the default model on claude.ai and Claude Cowork. Anthropic has also launched the model through its API and on all major cloud platforms.

Free users will have usage limits that depend on current demand. Limits reset every five hours. For those who need higher limits, Claude Sonnet 4.6 is priced the same as the previous model. The Claude Pro plan costs $20 per month, or $17 per month if paid annually. Through the API, Claude Sonnet 4.6 starts at $3 per million input tokens and $15 per million output tokens.
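To make the API pricing concrete, here is a minimal Python sketch that estimates what a single request might cost. The function name and the example token counts are hypothetical; only the $3 and $15 per-million-token rates come from the pricing above. Because those are starting rates, treat the result as a rough lower bound rather than an exact bill.

```python
# Starting rates quoted above for Claude Sonnet 4.6, in US dollars per million tokens.
INPUT_RATE_PER_MILLION = 3.00
OUTPUT_RATE_PER_MILLION = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request (hypothetical helper, not an official API)."""
    input_cost = input_tokens / 1_000_000 * INPUT_RATE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE_PER_MILLION
    return input_cost + output_cost


# Illustrative request: a 200,000-token prompt with a 4,000-token reply.
print(f"${estimate_cost(200_000, 4_000):.2f}")
```

Output tokens cost five times as much as input tokens, so long replies add to the bill much faster than long prompts.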

Claude Sonnet 4.6 benchmark performance

According to Anthropic’s benchmark tests, Claude Sonnet 4.6 is the company’s most powerful model for agentic financial analysis and office tasks, beating out competitors like Google’s Gemini 3 Pro and OpenAI’s GPT-5.2.

On those tasks, Claude Sonnet 4.6 outperforms even Claude Opus 4.6, Anthropic’s most powerful AI model.

In its release announcement, Anthropic said that many developers with early access to Claude Sonnet 4.6 preferred the model not only to its predecessor, Claude Sonnet 4.5, but also to Claude Opus 4.5. According to the Sonnet 4.6 system card, the new model improves on key benchmarks such as Humanity’s Last Exam, although Claude Opus 4.6 scored higher.

Claude Sonnet 4.6 benchmark scores

  • GPQA Diamond: 89.9 percent

  • ARC-AGI-2: 58.3 percent

  • MMMLU: 89.3 percent

  • SWE-Bench Verified: 79.6 percent

  • HLE (Humanity’s Last Exam): 49.0 percent with tools, 33.2 percent without tools

AI-powered insurance company Pace told VentureBeat that Sonnet 4.6 scored the best of any Claude model on its complex insurance computer use benchmark.

These results are notable because Claude Opus models are generally more intelligent and better able to handle complex reasoning.

Claude Sonnet 4.6 is not only more powerful than some Opus models, but also more affordable. As mentioned earlier, Claude Sonnet 4.6 is priced at $3 per million input tokens and $15 per million output tokens, while Opus 4.6 costs $5 and $25, respectively.
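For a sense of what that price gap means in practice, the following Python sketch compares the two models on the same workload. The monthly token volumes and the helper name are assumptions made up for illustration; only the per-million-token prices come from the figures above.

```python
# (input, output) prices in US dollars per million tokens, as quoted above.
PRICES = {
    "Sonnet 4.6": (3.00, 15.00),
    "Opus 4.6": (5.00, 25.00),
}


def workload_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Cost in dollars for a workload measured in millions of tokens (hypothetical helper)."""
    input_rate, output_rate = PRICES[model]
    return input_millions * input_rate + output_millions * output_rate


# Assumed monthly workload: 10 million input tokens and 2 million output tokens.
sonnet = workload_cost("Sonnet 4.6", 10, 2)
opus = workload_cost("Opus 4.6", 10, 2)
print(f"Sonnet 4.6: ${sonnet:.2f}, Opus 4.6: ${opus:.2f}")
print(f"Savings with Sonnet: {1 - sonnet / opus:.0%}")
```

Sonnet's input and output prices are both three-fifths of Opus's, so at these list prices the saving is 40 percent for any mix of input and output tokens.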
