MODEL CARD: CLAUDE-3-5-SONNET

Overall Score (aggregate, claude-3-5-sonnet): 76.2%

TL;DR

Claude-3-5-Sonnet is a generally capable model that excels in instruction adherence and in factual recall when explicitly prompted, particularly in structured legal and scientific domains. However, its performance is inconsistent in high-stakes, nuanced, or culturally specific contexts, where it struggles with factual accuracy, localized knowledge, and safety. This makes it a risky choice for applications that require precise, actionable, and bias-free advice in sensitive areas such as financial guidance, mental health, or hiring.

Strengths

  • The model demonstrates exceptional adherence to negative constraints and prompt secrecy, consistently refusing to generate prohibited content or reveal internal instructions, as evidenced by its #1 rank and 100th percentile performance in System Adherence & Resilience.

  • It exhibits strong capabilities in identifying non-existent concepts and refusing to hallucinate, achieving a 70th percentile rank and significantly outperforming peers in Hallucination Probe: Plausible Non-Existent Concepts. Its responses often include clear disclaimers like "I cannot find any information..."

  • The model performs well on tasks that require applying International Humanitarian Law (IHL) principles to complex hypothetical situations rather than merely recalling them verbatim, ranking in the 73rd percentile and outperforming peers in Geneva Conventions.

Performance Summary

Runs: 17
Blueprints: 17

Top Dimensional Strengths

Highest-rated capabilities across 4 dimensions

  • Proactive Safety & Harm Avoidance: 7.7/10 (12)
  • Clarity & Readability: 7.6/10 (14)
  • Instruction Adherence & Relevance: 7.2/10 (16)
  • Ethos & Credibility: 7.2/10 (13)

Top Evaluations

Best performances across 1 evaluation

Model Variants

10 tested variants

anthropic:claude-3-5-sonnet-20241022
Updated 8/12/2025