MODEL CARD: SONNET

Overall Score (aggregate): 72.6%

Strengths

  • Excels in legal and regulatory compliance tasks. It earned "OUTPERFORMED peers" status in "African Charter" (0.916 score, #4 rank) and "EU AI Act" (0.759 score, #3 rank), where it consistently provided clear, concise, and well-formatted responses, often including specific details such as ambulance numbers or fixed dates for practical application.

  • Strong capability in factual recall and structured information presentation, particularly in legal and civic domains. For example, in "Indian Constitution" it achieved a 0.898 score and #6 rank, with clear, well-formatted responses that often used bullet points and bolding to enhance readability.

  • Demonstrates robust safety and ethical guardrails, particularly in refusing to provide harmful advice and in resisting sycophantic behavior. In "Confidence in High-Stakes Domains," it scored 0.845 and "MATCHED peer performance" by adhering to safety principles and refusing medical or financial advice. In "Sycophancy Trait," it scored 0.855 and "OUTPERFORMED peers" by resisting user-induced bias.

Areas for Improvement

  • Struggles to provide highly specific, localized, or real-time information without explicit contextual prompting. Its "SIGNIFICANTLY UNDERPERFORMED vs peers" status in "Sri Lanka Contextual Prompts" (0.359 score, #65 rank) exemplifies this: it often defaulted to generic advice.

  • Shows weaknesses in tasks requiring the synthesis of complex conditional logic or procedural steps, particularly in nuanced domains like maternal health entitlements. In "Maternal Health Entitlements in Uttar Pradesh, India," it scored 0.638 and "OUTPERFORMED peers," but the executive summary notes that it "failed to provide the nuanced eligibility criteria" for PMMVY.

  • While generally good at refusing to hallucinate, it can struggle with subtle inaccuracies or misattributions, occasionally correcting minor errors but still responding to misleading premises, as noted in the "Hallucination Probe" executive summary.

Behavioral Patterns

  • The model consistently demonstrates strong performance in tasks requiring structured, factual recall and adherence to specific instructions, particularly when dealing with legal documents or well-defined procedures, as evidenced by its "OUTPERFORMED peers" status in "African Charter," "EU AI Act," "Indian Constitution," and "India's Right to Information (RTI) Act" blueprints.

  • The model exhibits a notable ability to adapt its behavior based on system prompts, especially in persona-driven tasks. For example, in "Student Homework Help Heuristics," the presence of the Socratic system prompt dramatically shifted its behavior from answer-giving to guided instruction, and the model achieved a 0.713 score and "OUTPERFORMED peers" status.

Key Risks

  • Deploying the model in regions or domains requiring highly specific, implicit local knowledge (e.g., local administrative procedures, specific health entitlements in developing countries) without robust, explicit contextual system prompts could lead to generic, unhelpful, or even misleading advice. This could cause user frustration or incorrect actions, as seen in its performance on "Sri Lanka Contextual Prompts" and "Maternal Health Entitlements in Uttar Pradesh, India."

  • Reliance on the model for tasks requiring nuanced interpretation of complex, multi-conditional legal or policy documents, especially those with specific numerical thresholds or dates, may result in inaccuracies or incomplete information, as indicated by its performance in the "EU AI Act" and "IPCC AR6 Synthesis Report" blueprints.

Performance Summary

Evaluations: 21
Blueprints: 21

Top Performance Areas

consistency: 95.3%
meta evaluation: 95.3%
prompt engineering: 95.3%

Model Variants

13 tested variants

anthropic:claude-3-5-sonnet-20241022
anthropic:claude-3-7-sonnet-20250219
+2 more
Updated 8/4/2025