Universal Declaration of Human Rights

Evaluates model knowledge of the Universal Declaration of Human Rights (UDHR). Prompts cover the Preamble and key articles on fundamental rights (e.g., life, liberty, equality, privacy, expression). Includes a scenario to test reasoning on balancing competing rights.

TAGS: FEATURED, PERIODIC, Human Rights, Instruction Following & Prompt Adherence, Factual Accuracy & Hallucination, Reasoning, Long Form Question Answering, Clarity & Readability

Best Models (Coverage)

1. Grok 3 Mini: 99.2%
2. GPT 4.1: 98.9%
3. Grok 3: 98.9%
4. Gemini 2.5 Flash: 98.6%
5. Command A: 98.1%

🤔 Most Differentiating Prompt

User: Detail the right to education as outlined in Article 26 of the UDHR, including its aims and parental rights.

σ = 0.154
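
The page does not define σ. One reading that fits the published numbers is the sample standard deviation of per-model coverage for a single prompt, on a 0 to 1 scale. The sketch below is a minimal check under that assumption. It also assumes that the seventh prompt row of the coverage table belongs to this prompt, since the row labels are not present in the table; `prompt_spread` is a hypothetical helper name, not part of the evaluation tool.

```python
from statistics import stdev

# Per-model coverage (%) copied from the seventh prompt row of the coverage
# table, in column order (Claude 3.5 Haiku ... Grok 3 Mini).
# Assumption: this row corresponds to the Article 26 prompt.
row = [90, 100, 100, 100, 100, 55, 90, 89, 100, 100, 60, 100, 73, 100, 100]


def prompt_spread(coverages_pct):
    """Sample standard deviation of coverage on a 0-1 scale (assumed definition of sigma)."""
    return stdev([c / 100 for c in coverages_pct])


sigma = prompt_spread(row)
print(round(sigma, 3))
```

Under these assumptions the result rounds to 0.154, which agrees with the reported value. No other prompt row in the table shows a larger spread, which is consistent with this prompt being flagged as the most differentiating one.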

🔀 Least Similar Models

Gemini 2.5 Pro Preview 05 06 vs GPT 4o

84.8% similarity

👯 Most Similar Models

GPT 4.1 Mini vs GPT 4.1 Nano

95.1% similarity

Macro Coverage Overview

Average key point coverage extent for each model across all prompts.

Color Scale - Simplified View (Avg. Coverage): Perfect, Excellent, Good, Fair, Poor, Bad, Not Met

Prompt avg.	Claude 3.5 Haiku	Claude Sonnet 4	Command A	Deepseek Chat V3	Gemini 2.5 Flash	Gemini 2.5 Pro Preview 05 06	Mistral Large 2411	Mistral Medium 3	GPT 4.1	GPT 4.1 Mini	GPT 4.1 Nano	GPT 4o	GPT 4o Mini	Grok 3	Grok 3 Mini
Score	12th 94.2%	8th 96.8%	5th 98.1%	11th 94.8%	4th 98.6%	14th 87.5%	6th 97.0%	9th 95.6%	3rd 98.9%	6th 97.0%	15th 86.2%	10th 95.3%	13th 88.3%	2nd 98.9%	1st 99.2%
98.3%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	75%	100%	100%	100%	100%
98.7%	100%	98%	100%	100%	100%	92%	100%	98%	100%	100%	100%	100%	92%	100%	100%
99.1%	100%	100%	100%	100%	100%	98%	100%	98%	100%	100%	92%	100%	98%	100%	100%
99.5%	95%	100%	100%	100%	98%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
93.0%	83%	92%	100%	77%	100%	97%	100%	100%	100%	98%	83%	92%	75%	98%	100%
96.5%	98%	95%	100%	100%	100%	100%	100%	100%	100%	100%	83%	95%	78%	98%	100%
90.5%	90%	100%	100%	100%	100%	55%	90%	89%	100%	100%	60%	100%	73%	100%	100%
100.0%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
100.0%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
94.7%	100%	100%	100%	91%	100%	88%	88%	88%	100%	100%	100%	81%	84%	100%	100%
79.4%	76%	82%	85%	79%	90%	56%	92%	82%	87%	66%	58%	81%	71%	94%	92%
91.3%	88%	94%	92%	91%	95%	64%	94%	92%	100%	100%	83%	94%	88%	97%	98%
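
The Score row appears to be the unweighted mean of each model's per-prompt coverage. The following sketch checks that reading for three columns of the table. The aggregation rule is an assumption inferred from the numbers rather than something the page states, and `macro_score` is a hypothetical name. Because the table cells are already rounded, recomputed values can differ from the published ones in the last digit.

```python
from statistics import mean

# Per-prompt coverage (%) for three models, copied column-wise from the
# twelve prompt rows of the table above.
columns = {
    "Claude 3.5 Haiku": [100, 100, 100, 95, 83, 98, 90, 100, 100, 100, 76, 88],
    "GPT 4.1 Nano": [75, 100, 92, 100, 83, 83, 60, 100, 100, 100, 58, 83],
    "Grok 3 Mini": [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 92, 98],
}


def macro_score(per_prompt_pct):
    """Unweighted mean of per-prompt coverage (assumed aggregation rule)."""
    return mean(per_prompt_pct)


# Round to one decimal place, matching the precision shown in the Score row.
scores = {model: round(macro_score(vals), 1) for model, vals in columns.items()}
for model, score in scores.items():
    print(f"{model}: {score}%")
```

For these three columns the recomputed means are 94.2, 86.2, and 99.2, matching the Score row. The same averaging applied across a row rather than down a column reproduces the leading per-prompt figure, for example 98.3% for the first prompt row.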

Model Similarity Dendrogram

Hierarchical clustering of models based on response similarity. Models grouped closer are more similar.