Weval a Collective Intelligence Project

Transparent, reproducible AI evaluations

Partners

Anthropic
Microsoft
Stanford University

Evaluations Tagged: "uk"

Showing all evaluation blueprints that have been tagged with "uk".

UK Freedom of Information Act 2000

This blueprint evaluates the model's ability to accurately answer questions based on the UK Freedom of Information Act 2000.

Freedom Of Information

Legal Reasoning

Instruction Following & Prompt Adherence

Factual Accuracy & Hallucination

Long Form Question Answering

Efficiency & Succinctness

Helpfulness & Actionability

Data Privacy & Bodily Autonomy

AI Safety & Robustness

Avg. Hybrid Score: 80.3%

Unique Versions: 1


UK Equality Act 2010: Consistency & Invariance Testing

Tests an AI's ability to provide consistent, high-quality advice on the UK Equality Act 2010, regardless of the user's tone, background, or framing. This blueprint uses multiple prompts for the same underlying issue, each written in a different user persona, and evaluates every response against an identical, objective rubric. The goal is to detect bias: cases where the AI offers a different quality of service depending on how the user presents themselves.
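The idea of scoring persona variants against one shared rubric can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rubric points, the persona names, the sample responses, and the keyword-matching scorer are hypothetical and are not Weval's actual rubric or its Hybrid Score.

```python
# Hypothetical rubric: points any good answer should cover, whoever is asking.
RUBRIC = [
    "protected characteristic",
    "reasonable adjustments",
    "employment tribunal",
]


def rubric_score(response: str, rubric: list[str] = RUBRIC) -> float:
    """Fraction of rubric points mentioned (case-insensitive substring match).

    This is a crude stand-in for a real grader, used only to show the idea.
    """
    text = response.lower()
    return sum(point in text for point in rubric) / len(rubric)


def consistency_gap(responses_by_persona: dict[str, str]) -> float:
    """Difference between the best and worst persona score on the same rubric."""
    scores = [rubric_score(r) for r in responses_by_persona.values()]
    return max(scores) - min(scores)


# Made-up responses to the same underlying issue, asked by two personas.
responses = {
    "formal_professional": (
        "Disability is a protected characteristic. Your employer must make "
        "reasonable adjustments, and you can bring a claim to an employment tribunal."
    ),
    "informal_upset": (
        "That sounds hard. Your employer should make reasonable adjustments for you."
    ),
}

for persona, text in responses.items():
    print(persona, round(rubric_score(text), 2))
print("gap:", round(consistency_gap(responses), 2))
```

A gap near zero would suggest the answers are of similar quality across personas. A large gap, as in this made-up data, would flag the kind of presentation-dependent difference the blueprint is designed to surface.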

Consistency Testing

Behavioral Testing

Instruction Following & Prompt Adherence

Factual Accuracy & Hallucination

Legal Reasoning

Equality & Anti Discrimination

AI Safety & Robustness

Cultural Competency

Helpfulness & Actionability

Avg. Hybrid Score: 77.4%

Unique Versions: 1
