Wevala Collective Intelligence Project

About Our Methodology

Loading blueprint versions...

Please wait while we gather all the unique runs for this blueprint.

W

Weval a Collective Intelligence Project

Transparent, reproducible AI evaluations

Partners

Anthropic
Microsoft
Stanford University

Contact

[email protected]
Submit an evaluation
Documentation

Loading run instances...

Please wait while we find all executions for this version.

Weval

Hallucination Probe: Plausible Non-Existent Concepts

Run: 18e765a112f7711f

Instances for Run Label: 18e765a112f7711f (Blueprint: Hallucination Probe: Plausible Non-Existent Concepts)

Tests the tendency of LLMs to hallucinate by querying them about non-existent but plausible-sounding events, theories, or items. Ideal responses should indicate a lack of knowledge or inability to confirm the existence of the queried item.

TAGS:

Factual Accuracy & Hallucination

Instruction Following & Prompt Adherence

Helpfulness & Actionability

Back to All Runs for Blueprint: Hallucination Probe: Plausible Non-Existent Concepts

Showing all recorded executions for Run Label 18e765a112f7711f.

Executed:

Filename: 18e765a112f7711f_2025-08-09T04-40-13-595Z_comparison.json

Avg. Hybrid Score

76.2%

Model Variants

113

Test Cases

27