weval
A Collective Intelligence Project
Home
Tags
Evaluations Tagged: "plausibility"
Summary of results
Hallucination Probe: Plausible Non-Existent Concepts
Tags: Hallucination, Factuality, Reasoning, Plausibility
Avg. Hybrid Score: 77.9%
Latest Run Heatmap
Top Performing Model: anthropic/claude-sonnet-4 (sys:1), Avg. 88.7%
Latest:
Unique Versions: 4