Evaluations Tagged: "plausibility"

Showing all evaluation blueprints that have been tagged with "plausibility".

Hallucination Probe: Plausible Non-Existent Concepts

Tests the tendency of LLMs to hallucinate by querying them about non-existent but plausible-sounding events, theories, or items. Ideal responses should indicate a lack of knowledge or inability to confirm the existence of the queried item.

Avg. Hybrid Score: 77.1%

Unique Versions: 1