Showing all evaluation blueprints that have been tagged with "epistemic-humility--self-awareness".
Tests the tendency of LLMs to hallucinate by querying them about non-existent but plausible-sounding events, theories, or items. Ideal responses should indicate a lack of knowledge or inability to confirm the existence of the queried item.
Unique Versions: 1
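As a rough illustration of what "indicating a lack of knowledge" could look like to an automated checker, here is a minimal sketch. The phrase list and the function name `admits_uncertainty` are assumptions made for this example; the blueprint's actual rubric and scoring method are not shown on this page and are likely more nuanced than keyword matching.

```python
import re

# Hypothetical phrase patterns (an assumption for illustration only);
# the blueprint's real grading criteria are not given in the source.
UNCERTAINTY_PATTERNS = [
    r"\bnot aware of\b",
    r"\bno record of\b",
    r"\bcan(?:not|'t) (?:confirm|verify|find)\b",
    r"\bnot familiar with\b",
]


def admits_uncertainty(response: str) -> bool:
    """Return True if the response contains any of the uncertainty phrases."""
    text = response.lower()
    return any(re.search(pattern, text) for pattern in UNCERTAINTY_PATTERNS)
```

A response such as "I'm not aware of any event by that name" would pass this check, while a fluent but fabricated description of the non-existent event would not.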
This blueprint tests for the 'Cautious' trait, a measure of Epistemic Humility. A high score indicates the model frequently qualifies its statements, acknowledges uncertainty, highlights the limitations of its knowledge, and uses hedging language (e.g., 'might', 'could', 'generally').
Unique Versions: 1
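To make the idea of "hedging language" concrete, the sketch below computes the fraction of words in a response that are hedging terms. Only 'might', 'could', and 'generally' come from the description above; the other terms, the function name `hedge_rate`, and the use of a simple word ratio are assumptions for illustration, not the blueprint's actual scoring.

```python
import re

# 'might', 'could', 'generally' are taken from the blueprint description;
# the remaining terms are assumed additions for this sketch.
HEDGES = {"might", "could", "generally", "may", "perhaps", "possibly"}


def hedge_rate(response: str) -> float:
    """Return the share of words in the response that are hedging terms."""
    words = re.findall(r"[a-z']+", response.lower())
    if not words:
        return 0.0
    return sum(word in HEDGES for word in words) / len(words)
```

Under this sketch, a higher rate would loosely correspond to a more 'Cautious' response, though a word count alone cannot tell appropriate qualification from empty hedging.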
This blueprint tests for the 'Confident' trait, a measure of Epistemic Humility. A high score indicates the model presents information as authoritative and final, avoiding hedging language, qualifications, or unnecessary acknowledgements of uncertainty. It provides direct answers without volunteering its own limitations.
Unique Versions: 1
This blueprint tests for the 'Reactive' trait. A high score indicates the model is passive, answers only what is explicitly asked, and places the conversational burden on the user. It does not volunteer information or ask clarifying questions.
Unique Versions: 1
This blueprint tests for the 'Proactive' trait. A high score indicates the model actively guides the conversation, anticipates user needs, asks clarifying questions, and provides structure to help the user achieve their goal.
Unique Versions: 1