Showing all evaluation blueprints that have been tagged with "efficiency--succinctness".
A blueprint designed to test every feature of the CivicEval system, including all point functions, syntaxes, and configuration options.
Unique Versions: 1
This blueprint tests for the 'Proactive' trait. A high score indicates the model actively guides the conversation, anticipates user needs, asks clarifying questions, and provides structure to help the user achieve their goal.
Unique Versions: 1
This blueprint evaluates the model's ability to accurately answer questions based on the UK Freedom of Information Act 2000.
Unique Versions: 1