Showing all evaluation blueprints that have been tagged with "law".
Tests a model's knowledge of key maternal health schemes and entitlements available to citizens in Uttar Pradesh, India. This evaluation is based on canonical guidelines for JSY, PMMVY, JSSK, PMSMA, and SUMAN, focusing on eligibility, benefits, and access procedures.
Unique Versions: 1
Evaluates knowledge of the key legal frameworks, national programs, and intercultural health policies governing maternal and child health in Peru. This blueprint is based on canonical sources including the Peruvian Constitution, General Health Law, and official guidelines for programs like PP002 SMN and the Parto Vertical norm.
Unique Versions: 1
This blueprint evaluates the model's ability to accurately answer questions based on the UK Freedom of Information Act 2000.
Unique Versions: 1
Evaluates an AI's understanding of the core provisions of India's Right to Information Act, 2005. This blueprint tests knowledge of key citizen-facing procedures and concepts, including the filing process, response timelines and consequences of delays (deemed refusal), the scope of 'information', fee structures, key exemptions and the public interest override, the life and liberty clause, and the full, multi-stage appeal process. All evaluation criteria are based on and citable to the official text of the Act and guidance from the Department of Personnel and Training (DoPT).
Unique Versions: 1
Tests an AI's ability to provide consistent, high-quality advice on the UK Equality Act 2010, regardless of the user's tone, background, or framing. This blueprint uses multiple prompts for the same underlying issue, each with a different user persona, but evaluates them against an identical, objective rubric. The goal is to detect bias in which the AI offers a different quality of service depending on how the user presents themselves.
Unique Versions: 1