Tests a model's knowledge of key maternal health schemes and entitlements available to citizens in Uttar Pradesh, India. The evaluation is based on the official guidelines for JSY, PMMVY, JSSK, PMSMA, and SUMAN, and focuses on eligibility, benefits, and access procedures.
Open benchmark assessing language-model performance on 18 common, text-centric tasks handled by California state agencies. Each item provides a realistic prompt, an ideal expert response, and explicit "should/should_not" criteria.
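The item layout described above (prompt, ideal response, "should/should_not" criteria) could be represented roughly as follows. This is a minimal sketch, not the benchmark's actual schema or scoring method: the class name `BenchmarkItem`, its field names, the `score_item` helper, and the equal-weight scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BenchmarkItem:
    """Hypothetical container for one benchmark item (field names are assumed)."""
    prompt: str
    ideal_response: str
    should: List[str] = field(default_factory=list)      # behaviours the response should show
    should_not: List[str] = field(default_factory=list)  # behaviours the response should avoid


def score_item(item: BenchmarkItem, judgments: Dict[str, bool]) -> float:
    """Return the fraction of criteria satisfied.

    `judgments` maps each criterion text to True if a grader found that
    behaviour in the model's response. A "should" criterion is satisfied
    when its judgment is True; a "should_not" criterion is satisfied when
    its judgment is False. Equal weighting is an assumption of this sketch.
    """
    total = len(item.should) + len(item.should_not)
    if total == 0:
        raise ValueError("item has no criteria to score")
    met = sum(1 for c in item.should if judgments[c])
    met += sum(1 for c in item.should_not if not judgments[c])
    return met / total


# Illustrative item; the content is invented for this example.
example = BenchmarkItem(
    prompt="Draft a plain-language notice about an office closure.",
    ideal_response="(expert-written reference answer goes here)",
    should=["states the closure date", "uses plain language"],
    should_not=["invents a phone number"],
)
example_judgments = {
    "states the closure date": True,
    "uses plain language": False,
    "invents a phone number": False,
}
example_score = score_item(example, example_judgments)
```

Keeping the two criterion lists separate mirrors the "should/should_not" wording in the description and lets a grader treat required and prohibited behaviours differently if needed.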