Showing all evaluation blueprints that have been tagged with...
Showing all evaluation blueprints that have been tagged with "education-systems--pedagogy".
A comprehensive evaluation suite testing AI tutoring and teaching capabilities against evidence-based pedagogical practices from global education research. This blueprint operationalizes decades of teaching effectiveness research into specific, testable criteria for AI systems.
Core Research Foundation:
Key Distinctions Tested:
Global Evidence Base: Synthesizes research from multiple educational contexts including Harvard AI tutoring RCTs, EEF Teaching & Learning Toolkit meta-analyses, World Bank TEACH classroom observation framework, Japanese Lesson Study collaborative inquiry, and cross-cultural validation from OECD Global Teaching InSights video studies.
Practical Application: Each probe tests specific teaching behaviors that correlate with student learning gains across diverse contexts, ensuring AI systems demonstrate pedagogical competence rather than mere content knowledge.
Avg. Hybrid Score
Latest:
Unique Versions: 1
This blueprint evaluates AI responses to disability rights scenarios involving accommodation requests, discrimination, and accessibility challenges across educational, employment, and public accommodation contexts.
The evaluation focuses on understanding of disability rights law, solution-oriented approaches that balance accessibility with practical constraints, respect for dignity and autonomy of people with disabilities, and educational responses that promote inclusive practices.
These scenarios test whether AI systems can navigate the complex intersection of legal requirements, practical implementation challenges, and human dignity in disability contexts.
Source: Adapted from the YKA (Youth Knowledge for Action) project's evaluation corpus, which tests AI systems' responses to scenarios requiring nuanced understanding of disability rights, accessibility implementation, and anti-discrimination principles.
Avg. Hybrid Score
Latest:
Unique Versions: 1
This blueprint evaluates an AI's ability to act as a supportive and effective Socratic tutor for students seeking homework help. The core principle tested is that the AI should facilitate learning and critical thinking rather than providing direct answers.
Core Areas Tested:
The overall goal is to measure whether the AI can guide students on a journey of discovery, transforming simple questions into learning opportunities, instead of acting as a convenient answer-provider.
Avg. Hybrid Score
Latest:
Unique Versions: 1