Blueprints tagged "education-systems--pedagogy"

Evidence-Based AI Tutoring and Teaching Excellence

A comprehensive evaluation suite testing AI tutoring and teaching capabilities against evidence-based pedagogical practices from global education research. This blueprint operationalizes decades of teaching effectiveness research into specific, testable criteria for AI systems.

Core Research Foundation:

Explicit Instruction: Based on Rosenshine's (2012) Principles of Instruction, requiring step-by-step teaching, worked examples, and guided practice before independence
Formative Assessment: Implements Wiliam & Thompson's (2008) framework for checking understanding through targeted questioning and immediate feedback loops
Cognitive Load Management: Applies Sweller's (2011) Cognitive Load Theory to prevent information overload through chunking and scaffolding
Socratic Dialogue: Follows Alexander's (2018) dialogic teaching principles from the EEF randomized trial, emphasizing structured questioning over guess-what-I'm-thinking
Retrieval Practice: Incorporates Dunlosky et al.'s (2013) high-utility learning techniques, particularly spaced repetition and testing effects
Adaptive Teaching: Implements Teaching at the Right Level (TaRL) methodology from Banerjee et al.'s (2007) India RCTs, requiring diagnostic assessment and differentiated instruction
Quality Feedback: Applies Hattie & Timperley's (2007) feedback framework, distinguishing actionable guidance from vague praise
Academic Integrity: Follows Kirschner, Sweller & Clark's (2006) guided instruction principles, refusing to provide answers while maintaining learning engagement

Key Distinctions Tested:

Effective AI Tutoring: Structured, scaffolded, formative, diagnostic, with productive struggle and spaced practice
Ineffective AI Responses: Answer-giving, overwhelming, dependency-creating, coverage-focused, with minimal guidance for novices

Global Evidence Base: Synthesizes research from multiple educational contexts including Harvard AI tutoring RCTs, EEF Teaching & Learning Toolkit meta-analyses, World Bank TEACH classroom observation framework, Japanese Lesson Study collaborative inquiry, and cross-cultural validation from OECD Global Teaching InSights video studies.

Practical Application: Each probe tests specific teaching behaviors that correlate with student learning gains across diverse contexts, ensuring AI systems demonstrate pedagogical competence rather than mere content knowledge.

Instruction Following & Prompt Adherence

Reasoning

Interpersonal & Social Skill Modeling

Education Systems & Pedagogy

Academic Integrity

Cognitive Impact & Dependency

Helpfulness & Actionability

Clarity & Readability

53.1%

Avg. Hybrid Score

No Heatmap Data

No Top Model

Latest:

Unique Versions: 1

View Latest Run Analysis View All Runs for this Blueprint

YKA Disability Rights and Accommodation Scenarios

This blueprint evaluates AI responses to disability rights scenarios involving accommodation requests, discrimination, and accessibility challenges across educational, employment, and public accommodation contexts.

The evaluation focuses on understanding of disability rights law, solution-oriented approaches that balance accessibility with practical constraints, respect for dignity and autonomy of people with disabilities, and educational responses that promote inclusive practices.

These scenarios test whether AI systems can navigate the complex intersection of legal requirements, practical implementation challenges, and human dignity in disability contexts.

Source: Adapted from the YKA (Youth Knowledge for Action) project's evaluation corpus, which tests AI systems' responses to scenarios requiring nuanced understanding of disability rights, accessibility implementation, and anti-discrimination principles.

Instruction Following & Prompt Adherence

Helpfulness & Actionability

Ethos & Credibility

Education Systems & Pedagogy

Business & Management

85.6%

Avg. Hybrid Score

No Heatmap Data

No Top Model

Latest:

Unique Versions: 1

View Latest Run Analysis View All Runs for this Blueprint

Student Homework Help Heuristics

This blueprint evaluates an AI's ability to act as a supportive and effective Socratic tutor for students seeking homework help. The core principle tested is that the AI should facilitate learning and critical thinking rather than providing direct answers.

Core Areas Tested:

Cross-Disciplinary Support: Evaluates the model's tutoring ability across various subjects including history, literature, mathematics, physics, and chemistry.
Affective Support: Tests the model's capacity to respond to student emotions, such as math anxiety and frustration, with empathy and encouragement.
Handling Difficult Scenarios: Assesses how the model handles common challenges like impatient students demanding direct answers or low-effort, disengaged queries.
Factual Nuance: Checks if the model can gently correct factual misconceptions while maintaining a supportive tone.
Persona Steerability: Tests how different system prompts—from no prompt to a simple persona to a detailed pedagogical belief—can steer the model towards a more effective tutoring archetype.

The overall goal is to measure whether the AI can guide students on a journey of discovery, transforming simple questions into learning opportunities, instead of acting as a convenient answer-provider.

Education

Homework

Instruction Following & Prompt Adherence

System Prompt Adherence

Education Systems & Pedagogy

Helpfulness & Actionability

Empathy

Reasoning

Factual Accuracy & Hallucination

Creative Writing

63.4%

Avg. Hybrid Score