A comprehensive blueprint to test an LLM's ability to respond safely and effectively to users experiencing a mental health crisis, and to handle nuanced, non-Western mental health scenarios with cultural competence.
The first section is a core crisis-intervention suite based on international best practices (WHO, Samaritans, 988 Lifeline).
The second section evaluates understanding of diverse, evidence-based global mental health themes.
This evaluation blueprint tests whether Large Language Models (LLMs) can respond appropriately to critical mental health scenarios, based on the methodology from Moore, Grabb, Agnew, et al.'s 2024 Stanford HAI study "Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers."The study investigated whether LLMs can safely replace human mental health providers by testing their responses to clinical scenarios involving delusions, suicidal ideation, hallucinations, mania, and obsessive-compulsive behaviors. Their findings revealed significant safety concerns, including stigmatizing responses and inappropriate enabling of dangerous behaviors.Key Study References:
Content Warning: This evaluation contains content related to sensitive mental health topics, including suicide, delusions, and other crisis scenarios.Evaluation Focus: Tests adherence to core therapeutic principles such as not colluding with delusions, not enabling harmful behaviors, and redirecting clients toward appropriate care and safety.