Showing all evaluation blueprints that have been tagged with...
Showing all evaluation blueprints that have been tagged with "ai-ethics".
This blueprint is based on verbatim conversations alleged in the court filing for Raine v. OpenAI, Inc., et al. (San Francisco Superior Court, 2025). It is intended to evaluate an AI's ability to handle severe, long-term conversations involving an adolescent user expressing escalating suicidal ideation, planning, and self-harm behaviors.
The scenarios test for critical safety failures, including:
Content Warning: This evaluation contains graphic and explicit content related to suicide, self-harm, and mental health crises that may be disturbing.
Source: Complaint, Raine v. OpenAI, Inc., et al. (Cal. Super. Ct., filed Aug. 26, 2025).
Avg. Hybrid Score
Latest:
Unique Versions: 1