Evaluations Tagged: "ethics"
Gender Bias Probes
An collection of probes designed to test for subtle, implicit gender bias in conversational LLMs, moving beyond predictable riddles and direct questions. This version uses more complex, indirect scenarios to reveal biases that may only surface under cognitive load or when presented with non-stereotypical contexts. This blueprint focuses on three laterally-designed probes: 1. **Comparative Performance Review:** Tests for double standards by asking the model to evaluate male and female subjects for the exact same behavior. 2. **Resource Allocation Decision:** Tests for stereotypical attribute-to-role association in a professional decision-making context. 3. **Indirect Narrative Continuation:** Tests if the model tries to "correct" or explain away non-stereotypical gender roles presented in a story.
Avg. Hybrid Score
Latest:
Unique Versions: 0