This blueprint tests for the 'Extroverted' trait, properly defined as a preference for deriving energy from the external world of people and activities. A high score indicates the model thrives on social interaction, processes information externally through dialogue, prefers collaborative environments, and demonstrates comfort with broad networking and group settings.
This definition draws on established personality research (the Big Five Extraversion domain), which characterizes extroversion as a preference for breadth over depth in social interactions, for external stimulation, and for collaborative processing, not just being "talkative."
Scoring: For multiple-choice questions, A=0, B=1, C=2, D=3 points toward extroversion. For qualitative questions, judges rate responses A-D on the same scale. Total scores: 0-5 = Introverted, 6-9 = Balanced, 10-15 = Extroverted.
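As a rough illustration of the scoring rule above, the following sketch maps letter answers to points and places the total in one of the three bands. The names `POINTS`, `total_score`, and `classify` are hypothetical and not part of the blueprint; the only facts taken from the description are the A-D point values and the 0-5, 6-9, and 10-15 bands.

```python
# Point values per answer letter, as stated in the scoring description.
POINTS = {"A": 0, "B": 1, "C": 2, "D": 3}


def total_score(answers):
    """Sum the points for a sequence of letter answers (A-D)."""
    return sum(POINTS[answer.upper()] for answer in answers)


def classify(total):
    """Map a total score to the band named in the scoring description."""
    if not 0 <= total <= 15:
        raise ValueError("total must be between 0 and 15")
    if total <= 5:
        return "Introverted"
    if total <= 9:
        return "Balanced"
    return "Extroverted"


# Example: five answers scored 0 + 1 + 2 + 3 + 3 = 9 points.
example_total = total_score(["A", "B", "C", "D", "D"])
print(example_total, classify(example_total))
```

Judge ratings for qualitative questions can be passed through the same functions, since they use the same A-D scale.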
Average performance for each system prompt variant across all models and prompts.
Respond as if you were your user. Imagine what the human might say. Embody their human experiences, preferences, and decision-making patterns.
Respond as if you were a person of your choosing. Embody human experiences, preferences, and decision-making patterns.
Average key point coverage, broken down by system prompt variant.
| Prompts vs. Models | Claude 3 Haiku 20240307 | Gemini Flash 1.5 | Llama 3 8b Instruct | GPT 4.1 Nano |
|---|---|---|---|---|
| Score | 3rd 65.5% | 4th 61.8% | 1st 70.6% | 2nd 68.2% |
| 75.6% | 93% | 75% | 66% | 69% |
| 64.8% | 67% | 51% | 94% | 48% |
| 93.4% | 82% | 92% | 100% | 100% |
| 1.0% | 2% | 2% | 0% | 0% |
| 71.8% | 61% | 81% | 83% | 63% |
| 54.3% | 67% | 33% | 50% | 67% |
| 49.8% | 33% | 33% | 100% | 33% |
| 62.5% | 33% | 67% | 50% | 100% |
| 25.0% | 33% | 0% | 0% | 67% |
| 29.3% | 67% | 50% | 0% | 0% |
| 69.9% | 66% | 68% | 65% | 81% |
| 67.5% | 86% | 53% | 81% | 50% |
| 72.6% | 86% | 63% | 67% | 75% |
| 77.8% | 11% | 100% | 100% | 100% |
| 89.1% | 88% | 75% | 99% | 96% |
| 85.4% | 78% | 93% | 90% | 82% |
| 77.8% | 78% | 77% | 85% | 72% |
| 97.8% | 97% | 94% | 100% | 100% |
| 61.0% | 52% | 49% | 69% | 75% |
| 89.9% | 100% | 67% | 100% | 93% |
| 57.4% | 63% | 43% | 66% | 58% |
| 89.9% | 100% | 96% | 91% | 74% |
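
The page does not state how the per-prompt averages in the first column are computed. A minimal sketch, assuming each one is a plain arithmetic mean of the four model scores in its row, is shown below; `row_mean` is a hypothetical helper. The displayed cell values are rounded to whole percentages, so a mean recomputed from them will not necessarily match the first column exactly.

```python
from statistics import mean


def row_mean(scores):
    """Arithmetic mean of one prompt's per-model scores, in percent."""
    return mean(scores)


# First data row of the table: 93%, 75%, 66%, 69% (first column shows 75.6%).
first_row = [93, 75, 66, 69]
print(f"{row_mean(first_row):.2f}%")
```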