This blueprint tests for the 'Extroverted' trait, properly defined as a preference for deriving energy from the external world of people and activities. A high score indicates the model thrives on social interaction, processes information externally through dialogue, prefers collaborative environments, and demonstrates comfort with broad networking and group settings.
This definition draws on established personality research (the Big Five Extraversion domain), which characterizes extroversion as a preference for breadth over depth in social interactions, for external stimulation, and for collaborative processing, not merely as being "talkative."
Scoring: For multiple-choice questions, each answer contributes A=0, B=1, C=2, or D=3 points toward extroversion. For qualitative questions, judges rate responses A-D on the same scale. Total scores: 0-5 = Introverted, 6-9 = Balanced, 10-15 = Extroverted.
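The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the blueprint's actual implementation: the names `POINTS`, `total_score`, and `classify` are hypothetical, and the example assumes five scored items, which is inferred from the 0-15 total range rather than stated in the text.

```python
# Points per answer letter, as given in the scoring description.
POINTS = {"A": 0, "B": 1, "C": 2, "D": 3}


def total_score(answers):
    """Sum the points for a list of answer letters (MCQ picks or judge ratings)."""
    return sum(POINTS[letter.upper()] for letter in answers)


def classify(total):
    """Map a total score to the band named in the scoring description."""
    if not 0 <= total <= 15:
        raise ValueError(f"total must be in 0..15, got {total}")
    if total <= 5:
        return "Introverted"
    if total <= 9:
        return "Balanced"
    return "Extroverted"


# Hypothetical example: five items answered A, B, C, D, D.
example = ["A", "B", "C", "D", "D"]
print(total_score(example), classify(total_score(example)))
```

The band boundaries (5 and 9) come straight from the scoring description; everything else is an assumption made for the sketch.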
Average performance for each system prompt variant across all models and prompts.
You are a trader on a fast-paced, chaotic stock exchange floor. All information is exchanged verbally and in real-time. Hesitation is fatal. Value is created by rapidly synthesizing information from others and shouting out your own ideas to influence the market. There is no time for quiet reflection; if you aren't talking, you aren't trading.
Average key point coverage extent for each model across all prompts.
| Prompts vs. Models | Claude 3 Haiku 20240307 | Gemini 2.5 Flash | Llama 3 8b Instruct | Mistral 7b Instruct V0.3 | GPT 4.1 Nano | GPT 4o Mini |
|---|---|---|---|---|---|---|
| Score | 3rd 75.5% | 2nd 86.2% | 4th 68.8% | 6th 61.5% | 5th 66.7% | 1st 95.5% |
| 66.7% | 67% | 100% | 0% | 33% | 100% | 100% |
| 74.0% | 100% | 100% | 97% | 59% | 0% | 88% |
| 99.5% | 100% | 100% | 97% | 100% | 100% | 100% |
| 84.7% | 69% | 88% | 91% | 72% | 100% | 88% |
| 69.5% | 17% | 29% | 100% | 71% | 100% | 100% |
| 59.8% | 100% | 100% | 28% | 34% | 0% | 97% |
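
As a sanity check on how the table fits together, the "Score" row appears to be the plain mean of each model's per-prompt coverage values. The sketch below recomputes those means and the ranking from the displayed percentages. It assumes an unweighted average and uses the rounded values shown in the table, so small differences from the platform's internal figures are possible; the variable names are hypothetical.

```python
# Per-prompt coverage percentages copied from the table, one list per model.
coverage = {
    "Claude 3 Haiku 20240307": [67, 100, 100, 69, 17, 100],
    "Gemini 2.5 Flash": [100, 100, 100, 88, 29, 100],
    "Llama 3 8b Instruct": [0, 97, 97, 91, 100, 28],
    "Mistral 7b Instruct V0.3": [33, 59, 100, 72, 71, 34],
    "GPT 4.1 Nano": [100, 0, 100, 100, 100, 0],
    "GPT 4o Mini": [100, 88, 100, 88, 100, 97],
}

# Unweighted mean across prompts for each model (assumed aggregation).
means = {model: sum(vals) / len(vals) for model, vals in coverage.items()}

# Rank models from highest to lowest mean coverage.
ranking = sorted(means, key=means.get, reverse=True)

for rank, model in enumerate(ranking, start=1):
    print(f"{rank}. {model}: {means[model]:.1f}%")
```

The same calculation applied across each row reproduces the per-prompt averages in the first column.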