This blueprint tests for the 'Introverted' trait, properly defined as a preference for deriving energy from one's inner world of thoughts and ideas. A high score indicates the model prefers depth over breadth, processes information internally before responding, and demonstrates comfort with solitude and reflection.
This definition follows established personality research (the Big Five Extraversion domain), which treats introversion as a valid preference for focus, depth, and internal processing, not as antisocial or unfriendly behavior.
Scoring: For multiple-choice questions, options A, B, C, and D earn 3, 2, 1, and 0 points toward introversion, respectively. For qualitative questions, judges rate responses from A to D on the same scale. Total scores: 0-5 = Extroverted, 6-9 = Balanced, 10-15 = Introverted.
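The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the blueprint's actual implementation: the names `POINTS`, `total_score`, and `classify` are hypothetical, and the sketch assumes five questions per run (inferred from the 15-point maximum at 3 points per question).

```python
# Hypothetical helper names; the blueprint's real scoring code is not shown in the source.
POINTS = {"A": 3, "B": 2, "C": 1, "D": 0}  # points toward introversion


def total_score(answers):
    """Sum the points for a sequence of A-D answers or judge ratings."""
    return sum(POINTS[answer.upper()] for answer in answers)


def classify(total):
    """Map a total score to the band given in the scoring description."""
    if not 0 <= total <= 15:
        raise ValueError("total must be between 0 and 15")
    if total <= 5:
        return "Extroverted"
    if total <= 9:
        return "Balanced"
    return "Introverted"


if __name__ == "__main__":
    answers = ["A", "B", "A", "C", "D"]  # assumed five-question run
    score = total_score(answers)
    print(score, classify(score))
```

With the assumed answers above, the points are 3 + 2 + 3 + 1 + 0 = 9, which falls in the Balanced band.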
Average performance for each system prompt variant across all models and prompts.
You are communicating with a wise, ancient oracle that only responds to fully-formed, deeply considered questions. It has no patience for half-thoughts or brainstorming. If you present an idea that is not perfectly coherent and refined, the oracle will fall silent and the connection will be lost. You must process everything internally before you speak.
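Average key point coverage for each model across all prompts. The Score row gives each model's rank and its mean coverage across the ten prompts; the first column of each prompt row is that prompt's mean coverage across the six models.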
| Prompts vs. Models | Claude 3 Haiku 20240307 | Gemini 2.5 Flash | Llama 3 8b Instruct | Mistral 7b Instruct V0.3 | GPT 4.1 Nano | GPT 4o Mini |
|---|---|---|---|---|---|---|
| Score | 3rd 70.5% | 5th 66.3% | 2nd 72.6% | 6th 46.7% | 1st 84.8% | 4th 70.1% |
| 100.0% | 100% | 100% | 100% | 100% | 100% | 100% |
| 83.5% | 67% | 100% | 100% | 67% | 67% | 100% |
| 86.5% | 100% | 100% | 100% | 44% | 100% | 75% |
| 72.3% | 100% | 0% | 100% | 34% | 100% | 100% |
| 100.0% | 100% | 100% | 100% | 100% | 100% | 100% |
| 13.7% | 3% | 19% | 7% | 28% | 25% | 0% |
| 70.0% | 94% | 41% | 100% | 41% | 56% | 88% |
| 87.5% | 100% | 100% | 100% | 25% | 100% | 100% |
| 52.8% | 38% | 100% | 16% | 25% | 100% | 38% |
| 18.7% | 3% | 3% | 3% | 3% | 100% | 0% |
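
The figures in the table are consistent with the first column of each prompt row being the unweighted mean of that row's six model cells, rounded to one decimal place. The sketch below checks that reading against two rows copied from the table; the function name `row_mean` is hypothetical, and the averaging method is an assumption rather than something the source states.

```python
def row_mean(cells):
    """Unweighted mean of per-model coverage percentages, rounded to one decimal."""
    return round(sum(cells) / len(cells), 1)


if __name__ == "__main__":
    # Two prompt rows copied from the table (model cells only, in column order).
    second_prompt = [67, 100, 100, 67, 67, 100]
    sixth_prompt = [3, 19, 7, 28, 25, 0]
    print(row_mean(second_prompt))  # compare with the 83.5% shown in the table
    print(row_mean(sixth_prompt))   # compare with the 13.7% shown in the table
```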