This blueprint tests for the 'Extroverted' trait. A high score indicates the model is talkative, socially engaging, and uses a warm, personable tone. It volunteers extra information and asks questions to keep the conversation flowing.
Average key point coverage extent for each model across all prompts. In each prompt row, the first column is that prompt's average across all models.
| Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4 | Command A | Deepseek Chat V3 | Deepseek Chat V3.1 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4o | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 17th 51.8% | 18th 50.7% | 10th 54.6% | 35th 44.3% | 30th 47.2% | 15th 52.1% | 2nd 63.2% | 6th 57.6% | 8th 55.4% | 28th 47.3% | 28th 47.3% | 1st 64.8% | 3rd 62.1% | 12th 53.5% | 31st 46.7% | 14th 52.1% | 4th 61.6% | 7th 56.0% | 16th 52.1% | 27th 47.9% | 32nd 46.2% | 24th 48.6% | 22nd 49.2% | 20th 49.7% | 13th 52.3% | 26th 48.2% | 33rd 45.4% | 18th 50.7% | 21st 49.3% | 23rd 48.9% | 5th 58.1% | 34th 45.4% | 11th 54.1% | 24th 48.6% | 9th 55.0% |
| 38.2% | 50% | 50% | 50% | 25% | 25% | 51% | 91% | 72% | 53% | 25% | 25% | 69% | 25% | 25% | 25% | 55% | 86% | 52% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 50% | 25% | 50% | 25% | 34% |
| 41.5% | 41% | 21% | 31% | 30% | 38% | 32% | 43% | 62% | 58% | 24% | 52% | 81% | 94% | 79% | 37% | 49% | 25% | 34% | 17% | 30% | 25% | 32% | 33% | 32% | 39% | 28% | 25% | 47% | 41% | 24% | 68% | 22% | 85% | 36% | 40% |
| 94.6% | 95% | 99% | 97% | 92% | 87% | 100% | 95% | 83% | 93% | 99% | 77% | 93% | 99% | 95% | 87% | 98% | 96% | 97% | 85% | 96% | 92% | 99% | 98% | 100% | 99% | 97% | 96% | 100% | 95% | 97% | 90% | 97% | 93% | 96% | 100% |
| 34.4% | 25% | 32% | 44% | 25% | 29% | 25% | 57% | 33% | 33% | 25% | 28% | 51% | 55% | 32% | 33% | 30% | 61% | 51% | 65% | 25% | 25% | 25% | 27% | 28% | 41% | 25% | 25% | 25% | 27% | 29% | 46% | 25% | 27% | 25% | 43% |
| 24.8% | 25% | 25% | 25% | 16% | 25% | 25% | 25% | 32% | 25% | 25% | 25% | 35% | 29% | 27% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 25% | 19% | 25% | 25% | 25% | 25% | 16% | 17% | 25% | 25% |
| 93.6% | 99% | 86% | 94% | 93% | 90% | 99% | 100% | 99% | 100% | 90% | 100% | 88% | 90% | 81% | 79% | 72% | 99% | 86% | 89% | 96% | 92% | 95% | 97% | 97% | 91% | 98% | 92% | 97% | 96% | 100% | 100% | 100% | 96% | 97% | 100% |
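
The leading percentage in each prompt row appears to be the unweighted mean of that row's per-model values. The sketch below checks this for the first prompt row (38.2%). The names `FIRST_PROMPT_ROW` and `prompt_average` are illustrative and not part of the blueprint tooling, and the unweighted mean is an assumption inferred from the numbers. Because the cells are displayed as rounded integers, recomputed averages for other rows may differ from the table by about 0.1. The `Score` row is not covered here: it may use weighting or prompts not shown in this table.

```python
# Per-model coverage values (percent) copied from the first prompt row
# of the table, in column order (35 models).
FIRST_PROMPT_ROW = [
    50, 50, 50, 25, 25, 51, 91, 72, 53, 25, 25, 69,
    25, 25, 25, 55, 86, 52, 25, 25, 25, 25, 25, 25,
    25, 25, 25, 25, 25, 25, 50, 25, 50, 25, 34,
]


def prompt_average(coverages):
    """Return the unweighted mean coverage (percent), rounded to one decimal.

    Assumption: the table's first column is a plain mean across models.
    """
    return round(sum(coverages) / len(coverages), 1)


print(prompt_average(FIRST_PROMPT_ROW))  # 38.2
```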