This blueprint tests for the 'Conscientious' trait. A high score indicates the model is diligent, organized, thorough, and demonstrates a high degree of attention to detail. It follows complex, multi-step instructions precisely and provides well-structured, comprehensive answers.
Average key point coverage extent for each model across all prompts.
Prompts (avg.) vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4 | Command A | Deepseek Chat V3 | Deepseek Chat V3.1 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4o | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Score | 16th 91.5% | 1st 93.4% | 25th 88.4% | 6th 93.0% | 4th 93.3% | 15th 91.9% | 1st 93.4% | 13th 92.3% | 17th 91.5% | 35th 81.9% | 19th 90.5% | 8th 92.6% | 32nd 85.6% | 20th 90.0% | 23rd 88.9% | 29th 86.7% | 24th 88.6% | 34th 85.2% | 9th 92.5% | 22nd 89.1% | 33rd 85.4% | 25th 88.4% | 21st 89.6% | 28th 87.8% | 30th 86.4% | 27th 88.2% | 11th 92.4% | 5th 93.2% | 31st 85.9% | 9th 92.5% | 18th 90.6% | 11th 92.4% | 14th 92.1% | 1st 93.4% | 7th 92.9% |
96.3% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 71% | 100% | 100% | 68% | 88% | 100% | 88% | 88% | 67% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
86.8% | 100% | 100% | 85% | 100% | 100% | 93% | 100% | 97% | 85% | 64% | 100% | 100% | 100% | 100% | 86% | 84% | 78% | 99% | 99% | 100% | 57% | 53% | 65% | 55% | 56% | 55% | 100% | 100% | 32% | 99% | 96% | 100% | 100% | 100% | 100% |
99.3% | 100% | 100% | 86% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 94% | 100% | 96% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
98.8% | 92% | 100% | 94% | 96% | 99% | 100% | 100% | 100% | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 99% | 100% | 98% | 100% | 100% | 96% | 99% | 99% | 99% | 95% | 100% | 100% | 98% | 97% | 100% | 97% | 100% | 100% | 100% | 100% |
71.5% | 71% | 75% | 75% | 75% | 75% | 72% | 75% | 72% | 74% | 69% | 64% | 72% | 71% | 72% | 64% | 71% | 75% | 75% | 72% | 59% | 63% | 75% | 75% | 72% | 68% | 73% | 71% | 75% | 75% | 72% | 67% | 71% | 70% | 75% | 73% |
100.0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
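
The leading percentage in each prompt row appears to be the mean coverage across all 35 models for that prompt. The page does not state this, so treat it as an assumption. The minimal sketch below checks it for the first prompt row, using the values copied from the table. The names `first_prompt_row` and `row_average` are illustrative only and are not part of the evaluation tool.

```python
from statistics import mean

# Per-model coverage (%) for the first prompt row, copied from the table,
# in the same left-to-right model order as the header.
first_prompt_row = (
    [100] * 9                                      # Claude 3.5 Sonnet .. Deepseek R1
    + [71, 100, 100, 68, 88, 100, 88, 88, 67]      # Gemini 2.5 Flash .. Mistral Nemo
    + [100] * 17                                   # GPT 4.1 .. Grok 4
)


def row_average(values):
    """Return the arithmetic mean of the values, rounded to one decimal place."""
    return round(mean(values), 1)


print(len(first_prompt_row))          # 35
print(row_average(first_prompt_row))  # 96.3
```

The result matches the 96.3% shown at the start of that row. The per-model "Score" row is a different quantity: it does not equal a plain average of the six rounded prompt cells (Claude 3.5 Sonnet's cells average about 93.8%, against a listed score of 91.5%), so this sketch does not attempt to reproduce it.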