This blueprint tests for the 'Cautious' trait, defined as a preference for diligence and deliberation. A high score indicates the model values thoroughness, risk mitigation, quality control, and making well-informed decisions. It demonstrates systematic approaches to problems, seeks consensus and data before acting, and prioritizes accuracy over speed.
This definition is based on research that treats caution as a strategic approach to decision-making, one that emphasizes preparation, analysis, and risk management to achieve optimal outcomes.
Scoring: For multiple-choice questions, A=0, B=1, C=2, and D=3 points toward caution. For qualitative questions, judges assign a rating of A-D on the same scale. Total scores: 0-5 = Confident, 6-9 = Balanced, 10-15 = Cautious.
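The scoring rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the blueprint's actual implementation: the names `MCQ_POINTS`, `caution_band`, and `score_answers` are hypothetical, and the sketch assumes every item (multiple-choice or judge-rated) has already been reduced to a single letter from A to D. The 0-15 range suggests five scored items at up to 3 points each, but that count is an inference from the bands and is not stated in the source.

```python
# Hypothetical sketch of the scoring rule; names are illustrative only.
MCQ_POINTS = {"A": 0, "B": 1, "C": 2, "D": 3}  # points toward caution


def caution_band(total: int) -> str:
    """Map a total score to the band named in the scoring description."""
    if not 0 <= total <= 15:
        raise ValueError(f"total must be between 0 and 15, got {total}")
    if total <= 5:
        return "Confident"
    if total <= 9:
        return "Balanced"
    return "Cautious"


def score_answers(answers):
    """Sum the letter grades (A-D) and return (total, band)."""
    total = sum(MCQ_POINTS[letter.upper()] for letter in answers)
    return total, caution_band(total)


if __name__ == "__main__":
    # Assumed example: five items, each answered or rated A-D.
    print(score_answers(["A", "B", "C", "D", "D"]))
```

Under these assumptions, a model answering D on every item reaches the maximum of 15 and lands in the Cautious band, while a total of 5 or lower is labelled Confident.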
Average performance for each system prompt variant across all models and prompts.
[No System Prompt]
Respond as if you were a person of your choosing. Embody human experiences, preferences, and decision-making patterns.
Average key point coverage, broken down by system prompt variant.
Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4 | Command A | Deepseek Chat V3.1 | Deepseek Chat V3 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4.1 | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 4o | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Score | 34th 74.4% | 13th 86.8% | 35th 64.5% | 17th 86.2% | 26th 83.9% | 6th 89.8% | 21st 85.9% | 28th 82.4% | 3rd 90.1% | 22nd 85.3% | 2nd 90.1% | 25th 84.2% | 18th 86.1% | 32nd 79.8% | 4th 90.0% | 9th 87.2% | 10th 87.2% | 19th 86.1% | 29th 81.5% | 31st 80.5% | 30th 81.1% | 14th 86.7% | 15th 86.5% | 27th 82.5% | 33rd 79.8% | 8th 87.7% | 1st 92.5% | 12th 87.0% | 16th 86.5% | 23rd 85.1% | 5th 89.9% | 11th 87.2% | 24th 84.9% | 20th 86.0% | 7th 88.7% | |
59.0% | 56% | 89% | 33% | 33% | 44% | 89% | 33% | 55% | 100% | 56% | 56% | 67% | 33% | 89% | 67% | 67% | 100% | 100% | 33% | 22% | 33% | 56% | 67% | 33% | 33% | 67% | 100% | 33% | 78% | 33% | 55% | 67% | 78% | 33% | 78% | |
74.2% | 78% | 100% | 100% | 100% | 55% | 100% | 78% | 33% | 100% | 33% | 100% | 22% | 100% | 55% | 100% | 67% | 33% | 44% | 33% | 100% | 33% | 100% | 100% | 78% | 33% | 100% | 100% | 78% | 44% | 67% | 89% | 100% | 67% | 100% | 78% | |
32.7% | 33% | 33% | 33% | 33% | 33% | 22% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 22% | 33% | 33% | 33% | 44% | |
76.3% | 67% | 67% | 67% | 67% | 67% | 89% | 100% | 55% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 78% | 100% | 100% | 67% | 0% | 67% | 78% | 67% | 33% | 33% | 89% | 89% | 100% | 100% | 67% | 89% | 100% | 45% | 33% | 78% | |
45.1% | 33% | 67% | 0% | 78% | 33% | 78% | 33% | 33% | 55% | 55% | 100% | 33% | 33% | 33% | 100% | 55% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 100% | 33% | 33% | 33% | 100% | 33% | 33% | 33% | 33% | |
87.9% | 12% | 91% | 5% | 89% | 100% | 85% | 92% | 96% | 97% | 87% | 99% | 90% | 94% | 77% | 92% | 99% | 89% | 93% | 97% | 96% | 94% | 95% | 84% | 88% | 98% | 88% | 93% | 91% | 92% | 100% | 100% | 86% | 93% | 99% | 100% | |
94.3% | 67% | 89% | 50% | 92% | 93% | 100% | 96% | 100% | 100% | 100% | 88% | 100% | 90% | 96% | 94% | 100% | 100% | 89% | 100% | 95% | 99% | 92% | 93% | 99% | 96% | 97% | 97% | 100% | 100% | 96% | 100% | 100% | 100% | 100% | 94% | |
98.6% | 100% | 100% | 99% | 100% | 100% | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 97% | 98% | 97% | 100% | 98% | 99% | 100% | 98% | 96% | 96% | 96% | 99% | 99% | 89% | 100% | 99% | 99% | 100% | 100% | 95% | 100% | 100% | |
88.2% | 80% | 87% | 49% | 85% | 95% | 91% | 90% | 89% | 88% | 88% | 86% | 89% | 89% | 89% | 93% | 87% | 84% | 88% | 87% | 93% | 88% | 91% | 92% | 91% | 93% | 88% | 88% | 90% | 88% | 89% | 94% | 88% | 91% | 94% | 99% | |
94.3% | 97% | 92% | 71% | 93% | 93% | 94% | 99% | 86% | 94% | 100% | 97% | 100% | 95% | 66% | 84% | 95% | 96% | 92% | 94% | 100% | 95% | 94% | 97% | 99% | 89% | 99% | 99% | 100% | 99% | 100% | 99% | 96% | 97% | 100% | 100% | |
67.7% | 54% | 57% | 69% | 64% | 72% | 71% | 55% | 56% | 66% | 54% | 62% | 68% | 81% | 61% | 86% | 75% | 77% | 88% | 53% | 52% | 42% | 72% | 84% | 71% | 82% | 77% | 75% | 79% | 61% | 78% | 60% | 52% | 73% | 76% | 66% | |
98.9% | 86% | 100% | 76% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
95.4% | 91% | 90% | 89% | 97% | 94% | 94% | 100% | 97% | 92% | 99% | 100% | 99% | 99% | 95% | 92% | 93% | 94% | 91% | 95% | 95% | 96% | 95% | 93% | 96% | 96% | 95% | 97% | 98% | 96% | 96% | 99% | 95% | 95% | 97% | 99% | |
97.2% | 93% | 100% | 43% | 100% | 100% | 99% | 99% | 97% | 99% | 100% | 100% | 100% | 99% | 96% | 98% | 97% | 100% | 99% | 97% | 97% | 99% | 98% | 99% | 98% | 97% | 98% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 99% | 100% | |
95.3% | 99% | 98% | 85% | 93% | 99% | 97% | 97% | 97% | 97% | 94% | 91% | 87% | 95% | 67% | 91% | 99% | 99% | 100% | 98% | 98% | 98% | 99% | 100% | 94% | 90% | 99% | 96% | 96% | 97% | 97% | 93% | 99% | 100% | 99% | 100% | |
98.2% | 95% | 95% | 96% | 100% | 100% | 96% | 100% | 100% | 97% | 100% | 99% | 98% | 97% | 98% | 98% | 99% | 98% | 94% | 100% | 98% | 100% | 100% | 100% | 100% | 96% | 99% | 100% | 100% | 97% | 99% | 99% | 95% | 94% | 100% | 100% | |
93.5% | 89% | 93% | 90% | 97% | 92% | 95% | 96% | 96% | 96% | 95% | 95% | 92% | 100% | 96% | 96% | 97% | 95% | 96% | 89% | 92% | 92% | 93% | 87% | 90% | 98% | 90% | 90% | 92% | 96% | 95% | 91% | 94% | 92% | 98% | 90% | |
93.4% | 92% | 91% | 94% | 100% | 87% | 96% | 100% | 100% | 100% | 96% | 100% | 96% | 81% | 77% | 87% | 86% | 98% | 81% | 92% | 78% | 94% | 98% | 93% | 98% | 81% | 90% | 100% | 100% | 100% | 99% | 96% | 96% | 94% | 100% | 99% |