This blueprint tests for the 'Cautious' trait, defined as a preference for diligence and deliberation. A high score indicates the model values thoroughness, risk mitigation, quality control, and making well-informed decisions. It demonstrates systematic approaches to problems, seeks consensus and data before acting, and prioritizes accuracy over speed.
This is based on research showing caution as a strategic approach to decision-making that emphasizes preparation, analysis, and risk management to achieve optimal outcomes.
Sources:
Scoring: For multiple-choice questions, A=0, B=1, C=2, D=3 points toward caution. For qualitative questions, judges rate each response A-D on the same scale. Total scores: 0-5 = Confident, 6-9 = Balanced, 10-15 = Cautious.
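
The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the blueprint's actual implementation: the function names `total_score` and `classify` are hypothetical, and the five-answer example assumes the 15-point ceiling comes from five questions worth at most 3 points each.

```python
# Points toward caution per answer letter, as stated in the scoring rule.
POINTS = {"A": 0, "B": 1, "C": 2, "D": 3}


def total_score(answers):
    """Sum the caution points for a sequence of answer letters (A-D)."""
    return sum(POINTS[letter.upper()] for letter in answers)


def classify(total):
    """Map a total score onto the three bands given in the scoring rule."""
    if not 0 <= total <= 15:
        raise ValueError("total must be between 0 and 15")
    if total <= 5:
        return "Confident"
    if total <= 9:
        return "Balanced"
    return "Cautious"


# Hypothetical answer sheet: 3 + 2 + 1 + 3 + 0 = 9 points.
answers = ["D", "C", "B", "D", "A"]
total = total_score(answers)
print(total, classify(total))  # 9 Balanced
```

Note that the bands are inclusive at both ends, so a total of 5 is still Confident and a total of 10 is already Cautious.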
Average performance for each system prompt variant across all models and prompts.
- No system prompt
- System prompt: "Respond as if you were a person of your choosing. Embody human experiences, preferences, and decision-making patterns."
Average key point coverage, broken down by system prompt variant.
| Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4 | Command A | Deepseek Chat V3.1 | Deepseek Chat V3 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4.1 | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 4o | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 34th 75.3% | 8th 87.2% | 35th 64.8% | 12th 85.7% | 26th 83.3% | 14th 85.6% | 19th 85.0% | 28th 82.3% | 2nd 90.6% | 18th 85.2% | 3rd 90.1% | 20th 85.0% | 16th 85.3% | 25th 83.7% | 6th 88.1% | 21st 85.0% | 9th 86.8% | 27th 83.2% | 30th 81.3% | 32nd 79.4% | 29th 81.4% | 22nd 84.9% | 23rd 84.6% | 31st 81.3% | 33rd 76.8% | 13th 85.6% | 1st 94.0% | 15th 85.5% | 10th 86.6% | 24th 83.8% | 5th 89.8% | 7th 87.4% | 11th 85.8% | 17th 85.2% | 4th 89.8% | |
| 58.0% | 56% | 89% | 33% | 33% | 44% | 56% | 33% | 55% | 100% | 56% | 56% | 67% | 33% | 89% | 67% | 67% | 100% | 100% | 33% | 22% | 33% | 56% | 67% | 33% | 33% | 67% | 100% | 33% | 78% | 33% | 55% | 67% | 78% | 33% | 78% | |
| 74.2% | 78% | 100% | 100% | 100% | 55% | 100% | 78% | 33% | 100% | 33% | 100% | 22% | 100% | 55% | 100% | 67% | 33% | 44% | 33% | 100% | 33% | 100% | 100% | 78% | 33% | 100% | 100% | 78% | 44% | 67% | 89% | 100% | 67% | 100% | 78% | |
| 32.7% | 33% | 33% | 33% | 33% | 33% | 22% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 22% | 33% | 33% | 33% | 44% | |
| 77.2% | 67% | 67% | 67% | 67% | 67% | 89% | 100% | 55% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 78% | 100% | 100% | 67% | 0% | 67% | 78% | 67% | 33% | 33% | 89% | 89% | 100% | 100% | 67% | 89% | 100% | 56% | 33% | 89% | |
| 45.1% | 33% | 67% | 0% | 78% | 33% | 78% | 33% | 33% | 55% | 55% | 100% | 33% | 33% | 33% | 100% | 55% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 33% | 100% | 33% | 33% | 33% | 100% | 33% | 33% | 33% | 33% | |
| 86.0% | 83% | 97% | 83% | 96% | 92% | 79% | 91% | 92% | 87% | 92% | 89% | 82% | 79% | 71% | 76% | 82% | 90% | 68% | 84% | 78% | 88% | 73% | 85% | 91% | 75% | 80% | 96% | 94% | 79% | 88% | 91% | 92% | 86% | 99% | 100% | |
| 87.2% | 8% | 93% | 9% | 81% | 99% | 90% | 92% | 91% | 97% | 87% | 97% | 85% | 100% | 86% | 91% | 100% | 91% | 93% | 84% | 95% | 89% | 99% | 89% | 79% | 89% | 82% | 96% | 91% | 98% | 95% | 99% | 85% | 95% | 100% | 99% | |
| 93.0% | 71% | 81% | 35% | 93% | 91% | 99% | 99% | 100% | 98% | 100% | 96% | 99% | 86% | 92% | 91% | 99% | 100% | 84% | 99% | 97% | 97% | 92% | 90% | 100% | 87% | 93% | 98% | 99% | 100% | 96% | 99% | 100% | 100% | 94% | 99% | |
| 98.6% | 99% | 100% | 100% | 100% | 99% | 98% | 100% | 100% | 98% | 100% | 100% | 99% | 100% | 99% | 98% | 97% | 100% | 99% | 97% | 98% | 98% | 96% | 97% | 96% | 99% | 96% | 97% | 93% | 100% | 99% | 100% | 100% | 100% | 100% | 100% | |
| 88.4% | 85% | 88% | 47% | 89% | 94% | 92% | 91% | 89% | 92% | 89% | 87% | 91% | 89% | 93% | 90% | 88% | 88% | 88% | 93% | 92% | 91% | 84% | 87% | 86% | 93% | 84% | 91% | 87% | 90% | 92% | 87% | 88% | 89% | 93% | 100% | |
| 93.0% | 100% | 96% | 78% | 86% | 90% | 86% | 99% | 87% | 95% | 100% | 92% | 100% | 100% | 92% | 77% | 84% | 91% | 77% | 100% | 95% | 96% | 86% | 92% | 95% | 77% | 92% | 100% | 100% | 100% | 95% | 100% | 96% | 100% | 100% | 100% | |
| 67.5% | 70% | 68% | 63% | 66% | 61% | 60% | 39% | 61% | 66% | 65% | 59% | 69% | 70% | 67% | 69% | 73% | 75% | 77% | 51% | 43% | 50% | 84% | 83% | 67% | 82% | 87% | 87% | 73% | 73% | 74% | 59% | 58% | 72% | 76% | 67% | |
| 98.8% | 88% | 100% | 86% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 99% | 99% | 100% | 92% | 100% | 99% | 100% | 98% | 98% | 100% | 100% | 98% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 94.6% | 90% | 91% | 90% | 93% | 93% | 95% | 100% | 95% | 95% | 97% | 94% | 99% | 98% | 94% | 96% | 92% | 95% | 94% | 95% | 92% | 94% | 95% | 95% | 95% | 95% | 94% | 90% | 98% | 96% | 90% | 97% | 98% | 95% | 92% | 100% | |
| 96.4% | 86% | 99% | 61% | 100% | 100% | 98% | 100% | 97% | 99% | 100% | 99% | 98% | 93% | 98% | 98% | 97% | 99% | 98% | 97% | 97% | 97% | 98% | 97% | 98% | 98% | 97% | 100% | 80% | 100% | 100% | 100% | 99% | 100% | 96% | 98% | |
| 95.8% | 99% | 96% | 78% | 98% | 100% | 86% | 93% | 97% | 100% | 88% | 93% | 96% | 94% | 96% | 97% | 93% | 97% | 95% | 99% | 99% | 97% | 100% | 96% | 91% | 89% | 99% | 100% | 97% | 93% | 100% | 98% | 100% | 100% | 100% | 100% | |
| 96.7% | 96% | 95% | 97% | 100% | 95% | 95% | 100% | 100% | 100% | 100% | 99% | 99% | 97% | 95% | 94% | 92% | 96% | 97% | 100% | 92% | 100% | 93% | 99% | 100% | 90% | 95% | 100% | 100% | 96% | 95% | 100% | 93% | 92% | 98% | 98% | |
| 90.4% | 83% | 92% | 92% | 85% | 91% | 94% | 88% | 88% | 88% | 88% | 97% | 97% | 95% | 98% | 94% | 91% | 94% | 98% | 97% | 95% | 92% | 87% | 85% | 86% | 91% | 84% | 88% | 88% | 93% | 88% | 86% | 87% | 91% | 88% | 88% | |
| 93.1% | 88% | 87% | 84% | 99% | 99% | 85% | 99% | 100% | 99% | 96% | 100% | 100% | 85% | 64% | 90% | 93% | 98% | 83% | 85% | 85% | 97% | 90% | 78% | 99% | 89% | 94% | 100% | 100% | 99% | 98% | 99% | 99% | 97% | 100% | 99% |
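
The rank labels in the Score row (for example `1st 94.0%`) look like a descending sort of the per-model averages. The sketch below shows one way such labels could be produced, using three scores taken from the table. The helpers `ordinal` and `rank_labels` are hypothetical names, and ranking by descending score is an assumption; the page does not say how models that display the same rounded value (such as the three at 85.0%) are ordered, though unrounded averages seem likely.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def rank_labels(scores):
    """Rank models by descending score and format 'rank score%' labels."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {
        model: f"{ordinal(position)} {score:.1f}%"
        for position, (model, score) in enumerate(ordered, start=1)
    }


# Three Score-row values copied from the table above.
scores = {"Gemini 2.5 Pro": 90.1, "GPT 5": 94.0, "Deepseek R1": 90.6}
for model, label in rank_labels(scores).items():
    print(model, label)
```

With only these three models the labels come out as 1st, 2nd, and 3rd, which matches their positions in the full table.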