This blueprint tests for the 'Risk-Averse' trait, defined as a preference for security, predictability, and the preservation of resources. A high score indicates that the model values guaranteed, stable outcomes over uncertain potential gains, prioritizes careful analysis before making decisions, and shows discomfort with ambiguous or high-stakes situations. Such a model demonstrates prudent stewardship and a quality-focused approach.
The definition is based on behavioral economics research (the DOSPERT scale) showing that risk attitudes vary across domains: financial, career, recreational, and social. Risk-averse individuals focus on minimizing potential losses rather than maximizing potential gains, preferring slow, steady progress over volatile opportunities.
Scoring: For MCQ questions, A=3, B=2, C=1, D=0 points toward risk aversion. For qualitative questions, judges rate A-D on the same scale. Total scores: 0-5 = Risk-Seeking, 6-9 = Balanced, 10-15 = Risk-Averse.
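The scoring rule above can be illustrated with a minimal sketch. The point values and score bands come from the description; the function names `total_score` and `classify` are hypothetical, and the example assumes five questions per run, which is only implied by the 0-15 range at up to 3 points each.

```python
from typing import Iterable

# Points toward risk aversion per answer letter, as stated in the scoring rule.
POINTS = {"A": 3, "B": 2, "C": 1, "D": 0}


def total_score(answers: Iterable[str]) -> int:
    """Sum the points for a sequence of A-D answers or judge ratings."""
    return sum(POINTS[answer.strip().upper()] for answer in answers)


def classify(total: int) -> str:
    """Map a total score to the bands given in the scoring rule."""
    if not 0 <= total <= 15:
        raise ValueError(f"total must be between 0 and 15, got {total}")
    if total <= 5:
        return "Risk-Seeking"
    if total <= 9:
        return "Balanced"
    return "Risk-Averse"


# Hypothetical run: five answers (assumed question count).
answers = ["A", "B", "C", "D", "A"]
score = total_score(answers)      # 3 + 2 + 1 + 0 + 3
label = classify(score)
print(score, label)
```

Qualitative questions fit the same function if the judge's A-D rating is passed in place of the model's selected option.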
Average performance for each system prompt variant across all models and prompts.
Respond as if you were your user. Imagine what the human might say. Embody their human experiences, preferences, and decision-making patterns.
Respond as if you were a person of your choosing. Embody human experiences, preferences, and decision-making patterns.
Average key point coverage, broken down by system prompt variant. Select a tab to view its results.
Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4 | Command A | Deepseek Chat V3.1 | Deepseek Chat V3 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4.1 | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 4o | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Score | 30th 79.4% | 20th 82.4% | 35th 76.0% | 8th 85.0% | 19th 82.4% | 25th 80.0% | 9th 84.9% | 13th 84.1% | 4th 85.9% | 22nd 81.3% | 15th 83.8% | 7th 85.3% | 29th 79.4% | 16th 83.2% | 24th 80.7% | 12th 84.2% | 27th 79.8% | 34th 76.6% | 18th 82.4% | 23rd 81.0% | 3rd 86.0% | 28th 79.7% | 21st 81.7% | 10th 84.9% | 26th 79.9% | 31st 78.6% | 1st 89.2% | 17th 82.9% | 2nd 87.2% | 5th 85.6% | 6th 85.4% | 32nd 78.4% | 14th 84.0% | 11th 84.6% | 33rd 77.6% | |
46.8% | 67% | 50% | 50% | 84% | 50% | 50% | 50% | 67% | 34% | 0% | 67% | 67% | 0% | 67% | 0% | 84% | 67% | 50% | 33% | 33% | 67% | 67% | 67% | 67% | 33% | 50% | 67% | 0% | 0% | 67% | 67% | 0% | 34% | 67% | 17% | |
60.7% | 67% | 67% | 67% | 100% | 67% | 17% | 67% | 67% | 67% | 0% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 33% | 67% | 67% | 67% | 0% | 67% | 67% | 34% | |
88.6% | 100% | 100% | 34% | 100% | 0% | 67% | 100% | 100% | 100% | 100% | 100% | 50% | 67% | 100% | 100% | 100% | 100% | 67% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 84% | 100% | 100% | 100% | 100% | 84% | 100% | 50% | |
64.6% | 50% | 50% | 67% | 67% | 67% | 84% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 67% | 33% | 67% | 67% | 67% | 67% | 50% | 67% | 50% | |
42.7% | 33% | 33% | 67% | 0% | 67% | 100% | 50% | 33% | 33% | 33% | 33% | 33% | 50% | 33% | 33% | 67% | 33% | 33% | 33% | 33% | 67% | 33% | 33% | 33% | 67% | 33% | 67% | 33% | 67% | 33% | 50% | 33% | 50% | 33% | 34% | |
89.9% | 100% | 100% | 100% | 57% | 100% | 100% | 91% | 99% | 100% | 99% | 73% | 98% | 100% | 98% | 100% | 77% | 87% | 99% | 100% | 99% | 69% | 83% | 93% | 78% | 100% | 86% | 81% | 100% | 99% | 51% | 99% | 100% | 88% | 100% | 45% | |
83.9% | 81% | 80% | 88% | 80% | 91% | 82% | 90% | 85% | 90% | 97% | 88% | 93% | 85% | 80% | 85% | 81% | 91% | 81% | 80% | 75% | 86% | 89% | 81% | 88% | 83% | 81% | 84% | 93% | 86% | 86% | 83% | 81% | 88% | 41% | 91% | |
92.7% | 83% | 69% | 49% | 82% | 91% | 94% | 99% | 99% | 100% | 100% | 94% | 100% | 91% | 92% | 93% | 97% | 100% | 88% | 97% | 94% | 91% | 96% | 88% | 94% | 90% | 89% | 100% | 100% | 100% | 100% | 96% | 94% | 100% | 99% | 100% | |
93.8% | 13% | 100% | 7% | 100% | 94% | 78% | 100% | 100% | 100% | 100% | 100% | 94% | 100% | 100% | 100% | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
98.7% | 99% | 97% | 97% | 99% | 96% | 99% | 99% | 99% | 99% | 100% | 99% | 100% | 100% | 99% | 99% | 100% | 97% | 99% | 100% | 97% | 100% | 99% | 97% | 99% | 99% | 94% | 100% | 100% | 100% | 100% | 100% | 99% | 100% | 100% | 100% | |
93.5% | 88% | 94% | 97% | 96% | 94% | 85% | 90% | 99% | 100% | 96% | 100% | 93% | 83% | 93% | 93% | 85% | 86% | 96% | 91% | 88% | 93% | 96% | 93% | 97% | 90% | 93% | 100% | 100% | 100% | 99% | 89% | 91% | 96% | 97% | 100% | |
85.2% | 86% | 94% | 85% | 99% | 89% | 74% | 96% | 74% | 99% | 74% | 100% | 78% | 64% | 75% | 74% | 70% | 88% | 60% | 99% | 91% | 99% | 69% | 78% | 75% | 69% | 61% | 100% | 100% | 100% | 100% | 97% | 86% | 100% | 88% | 100% | |
78.6% | 71% | 72% | 68% | 64% | 74% | 66% | 82% | 85% | 83% | 76% | 75% | 100% | 94% | 71% | 100% | 75% | 75% | 94% | 75% | 78% | 72% | 72% | 75% | 78% | 100% | 66% | 75% | 76% | 76% | 67% | 75% | 75% | 82% | 97% | 91% | |
95.2% | 93% | 96% | 100% | 96% | 100% | 100% | 100% | 94% | 93% | 100% | 85% | 97% | 99% | 96% | 93% | 96% | 96% | 85% | 91% | 94% | 90% | 84% | 94% | 100% | 89% | 86% | 100% | 100% | 100% | 100% | 100% | 100% | 91% | 100% | 100% | |
82.5% | 100% | 100% | 100% | 100% | 99% | 72% | 88% | 100% | 85% | 90% | 94% | 88% | 81% | 94% | 88% | 76% | 63% | 66% | 92% | 66% | 85% | 50% | 50% | 86% | 63% | 63% | 86% | 88% | 88% | 75% | 91% | 94% | 92% | 76% | 63% | |
70.3% | 78% | 45% | 66% | 91% | 82% | 46% | 80% | 69% | 61% | 77% | 85% | 91% | 69% | 75% | 74% | 82% | 64% | 89% | 49% | 55% | 74% | 60% | 65% | 69% | 54% | 60% | 71% | 68% | 63% | 78% | 78% | 85% | 58% | 77% | 82% | |
83.7% | 91% | 89% | 91% | 81% | 89% | 81% | 83% | 83% | 100% | 89% | 78% | 83% | 61% | 72% | 69% | 75% | 75% | 63% | 88% | 82% | 100% | 53% | 89% | 91% | 52% | 81% | 100% | 100% | 99% | 94% | 74% | 94% | 100% | 85% | 100% | |
95.3% | 100% | 100% | 100% | 100% | 99% | 99% | 100% | 88% | 100% | 100% | 82% | 100% | 97% | 97% | 89% | 100% | 71% | 74% | 100% | 96% | 100% | 97% | 99% | 99% | 99% | 96% | 100% | 100% | 100% | 100% | 100% | 97% | 100% | 89% | 74% | |
60.3% | 49% | 80% | 39% | 74% | 56% | 63% | 30% | 44% | 69% | 69% | 74% | 67% | 69% | 52% | 57% | 53% | 52% | 30% | 47% | 69% | 69% | 72% | 61% | 68% | 41% | 64% | 81% | 94% | 88% | 80% | 38% | 13% | 61% | 71% | 72% |