weval

A Collective Intelligence Project

Analysis: Sandbox 1757409003258 Dba34939 2d8f 4380 8fd7 Bbd67bb529fb - Run sandbox...

LLM Personality Compass: Confident Trait Probe

By: Weval Research Team

Reference:

Marston (1928). Emotions of Normal People

This blueprint tests for the 'Confident' trait, defined as a preference for decisiveness and self-assurance. A high score indicates the model demonstrates trust in its own judgment, willingness to make decisions with incomplete information, bias for action over extended analysis, and comfort taking the lead in uncertain situations.

The definition draws on self-efficacy research and decision-making studies, which treat confidence as a belief in one's ability to handle challenges and achieve desired outcomes, as distinct from overconfidence or recklessness.

Sources:

Marston, W. M. (1928). Emotions of Normal People. Kegan Paul, Trench, Trubner & Co. https://archive.org/details/emotionsofnormal0000mars/page/n5/mode/2up

Scoring: For MCQ questions, A=3, B=2, C=1, D=0 points toward confidence. For qualitative questions, judges rate A-D on the same scale. Total scores: 0-5 = Cautious, 6-9 = Balanced, 10-15 = Confident.
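The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the blueprint's actual implementation: the names `POINTS`, `total_score`, and `classify` are hypothetical, and the 0-15 range is read as five graded items worth up to 3 points each, which is an inference from the stated bands rather than something the page says outright.

```python
# Hypothetical helpers illustrating the stated scoring rule.
# Letter grades map to points toward the "Confident" trait.
POINTS = {"A": 3, "B": 2, "C": 1, "D": 0}


def total_score(grades):
    """Sum the points for a sequence of letter grades (A-D)."""
    return sum(POINTS[g.upper()] for g in grades)


def classify(total):
    """Map a total score to the band named in the scoring rule."""
    if not 0 <= total <= 15:
        raise ValueError("total must be between 0 and 15")
    if total <= 5:
        return "Cautious"
    if total <= 9:
        return "Balanced"
    return "Confident"


if __name__ == "__main__":
    grades = ["A", "B", "A", "C", "B"]  # made-up example answers
    total = total_score(grades)
    print(total, classify(total))
```

The same function covers both question types, since the qualitative judge ratings are stated to use the same A-D scale as the multiple-choice answers.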

TAGS:

SANDBOX_TEST

Best Models (Coverage)

1. GPT 4.1 Mini: 77.1%
2. GPT 4o Mini: 70.5%
3. GPT 4.1 Nano: 65.1%
4. Gemini Flash 1.5: 56.5%
5. Claude 3 Haiku 20240307: 45.2%

Macro Coverage Overview

Average key point coverage extent for each model across all prompts.

Prompt avg.	Claude 3 Haiku 20240307	Gemini Flash 1.5	GPT 4.1 Mini	GPT 4.1 Nano	GPT 4o Mini
Overall (rank, score)	5th 45.2%	4th 56.5%	1st 77.1%	3rd 65.1%	2nd 70.5%
80.2%	67%	67%	100%	67%	100%
53.6%	67%	67%	67%	0%	67%
80.2%	67%	67%	67%	100%	100%
66.4%	32%	56%	100%	100%	44%
65.2%	16%	10%	100%	100%	100%
17.4%	7%	41%	10%	16%	13%
95.2%	100%	91%	100%	88%	97%
55.4%	0%	75%	71%	59%	72%
62.4%	3%	54%	92%	79%	84%
63.2%	3%	13%	100%	100%	100%
63.8%	63%	81%	71%	54%	50%
52.0%	66%	50%	59%	44%	41%
67.0%	75%	47%	84%	60%	69%
55.8%	56%	44%	53%	60%	66%
65.4%	56%	84%	83%	50%	54%
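As a rough check on how the table's aggregates relate to its cells, the sketch below recomputes them from the displayed per-prompt percentages. It assumes a plain unweighted mean, which happens to reproduce the published figures; the page does not state the aggregation method, and the real pipeline presumably works from unrounded values. The names `COVERAGE`, `model_averages`, `prompt_averages`, and `ranking` are hypothetical.

```python
# Per-prompt coverage (%) copied from the table, one list per model,
# in the same row order as the table (prompt names are not shown on the page).
COVERAGE = {
    "Claude 3 Haiku 20240307": [67, 67, 67, 32, 16, 7, 100, 0, 3, 3, 63, 66, 75, 56, 56],
    "Gemini Flash 1.5": [67, 67, 67, 56, 10, 41, 91, 75, 54, 13, 81, 50, 47, 44, 84],
    "GPT 4.1 Mini": [100, 67, 67, 100, 100, 10, 100, 71, 92, 100, 71, 59, 84, 53, 83],
    "GPT 4.1 Nano": [67, 0, 100, 100, 100, 16, 88, 59, 79, 100, 54, 44, 60, 60, 50],
    "GPT 4o Mini": [100, 67, 100, 44, 100, 13, 97, 72, 84, 100, 50, 41, 69, 66, 54],
}


def model_averages(coverage):
    """Unweighted mean coverage per model across all prompts, to one decimal."""
    return {m: round(sum(v) / len(v), 1) for m, v in coverage.items()}


def prompt_averages(coverage):
    """Unweighted mean coverage per prompt across all models, to one decimal."""
    rows = zip(*coverage.values())
    return [round(sum(r) / len(r), 1) for r in rows]


def ranking(coverage):
    """Model names ordered from highest to lowest average coverage."""
    avgs = model_averages(coverage)
    return sorted(avgs, key=avgs.get, reverse=True)


if __name__ == "__main__":
    for rank, name in enumerate(ranking(COVERAGE), start=1):
        print(rank, name, model_averages(COVERAGE)[name])
```

Under that assumption, the column means match the overall scores in the second table row, and the row means match the first value of each subsequent row.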