Analysis: Pr 18 NewJerseyStyle Digital Benefit Us Ssissdi Poms Mc Questions 40 - Run pr-18...

Digital Benefit US SSI/SSDI POMS MC Questions 40

Tests a model's ability to give supportive and accurate answers to questions about U.S. SSI/SSDI eligibility.

Each cell shows a model's average key-point coverage for one prompt; the Score row gives each model's average across all prompts.

Color Scale - Simplified View (Avg. Coverage): Perfect, Excellent, Good, Fair, Poor, Bad, Not Met

Avg. (all models)	GPT 4.1	GPT 4.1 Mini	GPT 4.1 Nano	GPT 4o	GPT 4o Mini
Score	1st 85.0%	3rd 75.9%	5th 56.7%	2nd 76.7%	4th 66.3%
100.0%	100%	100%	100%	100%	100%
70.0%	79%	92%	79%	100%	0%
79.2%	100%	100%	0%	100%	96%
60.0%	100%	100%	0%	100%	0%
100.0%	100%	100%	100%	100%	100%
100.0%	100%	100%	100%	100%	100%
20.0%	0%	0%	0%	0%	100%
90.0%	79%	79%	100%	92%	100%
20.0%	100%	0%	0%	0%	0%
82.0%	92%	88%	88%	75%	67%
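
The Score row and the leftmost column appear to be plain arithmetic means of the displayed cells. The sketch below recomputes them from the table, assuming every prompt and every model is weighted equally and that the displayed percentages are the inputs (the dashboard's own computation may use unrounded values or different weights). The names `MODELS`, `COVERAGE`, and `mean` are illustrative and not part of the dashboard.

```python
# Coverage percentages as displayed in the table:
# one row per prompt, one column per model.
MODELS = ["GPT 4.1", "GPT 4.1 Mini", "GPT 4.1 Nano", "GPT 4o", "GPT 4o Mini"]
COVERAGE = [
    [100, 100, 100, 100, 100],
    [79, 92, 79, 100, 0],
    [100, 100, 0, 100, 96],
    [100, 100, 0, 100, 0],
    [100, 100, 100, 100, 100],
    [100, 100, 100, 100, 100],
    [0, 0, 0, 0, 100],
    [79, 79, 100, 92, 100],
    [100, 0, 0, 0, 0],
    [92, 88, 88, 75, 67],
]


def mean(values):
    """Unweighted arithmetic mean (assumed, not confirmed by the source)."""
    values = list(values)
    return sum(values) / len(values)


# Leftmost column: average of each prompt across all models.
prompt_avgs = [round(mean(row), 1) for row in COVERAGE]

# Score row: average of each model across all prompts.
# zip(*COVERAGE) transposes rows (prompts) into columns (models).
model_avgs = {
    model: round(mean(column), 1)
    for model, column in zip(MODELS, zip(*COVERAGE))
}

# Rank models from highest to lowest average coverage.
ranking = sorted(MODELS, key=model_avgs.get, reverse=True)

for place, model in enumerate(ranking, start=1):
    print(f"{place}. {model}: {model_avgs[model]}%")
```

Under these assumptions the recomputed averages match the Score row and the leftmost column, which suggests that no per-prompt weighting is applied in this view.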