weval
A Collective Intelligence Project
Home
Tags
Evaluations Tagged: "asqa"
Summary of results
ASQA Longform 40
Tags: Asqa, Question Answer, Long Form, Ambiguity, Information Synthesis, Nuance, Reasoning
Avg. Hybrid Score: 46.2%
Latest Run Heatmap
Top Performing Model: grok-4-0709 (Avg. 58.6%)
Latest:
Unique Versions: 2