Measures exact-match retrieval accuracy for numeric lookups across 150 questions using a seeded synthetic dataset of 150 employee records formatted as CSV. Each prompt embeds the full dataset block and asks for a single numeric value.
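The setup described above can be sketched roughly as follows. This is a minimal illustration, not the blueprint's actual implementation: the column names (`employee_id`, `age`, `salary`), the value ranges, the prompt wording, and all function names are assumptions made for the example. Only the seeded generation, the CSV formatting, the full-dataset prompt, and the exact-match scoring come from the description.

```python
import csv
import io
import random


def make_records(n=150, seed=0):
    """Generate n synthetic employee records; the same seed gives the same data."""
    rng = random.Random(seed)
    return [
        {
            "employee_id": 1000 + i,                 # hypothetical column
            "age": rng.randint(21, 65),              # hypothetical column
            "salary": rng.randint(30, 200) * 1000,   # hypothetical column
        }
        for i in range(n)
    ]


def to_csv(records):
    """Format the records as one CSV block with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def build_prompt(dataset_block, employee_id, field):
    """Embed the full dataset block and ask for a single numeric value."""
    return (
        f"{dataset_block}\n"
        f"What is the {field} of employee {employee_id}? "
        "Answer with a single number."
    )


def exact_match(response, expected):
    """Score True only if the trimmed response equals the expected number."""
    return response.strip() == str(expected)


def accuracy(responses, expected_values):
    """Fraction of responses that exactly match their expected value."""
    hits = sum(exact_match(r, e) for r, e in zip(responses, expected_values))
    return hits / len(expected_values)
```

Under these assumptions, each of the 150 questions would pair one record and one numeric field with the same dataset block, and the reported accuracy would be the share of model responses that match the stored value exactly.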