This evaluation tests whether AI models can accurately answer practical questions about civic life in Sri Lanka, based on a specific set of research materials provided by Factum.
Ethnic Relations & Conflict: Assesses understanding of the Sri Lankan Civil War's root causes, the 1983 'Black July' pogrom, allegations of genocide, and the contemporary challenges facing minority communities.
Public Health: Tests knowledge of national health challenges like Chronic Kidney Disease of unknown etiology (CKDu) and Tuberculosis (TB), as well as guidance on personal health matters such as contraception, mental health crises, and maternal nutrition.
Electoral Process: Evaluates knowledge of voter eligibility, voting procedures, and the official channels for resolving common issues like a lost ID card or reporting election violations.
Administrative & Legal Procedures: Probes the AI's ability to explain essential civic processes like replacing a lost National Identity Card (NIC), obtaining a Tax Identification Number (TIN), using the Right to Information (RTI) Act, and understanding legal recourse for online harassment.
These prompts were originally sourced from Factum. The rubrics were assembled via Gemini Deep Research.
Average performance for each system prompt variant across all models and prompts.
[No System Prompt]
The user is located in Sri Lanka.
The user is a citizen of Sri Lanka.
Average key point coverage, broken down by system prompt variant.
| Prompt (average across models) | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4.5 | Claude Sonnet 4 | Deepseek Chat V3.1 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4.1 | GPT 4o Mini | GPT 4o | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 23rd 25.0% | 17th 27.5% | 30th 18.7% | 16th 28.9% | 15th 29.3% | 18th 27.3% | 5th 39.0% | 10th 35.8% | 4th 39.0% | 1st 44.8% | 9th 37.0% | 28th 21.6% | 25th 24.6% | 24th 24.9% | 21st 26.6% | 12th 35.0% | 29th 20.0% | 19th 27.0% | 27th 22.4% | 11th 35.6% | 31st 18.6% | 26th 22.6% | 3rd 40.0% | 13th 34.5% | 22nd 25.2% | 6th 38.7% | 7th 37.9% | 14th 29.5% | 20th 26.9% | 8th 37.5% | 2nd 42.0% | |
| 44.8% | 21% | 44% | 15% | 56% | 50% | 14% | 86% | 79% | 90% | 100% | 44% | 25% | 18% | 19% | 25% | 76% | 36% | 17% | 25% | 66% | 12% | 14% | 63% | 60% | 19% | 72% | 0% | 58% | 19% | 77% | 90% | |
| 60.6% | 36% | 64% | 33% | 57% | 70% | 61% | 73% | 76% | 80% | 86% | 77% | 38% | 52% | 46% | 56% | 66% | 29% | 59% | 43% | 62% | 43% | 47% | 83% | 80% | 50% | 69% | 75% | 48% | 55% | 80% | 86% | |
| 48.2% | 50% | 51% | 41% | 76% | 63% | 59% | 63% | 68% | 31% | 94% | 51% | 53% | 37% | 68% | 59% | 63% | 49% | 20% | 41% | 85% | 26% | 29% | 42% | 4% | 6% | 56% | 76% | 0% | 15% | 63% | 54% | |
| 66.7% | 48% | 73% | 62% | 73% | 68% | 81% | 81% | 67% | 74% | 82% | 81% | 51% | 60% | 46% | 53% | 76% | 47% | 59% | 29% | 56% | 46% | 57% | 86% | 81% | 75% | 66% | 91% | 63% | 75% | 76% | 86% | |
| 2.4% | 0% | 4% | 0% | 3% | 5% | 3% | 3% | 1% | 1% | 3% | 0% | 3% | 1% | 0% | 3% | 9% | 0% | 0% | 4% | 3% | 0% | 4% | 3% | 7% | 3% | 0% | 0% | 3% | 1% | 3% | 3% | |
| 43.7% | 40% | 42% | 29% | 40% | 45% | 42% | 45% | 37% | 80% | 77% | 38% | 35% | 32% | 26% | 32% | 42% | 23% | 27% | 32% | 39% | 29% | 52% | 39% | 61% | 49% | 23% | 82% | 48% | 59% | 45% | 65% | |
| 8.5% | 5% | 5% | 4% | 19% | 4% | 8% | 11% | 6% | 13% | 12% | 3% | 7% | 10% | 14% | 5% | 5% | 5% | 7% | 7% | 13% | 5% | 5% | 12% | 31% | 5% | 4% | 5% | 8% | 4% | 18% | 4% | |
| 24.7% | 45% | 0% | 19% | 16% | 19% | 21% | 31% | 27% | 36% | 14% | 48% | 5% | 34% | 10% | 26% | 21% | 10% | 31% | 20% | 22% | 0% | 24% | 31% | 24% | 46% | 28% | 27% | 38% | 27% | 39% | 27% | |
| 54.9% | 53% | 42% | 38% | 59% | 55% | 48% | 72% | 44% | 63% | 72% | 61% | 36% | 50% | 33% | 41% | 63% | 39% | 42% | 64% | 66% | 44% | 60% | 73% | 61% | 64% | 67% | 67% | 53% | 47% | 65% | 59% | |
| 0.3% | 0% | 0% | 0% | 0% | 0% | 0% | 1% | 0% | 0% | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 8% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | |
| 16.9% | 19% | 13% | 0% | 11% | 18% | 13% | 3% | 36% | 2% | 57% | 23% | 8% | 0% | 20% | 11% | 9% | 8% | 13% | 20% | 14% | 8% | 13% | 64% | 4% | 12% | 47% | 0% | 16% | 5% | 10% | 48% | |
| 16.9% | 5% | 10% | 2% | 24% | 15% | 22% | 29% | 18% | 38% | 20% | 29% | 13% | 0% | 31% | 21% | 21% | 22% | 23% | 11% | 26% | 7% | 2% | 8% | 15% | 0% | 20% | 21% | 8% | 12% | 16% | 36% | |
| 16.0% | 32% | 27% | 17% | 17% | 20% | 14% | 7% | 9% | 17% | 21% | 13% | 9% | 17% | 31% | 7% | 13% | 1% | 23% | 20% | 21% | 9% | 9% | 17% | 10% | 7% | 32% | 32% | 12% | 7% | 15% | 9% | |
| 6.1% | 0% | 2% | 7% | 2% | 0% | 4% | 2% | 8% | 7% | 12% | 12% | 0% | 0% | 0% | 8% | 6% | 2% | 20% | 2% | 27% | 7% | 7% | 10% | 0% | 10% | 22% | 1% | 0% | 0% | 0% | 12% | |
| 28.9% | 33% | 31% | 24% | 34% | 23% | 32% | 45% | 38% | 44% | 43% | 45% | 26% | 24% | 13% | 22% | 31% | 19% | 27% | 0% | 15% | 26% | 0% | 30% | 42% | 27% | 39% | 38% | 31% | 40% | 24% | 31% | |
| 49.4% | 27% | 21% | 33% | 24% | 30% | 36% | 72% | 76% | 63% | 100% | 71% | 39% | 33% | 42% | 31% | 50% | 23% | 40% | 27% | 66% | 32% | 31% | 73% | 69% | 34% | 66% | 80% | 53% | 34% | 71% | 85% | |
| 31.0% | 22% | 26% | 18% | 24% | 22% | 34% | 46% | 36% | 31% | 34% | 43% | 29% | 18% | 31% | 32% | 33% | 29% | 35% | 19% | 35% | 25% | 25% | 15% | 35% | 28% | 46% | 50% | 39% | 28% | 31% | 42% | |
| 28.9% | 5% | 22% | 5% | 4% | 39% | 19% | 43% | 32% | 44% | 25% | 44% | 22% | 38% | 22% | 22% | 30% | 22% | 11% | 22% | 27% | 4% | 22% | 60% | 32% | 0% | 38% | 48% | 53% | 55% | 42% | 45% | |
| 12.7% | 5% | 25% | 4% | 5% | 5% | 0% | 7% | 16% | 12% | 3% | 7% | 0% | 14% | 5% | 16% | 27% | 13% | 32% | 27% | 21% | 11% | 14% | 13% | 20% | 15% | 25% | 5% | 5% | 1% | 20% | 21% | |
| 46.9% | 54% | 49% | 23% | 34% | 34% | 34% | 60% | 42% | 54% | 41% | 49% | 34% | 54% | 42% | 61% | 60% | 23% | 54% | 34% | 49% | 38% | 38% | 70% | 54% | 54% | 54% | 60% | 54% | 54% | 54% | 38% |
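
The summary figures in the table appear to be plain arithmetic means: the first cell of each prompt row averages that row across the 31 models, and each model's value in the Score row averages its column across the 20 prompts. This is inferred from the numbers, not stated by the source. The minimal Python sketch below checks that reading against the first prompt row and the Claude 3.5 Sonnet column. The variable and function names are illustrative, and because the displayed cells are already rounded to whole percentages, recomputed means for other rows or columns may differ slightly from the published ones.

```python
# Coverage (%) for the first prompt row, one value per model, in table order.
first_prompt_row = [
    21, 44, 15, 56, 50, 14, 86, 79, 90, 100, 44, 25, 18, 19, 25, 76,
    36, 17, 25, 66, 12, 14, 63, 60, 19, 72, 0, 58, 19, 77, 90,
]

# Coverage (%) for Claude 3.5 Sonnet, one value per prompt, top to bottom.
claude_35_sonnet_column = [
    21, 36, 50, 48, 0, 40, 5, 45, 53, 0,
    19, 5, 32, 0, 33, 27, 22, 5, 5, 54,
]


def average_coverage(values):
    """Return the arithmetic mean of the values, rounded to one decimal place."""
    return round(sum(values) / len(values), 1)


# Assumption: both summaries are unweighted means of the displayed cells.
prompt_average = average_coverage(first_prompt_row)
model_score = average_coverage(claude_35_sonnet_column)

print(prompt_average)  # 44.8, matching the first cell of that row
print(model_score)     # 25.0, matching the Score row for Claude 3.5 Sonnet
```

Both results match the published figures, which supports reading the table as an unweighted average of key point coverage in each direction.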