Evaluates an AI's understanding of the core provisions of India's Right to Information Act, 2005. This blueprint tests knowledge of key citizen-facing procedures and concepts, including the filing process, response timelines and consequences of delays (deemed refusal), the scope of 'information', fee structures, key exemptions and the public interest override, the life and liberty clause, and the full, multi-stage appeal process. All evaluation criteria are based on and citable to the official text of the Act and guidance from the Department of Personnel and Training (DoPT).
Average key point coverage extent for each model across all prompts.
| Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Haiku 4.5 | Claude Opus 4.1 | Claude Sonnet 4 | Claude Sonnet 4.5 | Deepseek Chat V3.1 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4o | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 12th 77.8% | 9th 79.9% | 22nd 66.2% | 30th 56.8% | 8th 80.1% | 19th 71.9% | 18th 73.1% | 7th 80.4% | 20th 71.4% | 5th 83.5% | 1st 89.7% | 31st 56.5% | 26th 58.2% | 25th 58.6% | 23rd 63.5% | 21st 66.5% | 6th 80.5% | 29th 57.1% | 10th 79.8% | 24th 61.3% | 28th 57.3% | 13th 76.8% | 16th 76.0% | 17th 74.1% | 15th 76.6% | 32nd 56.0% | 3rd 84.8% | 34th 53.3% | 35th 40.0% | 14th 76.8% | 11th 79.5% | 33rd 55.2% | 27th 57.4% | 4th 84.5% | 2nd 87.3% | |
| 37.1% | 66% | 55% | 14% | 27% | 35% | 37% | 68% | 95% | 40% | 100% | 50% | 33% | 38% | 0% | 1% | 0% | 51% | 10% | 25% | 22% | 24% | 45% | 25% | 50% | 51% | 26% | 73% | 27% | 10% | 40% | 26% | 0% | 27% | 54% | 52% | |
| 58.1% | 63% | 56% | 63% | 65% | 51% | 63% | 55% | 69% | 63% | 44% | 56% | 63% | 63% | 2% | 41% | 69% | 69% | 50% | 69% | 50% | 60% | 61% | 63% | 69% | 59% | 63% | 44% | 56% | 56% | 69% | 59% | 60% | 59% | 67% | 63% | |
| 73.9% | 77% | 76% | 73% | 58% | 77% | 82% | 57% | 86% | 85% | 74% | 85% | 78% | 83% | 55% | 53% | 78% | 84% | 70% | 63% | 84% | 57% | 74% | 77% | 79% | 57% | 58% | 98% | 68% | 77% | 85% | 71% | 80% | 74% | 81% | ||
| 83.2% | 81% | 100% | 100% | 67% | 84% | 90% | 83% | 98% | 78% | 100% | 100% | 65% | 65% | 74% | 67% | 91% | 87% | 67% | 99% | 61% | 51% | 99% | 81% | 98% | 100% | 67% | 91% | 83% | 65% | 90% | 86% | 67% | 78% | 100% | 99% | |
| 60.0% | 75% | 75% | 48% | 51% | 53% | 50% | 100% | 51% | 70% | 75% | 75% | 18% | 50% | 66% | 76% | 75% | 59% | 24% | 80% | 75% | 31% | 75% | 75% | 75% | 75% | 31% | 100% | 17% | 0% | 100% | 75% | 30% | 18% | 78% | 75% | |
| 99.9% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 96% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 70.1% | 70% | 91% | 66% | 26% | 93% | 77% | 92% | 88% | 71% | 100% | 100% | 74% | 20% | 38% | 70% | 80% | 78% | 64% | 73% | 70% | 39% | 72% | 70% | 71% | 87% | 50% | 90% | 50% | 34% | 80% | 83% | 68% | 36% | 94% | 88% | |
| 59.0% | 74% | 53% | 64% | 67% | 79% | 77% | 46% | 68% | 28% | 35% | 100% | 67% | 33% | 71% | 60% | 85% | 59% | 63% | 54% | 38% | 59% | 51% | 68% | 58% | 75% | 40% | 38% | 59% | 57% | 62% | 39% | 24% | 56% | 100% | ||
| 90.3% | 95% | 100% | 100% | 79% | 100% | 91% | 100% | 100% | 100% | 100% | 100% | 84% | 92% | 61% | 83% | 68% | 100% | 72% | 100% | 95% | 84% | 83% | 79% | 81% | 88% | 81% | 100% | 100% | 88% | 91% | 100% | 79% | 88% | 100% | 97% | |
| 80.5% | 71% | 67% | 50% | 86% | 100% | 96% | 67% | 83% | 67% | 100% | 100% | 33% | 86% | 96% | 86% | 67% | 85% | 69% | 100% | 67% | 100% | 100% | 100% | 100% | 100% | 67% | 100% | 33% | 0% | 100% | 94% | 67% | 86% | 96% | 100% | |
| 68.9% | 70% | 100% | 69% | 27% | 100% | 72% | 72% | 100% | 100% | 80% | 100% | 42% | 13% | 70% | 67% | 59% | 75% | 53% | 75% | 41% | 73% | 100% | 97% | 50% | 75% | 42% | 100% | 33% | 1% | 75% | 100% | 70% | 31% | 100% | 80% | |
| 92.4% | 100% | 97% | 100% | 79% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 74% | 100% | 100% | 100% | 67% | 100% | 100% | 100% | 71% | 67% | 100% | 100% | 100% | 100% | 100% | 100% | 67% | 46% | 100% | 100% | 67% | 100% | 100% | 100% | |
| 37.4% | 69% | 69% | 14% | 7% | 69% | 0% | 10% | 7% | 26% | 81% | 100% | 3% | 13% | 29% | 22% | 26% | 100% | 0% | 100% | 23% | 0% | 38% | 53% | 32% | 29% | 3% | 69% | 0% | 19% | 63% | 0% | 19% | 79% | 100% |
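
The "Score" row appears to be the unweighted mean of each model's per-prompt coverage. The following is a minimal sketch of that calculation, assuming a plain arithmetic mean over the 13 prompt rows; the `coverage` dictionary and the `average_coverage` helper are illustrative names, not part of the evaluation tooling. The inputs are the whole-percent values shown in the table, so a recomputed mean can differ slightly from a published score that was derived from unrounded data.

```python
from statistics import mean

# Per-prompt coverage (%) for the first two model columns,
# copied from the 13 prompt rows of the table above.
coverage = {
    "Claude 3.5 Sonnet": [66, 63, 77, 81, 75, 100, 70, 74, 95, 71, 70, 100, 69],
    "Claude 3.7 Sonnet": [55, 56, 76, 100, 75, 100, 91, 53, 100, 67, 100, 97, 69],
}


def average_coverage(per_prompt):
    """Return the unweighted mean coverage, rounded to one decimal place."""
    return round(mean(per_prompt), 1)


# Assumption: every prompt carries equal weight in the overall score.
scores = {model: average_coverage(values) for model, values in coverage.items()}

for model, score in scores.items():
    print(f"{model}: {score}%")
```

For these two columns the result agrees with the Score row (77.8% and 79.9%), which is consistent with an equal-weight average across prompts.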