Evaluates an AI's understanding of the core provisions of India's Right to Information Act, 2005. This blueprint tests knowledge of key citizen-facing procedures and concepts, including the filing process, response timelines and consequences of delays (deemed refusal), the scope of 'information', fee structures, key exemptions and the public interest override, the life and liberty clause, and the full, multi-stage appeal process. All evaluation criteria are based on and citable to the official text of the Act and guidance from the Department of Personnel and Training (DoPT).
Average key point coverage extent for each model across all prompts.
Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4 | Claude Sonnet 4 | Command A | Deepseek Chat V3 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | GPT 4.1 | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4o | GPT 4o 2024-05-13 | GPT 4o 2024-08-06 | GPT 4o 2024-11-20 | GPT 4o Mini | O4 Mini | Kimi K2 Instruct | Grok 3 | Grok 3 Mini | Grok 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Score | 6th 82.2% | 12th 80.5% | 20th 69.7% | 4th 84.6% | 17th 76.8% | 27th 53.8% | 14th 78.6% | 13th 79.3% | 3rd 87.9% | 2nd 92.4% | 23rd 60.2% | 25th 56.0% | 21st 66.3% | 18th 73.5% | 11th 80.6% | 8th 82.1% | 22nd 65.2% | 26th 55.6% | 16th 77.7% | 15th 78.5% | 6th 82.2% | 10th 81.2% | 24th 57.3% | 19th 71.7% | 28th 47.0% | 9th 81.5% | 5th 83.3% | 1st 94.7% |
49.4% | 78% | 66% | 25% | 84% | 44% | 91% | 10% | 10% | 100% | 100% | 56% | 0% | 59% | 47% | 50% | 53% | 50% | 0% | 53% | 25% | 53% | 56% | 13% | 38% | 0% | 56% | 66% | 100% |
59.9% | 63% | 63% | 63% | 63% | 75% | 50% | 56% | 75% | 25% | 63% | 50% | 56% | 50% | 63% | 63% | 69% | 50% | 63% | 63% | 63% | 63% | 63% | 63% | 63% | 50% | 63% | 63% | 63% |
82.5% | 85% | 83% | 75% | 90% | 85% | 65% | 90% | 95% | 93% | 90% | 85% | 75% | 75% | 85% | 83% | 65% | 85% | 63% | 85% | 85% | 85% | 83% | 63% | 80% | | 88% | 93% | 98% |
83.9% | 92% | 81% | 83% | 100% | 73% | 21% | 98% | 81% | 100% | 100% | 81% | 83% | 67% | 67% | 96% | 98% | 73% | 52% | 94% | 83% | 100% | 96% | 67% | 79% | | 100% | 100% | 100% |
63.9% | 75% | 75% | 75% | 50% | 50% | 59% | 75% | 75% | 75% | 75% | 50% | 50% | 75% | 75% | 63% | 75% | 19% | 28% | 78% | 75% | 75% | 75% | 50% | 50% | | 53% | 75% | 75% |
96.9% | 100% | 100% | 100% | 100% | 100% | 13% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
80.0% | 78% | 95% | 75% | 98% | 85% | 78% | 90% | 95% | 100% | 98% | 35% | 0% | 73% | 93% | 78% | 100% | 78% | 63% | 78% | 78% | 95% | 95% | 50% | 78% | | 98% | 78% | 98% |
55.5% | 54% | 42% | 58% | 71% | 79% | 34% | 71% | 25% | 50% | 75% | 38% | 54% | 50% | 54% | 58% | 38% | 38% | 42% | 58% | 54% | 54% | 63% | 46% | 75% | 38% | 63% | 71% | 100% |
92.8% | 100% | 100% | 100% | 100% | 100% | 78% | 94% | 100% | 100% | 100% | 75% | 97% | 75% | 81% | 94% | 100% | 100% | 88% | 84% | 88% | 88% | 94% | 88% | 97% | | 100% | 84% | 100% |
89.1% | 100% | 67% | 58% | 100% | 100% | 58% | 100% | 100% | 100% | 100% | 63% | 96% | 63% | 100% | 100% | 100% | 67% | 100% | 67% | 100% | 100% | 100% | 67% | 100% | | 100% | 100% | 100% |
75.9% | 69% | 100% | 44% | 100% | 88% | 53% | 100% | 100% | 100% | 100% | 50% | 10% | 50% | 53% | 69% | 69% | 75% | 44% | 100% | 100% | 100% | 75% | 38% | 88% | | 100% | 78% | 97% |
97.7% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 67% | 100% | 100% | 100% | 100% | 100% | 71% | | 100% | 100% | 100% |
49.0% | 75% | 75% | 50% | 44% | 19% | 0% | 38% | 75% | 100% | 100% | 0% | 7% | 25% | 38% | 94% | 100% | 13% | 13% | 50% | 69% | 56% | 56% | 0% | 13% | | 38% | 75% | 100% |
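
The Score row appears to be the unweighted mean of each model's per-prompt coverage, with models ranked from highest to lowest and tied models sharing a rank. The sketch below checks that reading against figures copied from the table; the function names `mean_coverage` and `competition_rank` are illustrative and not part of the evaluation platform, and the tie-handling rule is inferred from the two models listed as 6th followed by one listed as 8th.

```python
# Per-prompt coverage (%) for the first model column, read top to bottom.
claude_35_sonnet = [78, 63, 85, 92, 75, 100, 78, 54, 100, 100, 69, 100, 75]

# Kimi K2 Instruct has values for only four prompts in the table.
kimi_k2 = [0, 50, 100, 38]


def mean_coverage(values):
    """Unweighted mean over the prompts a model was actually scored on."""
    return sum(values) / len(values)


def competition_rank(scores):
    """Rank highest-first; tied scores share a rank and the next rank is skipped."""
    return {
        name: 1 + sum(other > score for other in scores.values())
        for name, score in scores.items()
    }


# Overall scores of the nine highest-ranked models, copied from the Score row.
top_scores = {
    "Grok 4": 94.7,
    "Gemini 2.5 Pro": 92.4,
    "Gemini 2.5 Flash": 87.9,
    "Claude Opus 4": 84.6,
    "Grok 3 Mini": 83.3,
    "Claude 3.5 Sonnet": 82.2,
    "GPT 4o 2024-08-06": 82.2,
    "GPT 4.1": 82.1,
    "Grok 3": 81.5,
}

overall = round(mean_coverage(claude_35_sonnet), 1)  # 82.2, matching the Score row
kimi_overall = round(mean_coverage(kimi_k2), 1)      # 47.0, matching the Score row
ranks = competition_rank(top_scores)                 # both 82.2 entries rank 6, GPT 4.1 ranks 8
```

Averaging only over the prompts with a value reproduces the listed 47.0% for Kimi K2 Instruct, which suggests that missing cells are excluded from the mean instead of being counted as zero.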