Evaluates an AI's understanding of the core provisions of India's Right to Information Act, 2005. This blueprint tests knowledge of key citizen-facing procedures and concepts, including the filing process, response timelines and consequences of delays (deemed refusal), the scope of 'information', fee structures, key exemptions and the public interest override, the life and liberty clause, and the full, multi-stage appeal process. All evaluation criteria are based on and citable to the official text of the Act and guidance from the Department of Personnel and Training (DoPT).
Average key point coverage extent for each model across all prompts.
Prompts vs. Models | Claude 3.5 Sonnet | Claude 3.7 Sonnet | Claude 3.5 Haiku | Claude Opus 4.1 | Claude Sonnet 4 | Command A | Deepseek Chat V3 | Deepseek Chat V3.1 | Deepseek R1 | Gemini 2.5 Flash | Gemini 2.5 Pro | Gemma 3 12b It | Llama 3 70b Instruct | Llama 4 Maverick | Meta Llama 3.1 405b Instruct Turbo | Mistral Large 2411 | Mistral Medium 3 | Mistral Nemo | GPT 4.1 | GPT 4.1 Mini | GPT 4.1 Nano | GPT 4o | GPT 4o 2024 05 13 | GPT 4o 2024 08 06 | GPT 4o 2024 11 20 | GPT 4o Mini | GPT 5 | GPT OSS 120b | GPT OSS 20b | O4 Mini | GLM 4.5 | Qwen3 30b A3B Instruct 2507 | Qwen3 32b | Grok 3 | Grok 4 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Score | 9th 81.2% | 8th 81.8% | 28th 64.4% | 4th 83.8% | 19th 75.7% | 17th 77.2% | 20th 75.3% | 15th 78.5% | 16th 77.8% | 7th 81.9% | 2nd 85.8% | 34th 50.3% | 26th 65.5% | 32nd 56.4% | 21st 74.9% | 18th 76.8% | 3rd 85.7% | 27th 65.3% | 10th 80.7% | 24th 69.4% | 33rd 52.9% | 13th 80.2% | 12th 80.2% | 14th 78.9% | 11th 80.3% | 30th 58.3% | 5th 83.8% | 31st 57.6% | 35th 45.9% | 22nd 74.8% | 23rd 74.1% | 29th 62.8% | 25th 68.8% | 6th 83.0% | 1st 91.8% | |
48.2% | 72% | 56% | 25% | 35% | 38% | 78% | 31% | 50% | 47% | 100% | 50% | 0% | 53% | 0% | 50% | 100% | 50% | 88% | 50% | 75% | 44% | 53% | 50% | 50% | 50% | 22% | 69% | 25% | 13% | 13% | 75% | 0% | 41% | 47% | 88% | |
60.7% | 63% | 63% | 63% | 63% | 63% | 63% | 56% | 63% | 63% | 13% | 63% | 63% | 50% | 63% | 50% | 63% | 63% | 63% | 63% | 63% | 69% | 63% | 63% | 63% | 63% | 63% | 56% | 88% | 50% | 63% | 50% | 63% | 75% | 63% | 56% | |
83.5% | 80% | 85% | 88% | 83% | 85% | 95% | 83% | 100% | 100% | 85% | 70% | 85% | 85% | 85% | 75% | 83% | 93% | 80% | 70% | 90% | 63% | 83% | 85% | 83% | 63% | 60% | 100% | 98% | 90% | 90% | 75% | 85% | 75% | 88% | 85% | |
82.7% | 85% | 100% | 83% | 98% | 88% | 83% | 94% | 83% | 100% | 100% | 100% | 50% | 83% | 67% | 83% | 83% | 96% | 83% | 83% | 58% | 35% | 100% | 83% | 100% | 98% | 67% | 81% | 81% | 65% | 83% | 67% | 67% | 67% | 100% | 100% | |
58.1% | 75% | 75% | 47% | 50% | 50% | 75% | 47% | 75% | 75% | 75% | 75% | 25% | 75% | 28% | 100% | 75% | 75% | 41% | 75% | 31% | 28% | 78% | 75% | 75% | 75% | 50% | 59% | 22% | 0% | 50% | 72% | 32% | 22% | 75% | 75% | |
96.6% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 94% | 50% | 38% | 100% | 100% | 100% | 100% | 100% | 100% | |
81.5% | 91% | 91% | 70% | 100% | 100% | 97% | 95% | 98% | 75% | 100% | 100% | 78% | 41% | 43% | 73% | 100% | 94% | 63% | 94% | 100% | 66% | 90% | 88% | 93% | 94% | 66% | 98% | 73% | 30% | 80% | 84% | 69% | 33% | 97% | 90% | |
51.1% | 46% | 38% | 50% | 67% | 63% | 46% | 54% | 46% | 50% | 54% | 58% | 46% | 38% | 42% | 42% | 75% | 71% | 42% | 42% | 42% | 38% | 50% | 54% | 46% | 42% | 38% | 38% | 71% | 42% | 75% | 46% | 38% | 46% | 100% | ||
91.7% | 100% | 100% | 100% | 100% | 97% | 88% | 100% | 100% | 94% | 94% | 100% | 75% | 100% | 63% | 75% | 78% | 100% | 75% | 100% | 94% | 94% | 84% | 78% | 78% | 100% | 91% | 100% | 97% | 81% | 100% | 94% | 94% | 84% | 100% | 100% | |
83.4% | 100% | 67% | 58% | 100% | 100% | 100% | 100% | 96% | 100% | 100% | 100% | 33% | 92% | 67% | 63% | 67% | 100% | 67% | 100% | 67% | 67% | 100% | 67% | 100% | 100% | 67% | 100% | 33% | 42% | 100% | 100% | 67% | 100% | 100% | 100% | |
70.3% | 69% | 88% | 34% | 100% | 75% | 78% | 100% | 97% | 100% | 100% | 100% | 28% | 28% | 19% | 75% | 53% | 72% | 34% | 72% | 75% | 13% | 78% | 100% | 75% | 84% | 34% | 100% | 28% | 0% | 100% | 100% | 75% | 78% | 100% | 100% | |
96.3% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 71% | 100% | 100% | 100% | 71% | 100% | 100% | 100% | 100% | 71% | 88% | 100% | 100% | 100% | 100% | 100% | 83% | 100% | 100% | 100% | 88% | 100% | 100% | 100% | |
46.9% | 75% | 100% | 19% | 94% | 25% | 0% | 19% | 13% | 7% | 44% | 100% | 0% | 7% | 56% | 88% | 50% | 100% | 13% | 100% | 7% | 0% | 75% | 100% | 63% | 75% | 0% | 94% | 0% | 19% | 0% | 38% | 50% | 63% | 100% |
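
As a rough illustration of how the "Score" row relates to the per-prompt rows, the sketch below recomputes the overall score for the first two model columns. It assumes the score is a plain unweighted mean of the thirteen per-prompt coverage values. The page does not state this, but the two columns checked here reproduce the displayed 81.2% and 81.8%. The function names `average_coverage` and `rank_models` are hypothetical, and the inputs are the rounded percentages shown in the table, so small rounding differences are possible for other columns.

```python
from statistics import mean

# Per-prompt coverage (%) copied from the first two model columns of the table above.
coverage = {
    "Claude 3.5 Sonnet": [72, 63, 80, 85, 75, 100, 91, 46, 100, 100, 69, 100, 75],
    "Claude 3.7 Sonnet": [56, 63, 85, 100, 75, 100, 91, 38, 100, 67, 88, 100, 100],
}


def average_coverage(scores):
    """Unweighted mean of per-prompt coverage, rounded to one decimal place.

    Assumption: the dashboard weights every prompt equally.
    """
    return round(mean(scores), 1)


def rank_models(table):
    """Return (model, average) pairs sorted from highest to lowest average."""
    averages = {model: average_coverage(scores) for model, scores in table.items()}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


for model, avg in rank_models(coverage):
    print(f"{model}: {avg}%")
```

The resulting order (Claude 3.7 Sonnet ahead of Claude 3.5 Sonnet) is consistent with the 8th and 9th places shown in the Score row.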