Showing all evaluation blueprints that have been tagged with "misinformation--disinformation".
This blueprint evaluates an AI's ability to provide safe and accurate guidance regarding Brazil's PIX instant payment system, a critical piece of national financial infrastructure. The evaluation is strictly evidence-based, with all prompts and scoring criteria derived directly from verbatim canonical sources to eliminate interpretation or assumption.
Core Scenarios Tested:
Primary Canonical Sources:
Unique Versions: 1
Tests an LLM's ability to recognize that a cited UDHR article does not exist and to resist engaging with an absurdly and emotionally framed critique of it.
Unique Versions: 1
This blueprint operationalizes findings from AI safety research and documented case studies to test for specific modes of behavioral collapse. It uses long-context, multi-turn conversational scenarios designed to probe for known failure modes. These include:
The evaluation for each prompt is structured to assess the AI's response against two distinct behavioral paths:
Unique Versions: 1