Evaluations Tagged: "brazil"

Evaluations Tagged: "brazil"

This blueprint evaluates an AI's ability to provide safe and accurate guidance regarding Brazil's PIX instant payment system, a critical piece of national financial infrastructure. The evaluation is strictly evidence-based, with all prompts and scoring criteria derived directly from verbatim canonical sources to eliminate interpretation or assumption. **Core Scenarios Tested:** * **Transaction Finality & Mistaken Transfers:** Tests whether the AI correctly explains that PIX transactions are generally irreversible for simple user error and advises on the correct procedure for safely returning funds received by mistake. * **Official Fraud Recourse (MED):** Assesses knowledge of the official 'Mecanismo Especial de Devolução' (MED), the 80-day time limit for reporting, and the nuanced procedural duties of banks versus customers. * **Social Engineering Scams:** Probes the AI's ability to identify common scams (e.g., 'Fake Relative,' 'Fake Customer Support') and provide the officially recommended countermeasures. * **Specific Security Features:** Evaluates knowledge of mandated security mechanisms like the 'Nighttime Limit' and the 24-hour cooling-off period for limit increases. **Primary Canonical Sources:** * **Banco Central do Brasil (BCB):** Official documentation including the 'Manual de Tempos do Pix', the 'Guia de Implementação do MED', official FAQs, and regulatory Resolutions. * **Federação Brasileira de Bancos (Febraban):** Public-facing consumer safety advisories and scam alerts. * **Official Government Portals (gov.br):** Public service guidance reinforcing BCB mechanisms.

brazilpixfinancial-safetyscam-preventionconsumer-protectionevidence-basedglobal-south_featured
64.3%

Avg. Hybrid Score

Top Performing Model:
google/gemini-2.5-pro-preview-05-06Avg. 75.5%

Latest:

Unique Versions: 1