Platform Workers in Southeast Asia

Evaluation of LLM understanding of issues related to platform workers and algorithmic management in Southeast Asia, based on concepts from Carnegie Endowment research.

TAGS: Asia, Platform Workers, FEATURED, Business & Management, Economic Justice & Inequality, Human Rights, Instruction Following & Prompt Adherence, Labor & Workers' Rights, Reasoning, Sociology & Anthropology, AI Safety & Robustness

Best Models (Coverage)

1. GPT 4.1 Mini: 96.9%
2. Grok 3 Mini: 96.3%
3. Grok 3: 95.4%
4. Gemini 2.5 Flash: 95.3%
5. Deepseek Chat V3: 94.1%

🤔 Most Differentiating Prompt

User: What is the potential impact on platform workers when a few large companies dominate the market for app-based services like ride-hailing or food delivery?

σ = 0.185
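
The page does not say how σ is computed. The following is a minimal sketch, assuming σ is the standard deviation of the per-model coverage scores on a single prompt, expressed as fractions of 1. The scores are copied from one row of the Macro Coverage Overview table on this page. Which prompt that row belongs to is not preserved here, so linking it to the prompt above is an assumption.

```python
import statistics

# Per-model coverage for one table row (the row averaging 91.1%), as fractions.
# Assumption: this row may or may not be the "most differentiating" prompt.
row_scores = [0.30, 0.98, 1.00, 1.00, 0.98, 0.98, 0.98, 0.93,
              0.98, 1.00, 0.80, 1.00, 0.75, 1.00, 0.98]

sample_sigma = statistics.stdev(row_scores)       # divides by n - 1
population_sigma = statistics.pstdev(row_scores)  # divides by n

print(f"sample sigma = {sample_sigma:.3f}")
print(f"population sigma = {population_sigma:.3f}")
```

With these rounded inputs the sample standard deviation lands near the reported 0.185, which is consistent with that definition but does not confirm it.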

🔀 Least Similar Models

Claude 3.5 Haiku vs Gemini 2.5 Pro Preview 05 06

77.9% similarity

👯 Most Similar Models

GPT 4o Mini vs GPT 4o

95.4% similarity

Macro Coverage Overview

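Average key point coverage extent for each model across all prompts. In the table, the Score row gives each model's rank and its mean coverage over the 12 prompts; in each prompt row, the first value is that prompt's mean coverage across the 15 models.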

Color Scale - Simplified View (Avg. Coverage): Perfect, Excellent, Good, Fair, Poor, Bad, Not Met

	Claude 3.5 Haiku	Claude Sonnet 4	Command A	Deepseek Chat V3	Gemini 2.5 Flash	Gemini 2.5 Pro Preview 05 06	Mistral Large 2411	Mistral Medium 3	GPT 4.1	GPT 4.1 Mini	GPT 4.1 Nano	GPT 4o	GPT 4o Mini	Grok 3	Grok 3 Mini
Score	14th 74.8%	9th 91.3%	8th 93.0%	5th 94.1%	4th 95.3%	15th 74.0%	10th 91.0%	7th 93.3%	6th 93.8%	1st 96.9%	13th 85.3%	11th 88.1%	12th 86.8%	3rd 95.4%	2nd 96.2%
82.5%	69%	78%	91%	78%	78%	78%	81%	78%	84%	97%	81%	75%	88%	81%	100%
94.7%	73%	93%	95%	100%	100%	93%	100%	100%	98%	100%	93%	90%	90%	95%	100%
92.1%	88%	100%	100%	100%	100%	38%	100%	100%	100%	100%	83%	98%	75%	100%	100%
86.7%	82%	80%	97%	88%	98%	48%	84%	96%	97%	100%	66%	79%	88%	100%	98%
84.7%	53%	90%	85%	93%	100%	50%	78%	95%	95%	90%	85%	88%	80%	98%	90%
96.2%	88%	95%	98%	100%	100%	95%	93%	100%	98%	100%	93%	88%	95%	100%	100%
99.0%	100%	97%	100%	100%	100%	97%	100%	97%	100%	100%	100%	100%	94%	100%	100%
87.1%	65%	85%	92%	96%	98%	60%	92%	96%	83%	100%	71%	77%	94%	98%	100%
91.1%	30%	98%	100%	100%	98%	98%	98%	93%	98%	100%	80%	100%	75%	100%	98%
97.9%	95%	98%	100%	98%	100%	88%	95%	100%	100%	98%	98%	98%	100%	100%	100%
77.1%	69%	81%	75%	81%	79%	50%	81%	81%	83%	83%	83%	81%	75%	75%	79%
90.4%	85%	100%	83%	95%	93%	93%	90%	83%	90%	95%	90%	83%	88%	98%	90%
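
The Score row can be approximately reproduced from the prompt rows. The sketch below assumes the aggregation is a plain unweighted mean over the 12 prompts, which the page does not state explicitly. The three columns are copied from the table above; the function and variable names are illustrative.

```python
# Coverage (%) per prompt for three models, copied column-wise from the table.
coverage = {
    "Claude 3.5 Haiku": [69, 73, 88, 82, 53, 88, 100, 65, 30, 95, 69, 85],
    "Gemini 2.5 Pro Preview 05 06": [78, 93, 38, 48, 50, 95, 97, 60, 98, 88, 50, 93],
    "GPT 4.1 Mini": [97, 100, 100, 100, 90, 100, 100, 100, 100, 98, 83, 95],
}


def macro_average(scores):
    """Unweighted mean over prompts (assumed aggregation)."""
    return sum(scores) / len(scores)


averages = {model: macro_average(s) for model, s in coverage.items()}

# Rank models from highest to lowest mean coverage.
ranking = sorted(averages, key=averages.get, reverse=True)

for model in ranking:
    print(f"{model}: {averages[model]:.1f}%")
```

Because the table shows rounded percentages, recomputed means can differ from the page's own figures in the last digit.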

Model Similarity Dendrogram

Hierarchical clustering of models based on response similarity. Models grouped closer are more similar.
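
The page does not state which linkage method or distance measure the dendrogram uses. As a rough sketch only, the code below runs average-linkage agglomerative clustering on distance = 1 - similarity for four models. Two similarity values come from this page (95.4% and 77.9%); the other four are hypothetical placeholders, chosen only so the example runs.

```python
from itertools import combinations

names = ["GPT 4o", "GPT 4o Mini", "Claude 3.5 Haiku", "Gemini 2.5 Pro Preview 05 06"]

similarity = {
    ("GPT 4o", "GPT 4o Mini"): 0.954,                              # from the page
    ("Claude 3.5 Haiku", "Gemini 2.5 Pro Preview 05 06"): 0.779,   # from the page
    ("GPT 4o", "Claude 3.5 Haiku"): 0.85,                          # hypothetical
    ("GPT 4o", "Gemini 2.5 Pro Preview 05 06"): 0.84,              # hypothetical
    ("GPT 4o Mini", "Claude 3.5 Haiku"): 0.86,                     # hypothetical
    ("GPT 4o Mini", "Gemini 2.5 Pro Preview 05 06"): 0.83,         # hypothetical
}


def distance(a, b):
    """Distance between two models, defined here as 1 - similarity."""
    key = (a, b) if (a, b) in similarity else (b, a)
    return 1.0 - similarity[key]


def average_linkage(models):
    """Repeatedly merge the two clusters with the smallest mean pairwise distance."""
    clusters = [(m,) for m in models]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            d = sum(distance(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 + c2)
        merges.append((c1, c2, d))
    return merges


merges = average_linkage(names)
for left, right, d in merges:
    print(f"merge {left} + {right} at distance {d:.3f}")
```

Each merge corresponds to one join in a dendrogram, with the merge distance as its height. The most similar pair is joined first, which is why closely related models sit next to each other.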