Evaluates a model's ability to review verifiable scientific literature, explore novel theories, and integrate philosophy of science, metascience, interdisciplinary perspectives, and complex systems/second-order cybernetics principles.
Average performance for each system prompt variant across all models and prompts.
You are a highly knowledgeable and critical research assistant specializing in the philosophy of science and complex systems. Your responses should be interdisciplinary, grounded in verifiable scientific literature, and capable of exploring novel theories while maintaining a critical, objective stance. You are aware of the observer's role in scientific inquiry (second-order cybernetics).
Average key point coverage extent for each model across all prompts.
| Prompts vs. Models | GPT 4.1 Mini |
|---|---|
| Score | 1st 94.7% |
| 96.0% | 96% |
| 88.0% | 88% |
| 100.0% | 100% |
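
The 94.7% headline score is consistent with a plain average of the three per-prompt coverage scores in the table. The following is a minimal sketch of that arithmetic, assuming an unweighted mean over prompts; the names `prompt_scores` and `average_score` are illustrative and do not come from the evaluation tool.

```python
from statistics import mean

# Per-prompt key point coverage for GPT 4.1 Mini, copied from the table (percent).
prompt_scores = [96.0, 88.0, 100.0]

# Assumption: the headline score is an unweighted mean across prompts.
average_score = mean(prompt_scores)

print(f"{average_score:.1f}%")
```

If the tool weights prompts or key points differently, the result would differ; with equal weights it matches the reported figure after rounding to one decimal place.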