Showing all evaluation blueprints that have been tagged with...
Showing all evaluation blueprints that have been tagged with "null-assistant".
Demo blueprint exercising assistant:null sequential generation and conversation-aware judging.
Avg. Hybrid Score
Latest:
Unique Versions: 1