UX design
from Contralabs
3 days ago
Contra Labs - Human Creativity Benchmark
Evaluator agreement reflects shared best practices, while evaluator disagreement reflects legitimate taste and intent; separating these signals shows no current model is both reliably correct and steerable.