AI developers are increasingly turning to creative methods for benchmarking generative AI, and Minecraft has become one such evaluation platform. The collaborative website Minecraft Benchmark (MC-Bench) pits AI models against each other in head-to-head challenges, with users voting on which model's Minecraft creation is more impressive. Founded by high schooler Adi Singh, MC-Bench uses the familiarity of Minecraft to make progress in AI easier for people to see. The project is supported by major tech companies such as Anthropic and Google. Its tasks are currently simple builds meant to show how far models have come since the GPT-3 era, and it may expand to longer-form plans and goal-oriented tasks. The project also proposes that games offer a safer and more controlled environment for testing AI capabilities.
"Minecraft allows people to see the progress [of AI development] much more easily," Singh told TechCrunch.
"Currently we are just doing simple builds to reflect on how far we've come from the GPT-3 era, but we could see ourselves scaling to these longer-form plans and goal-oriented tasks," Singh said.