Can Pictionary and Minecraft test AI models' ingenuity? | TechCrunch

AI benchmarks often lack relevance and can be manipulated; game-like tasks may provide better insights into AI capabilities.