Recent advances in artificial intelligence have focused on improving prediction accuracy at inference time through techniques such as chain-of-thought reasoning. These techniques have produced significant gains on benchmark tests, as OpenAI's recent results show. However, leading models such as OpenAI's o1 and Google's Gemini have proven far less effective at practical tasks such as trip planning, achieving low success rates on the TravelPlanner benchmark. Researchers at Google DeepMind propose a method they call "mind evolution," which uses a genetically inspired algorithm to generate and evaluate multiple candidate answers, with the aim of greater accuracy in real-world applications.
One of the big trends in artificial intelligence over the past year has been the use of various tricks during inference (the act of making predictions) to dramatically improve the accuracy of those predictions.
The authors adopt a genetically inspired algorithm that prompts an LLM, such as Gemini 1.5 Flash, to generate multiple answers to a prompt; those answers are then evaluated to determine which is most "fit" to answer the question.
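As a rough illustration of that generate-and-select loop, the Python sketch below evolves a small population of candidate answers. It is not DeepMind's implementation. The names `evolve_answers`, `toy_propose`, `toy_refine`, and `toy_fitness` are hypothetical, the LLM calls are replaced by random-string stand-ins, and the selection scheme (keep the fitter half, recombine pairs of survivors) is an assumed, textbook-style genetic loop rather than the method in the paper.

```python
import random
from typing import Callable, List

ALPHABET = "ABCDE"
TARGET = "ABCDE"  # hypothetical "ideal" answer, used only to score toy candidates


def evolve_answers(
    propose: Callable[[str, random.Random], str],
    refine: Callable[[str, str, random.Random], str],
    fitness: Callable[[str], float],
    prompt: str,
    population_size: int = 8,
    generations: int = 5,
    seed: int = 0,
) -> str:
    """Return the fittest candidate found after a few generations."""
    rng = random.Random(seed)
    # Initial population: in a real system each call would sample an LLM.
    population: List[str] = [propose(prompt, rng) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half (at least two) as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, population_size // 2)]
        # Recombination: each child is built from two distinct parents.
        children = [
            refine(*rng.sample(parents, 2), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def toy_propose(prompt: str, rng: random.Random) -> str:
    # Stand-in for "ask the LLM for an answer": a random string.
    return "".join(rng.choice(ALPHABET) for _ in range(len(TARGET)))


def toy_refine(a: str, b: str, rng: random.Random) -> str:
    # Stand-in for "ask the LLM to merge two answers": crossover plus one mutation.
    cut = rng.randrange(1, len(a))
    child = list(a[:cut] + b[cut:])
    child[rng.randrange(len(child))] = rng.choice(ALPHABET)
    return "".join(child)


def toy_fitness(candidate: str) -> float:
    # Stand-in for an evaluator: count positions that match the target.
    return sum(x == y for x, y in zip(candidate, TARGET))


best = evolve_answers(toy_propose, toy_refine, toy_fitness, "Plan a trip")
print(best, toy_fitness(best))
```

Because the parents survive into each new generation, the best score in the population never decreases from one generation to the next. In a real setting the fitness function would be the hard part: it would have to check a proposed answer, such as a travel plan, against the task's constraints.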