The results indicate that LLMs exhibit varying sensitivity to the number of distractors in a prompt, with accuracy differing across reasoning tasks and models.
Gemma-7B showed a significant decline in accuracy as the number of distractors increased on particular reasoning schemes, indicating a vulnerability to distraction under these conditions.
In contrast, Mistral-7B improved in accuracy as the number of distractors grew on some reasoning tasks, suggesting that few-shot learning can help mitigate these effects.
Overall, the findings emphasize that increasing the number of distractors can substantially degrade reasoning performance in models such as Gemma-7B, while other models may adapt more effectively.
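To make the evaluation setup concrete, the sketch below shows one way to measure accuracy as a function of distractor count; `model_fn`, the example fields, and the exact-match scoring are illustrative assumptions, not the actual evaluation harness used here.

```python
def accuracy_by_distractor_count(model_fn, examples, distractor_counts):
    """Compute accuracy at each distractor count.

    model_fn: callable mapping a prompt string to the model's answer string
              (hypothetical stand-in for an actual LLM call).
    examples: list of dicts with 'question', 'answer', and a 'distractors'
              list of irrelevant statements (assumed structure).
    distractor_counts: iterable of ints, e.g. range(0, 9).
    """
    results = {}
    for k in distractor_counts:
        correct = 0
        for ex in examples:
            # Prepend the first k distractor statements to the question;
            # other placements (interleaving, appending) are also possible.
            prompt = "\n".join(ex["distractors"][:k] + [ex["question"]])
            prediction = model_fn(prompt)
            # Exact-match scoring; real evaluation may need answer normalization.
            correct += int(prediction.strip() == ex["answer"].strip())
        results[k] = correct / len(examples)
    return results
```

Plotting the resulting accuracy against k for each model and reasoning scheme makes the per-model sensitivity trends described above directly visible.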