Misalignment Between Instructions and Responses in Domain-Specific LLM Tasks
Briefly

We observed four types of text outputs: outputs aligned with the instruction, empty outputs, incorrect outputs such as repetitions of the prompt, and outputs resembling Chain-of-Thought reasoning that, while potentially containing correct reasoning, did not follow the given instructions.
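As a rough illustration of this taxonomy, the sketch below (a hypothetical helper, not code from the paper) buckets a response into the four observed categories using simple string heuristics; the marker phrases and matching rules are assumptions chosen only for demonstration.

```python
def categorize_response(response: str, prompt: str, expected_answer: str) -> str:
    """Assign a model response to one of the four output types described above.

    Categories: "empty", "incorrect" (e.g. repeated prompt), "cot_like", "aligned".
    The keyword markers and containment checks are illustrative assumptions.
    """
    text = response.strip().lower()
    if not text:
        return "empty"

    # Responses that echo the prompt are treated as incorrect repetitions.
    if prompt.strip() and prompt.strip().lower() in text:
        return "incorrect"

    # Step-by-step phrasing without the requested answer suggests CoT-style drift
    # away from the instruction.
    cot_markers = ("step 1", "let's think", "first,", "therefore")
    if expected_answer.lower() not in text and any(m in text for m in cot_markers):
        return "cot_like"

    return "aligned" if expected_answer.lower() in text else "incorrect"


if __name__ == "__main__":
    print(categorize_response("", "Classify the finding.", "positive"))          # empty
    print(categorize_response("Positive.", "Classify the finding.", "positive"))  # aligned
```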
BioMistral-7B generated empty outputs in 100% of cases across all settings, and Meta-Llama-3-8B displayed similar behavior in zero-shot settings; we attribute this to safety mechanisms applied during pre-training.
Mistral-7B-v0.1 responses repeated the prompt text in 88% of zero-shot settings, while CoT outputs from Meta-Llama-3-8B Instruct often ignored specific instructions, highlighting biases introduced by instruction tuning.
The results indicate that the models prioritize contextual knowledge over the necessary background knowledge, which contributes to the difficulty of generating accurate outputs in domain-specific tasks.