The Impact of Parameters on LLM Performance | HackerNoon
Briefly

This article explores the importance of managing parameter quantization in large language models (LLMs), particularly 'cherry parameters', which, despite making up a tiny fraction of the parameter count, significantly influence model performance. The authors suggest preserving the high-precision values of these critical parameters during quantization to maintain model integrity. They also discuss challenges related to mixed-precision optimization under the GPTQ approach, highlighting the difficulty of optimizing high- and low-precision parameters simultaneously, especially since parameters cannot be updated once they have been quantized.
The cherry parameters, despite constituting less than 1% of the total parameter count, exert a substantial influence on the model; quantizing them indiscriminately may degrade performance.
To mitigate the impact of cherry parameters on quantization, we propose to preserve their high-precision values during the quantization process, ensuring that essential information is not compromised.
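The idea can be made concrete with a minimal PyTorch sketch: quantize a weight tensor to low precision, then restore the original full-precision values only at the positions flagged as cherry parameters. The selection criterion here (largest absolute weight) and the 0.5% cutoff are placeholder assumptions for illustration, not the paper's actual impact metric.

```python
import torch

def quantize_preserving_cherries(weight: torch.Tensor,
                                 cherry_mask: torch.Tensor,
                                 n_bits: int = 4) -> torch.Tensor:
    """Per-tensor symmetric quantization that keeps 'cherry' entries in full precision.

    cherry_mask marks the small fraction (<1%) of critical parameters; how that
    mask is chosen (e.g. by an impact score) is outside the scope of this sketch.
    """
    qmax = 2 ** (n_bits - 1) - 1
    scale = weight.abs().max() / qmax                      # symmetric per-tensor scale
    q = torch.clamp(torch.round(weight / scale), -qmax - 1, qmax)
    dequant = q * scale                                    # low-precision reconstruction

    # Keep the original high-precision values for cherry parameters only.
    return torch.where(cherry_mask, weight, dequant)

# Usage: treat the 0.5% of entries with the largest magnitude as cherries
# (a placeholder criterion, not the paper's).
w = torch.randn(1024, 1024)
impact = w.abs()
threshold = torch.quantile(impact.flatten(), 0.995)
cherry_mask = impact >= threshold
w_mixed = quantize_preserving_cherries(w, cherry_mask, n_bits=4)
```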
Optimizing mixed-precision parameters in LLMs presents a unique challenge; the widely adopted GPTQ approach struggles to optimize high-precision cherry parameters alongside low-precision normal parameters.
Once parameters are quantized in the PTQ framework, they cannot be updated further, which prevents parameters quantized early in the process from reaching their optimal values.
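To see why this matters, consider a heavily simplified, illustrative sketch of GPTQ-style sequential quantization (not the paper's or the GPTQ library's implementation). Columns are quantized one at a time and then frozen; the quantization error is pushed onto the columns that remain in full precision. The `hessian_inv` argument stands in for the inverse Hessian of the layer inputs that GPTQ actually estimates. A cherry column that must stay in high precision does not fit cleanly into this one-pass scheme, since everything quantized before it can no longer adapt.

```python
import torch

def sequential_ptq_sketch(weight: torch.Tensor,
                          hessian_inv: torch.Tensor,
                          n_bits: int = 4) -> torch.Tensor:
    """Simplified GPTQ-style column-by-column quantization (illustrative only)."""
    W = weight.clone()
    qmax = 2 ** (n_bits - 1) - 1
    scale = W.abs().max() / qmax

    for j in range(W.shape[1]):
        col = W[:, j]
        q_col = torch.clamp(torch.round(col / scale), -qmax - 1, qmax) * scale
        err = (col - q_col) / hessian_inv[j, j]            # error scaled by the diagonal
        W[:, j] = q_col                                    # column is frozen from here on
        if j + 1 < W.shape[1]:
            # Compensate only the columns that have not been quantized yet;
            # already-quantized columns can no longer be touched.
            W[:, j + 1:] -= err.unsqueeze(1) * hessian_inv[j, j + 1:].unsqueeze(0)
    return W
```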
Read at Hackernoon