Apple Open-Sources One Billion Parameter Language Model OpenELM
Briefly

OpenELM diverges from prior practice by providing the complete framework for training and evaluation of the model, empowering the open research community to build on and extend the work.
OpenELM's layer-wise scaling allocates fewer attention heads and feed-forward dimensions to the lower transformer layers and more to the higher layers, enhancing model accuracy for a given parameter budget (see the sketch below).
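
The layer-wise scaling idea can be illustrated with a short sketch. The following Python snippet assumes the linear interpolation of per-layer scaling factors described in the OpenELM paper; the function name and the specific constants here are illustrative, not Apple's released configuration.

    # A minimal sketch of layer-wise scaling. The scaling factors alpha
    # (attention width) and beta (FFN width) are interpolated linearly
    # from the first to the last layer; the min/max values below are
    # illustrative placeholders, not Apple's published settings.

    def layer_wise_config(num_layers, d_model, head_dim,
                          alpha_min=0.5, alpha_max=1.0,
                          beta_min=0.5, beta_max=4.0):
        """Return per-layer (num_heads, ffn_dim) that grow with depth."""
        configs = []
        for i in range(num_layers):
            t = i / max(1, num_layers - 1)  # 0 at first layer, 1 at last
            alpha = alpha_min + (alpha_max - alpha_min) * t
            beta = beta_min + (beta_max - beta_min) * t
            num_heads = max(1, round(alpha * d_model / head_dim))
            ffn_dim = int(beta * d_model)
            configs.append((num_heads, ffn_dim))
        return configs

    # Example: a 4-layer toy model with d_model=256 and 64-dim heads.
    for layer, (heads, ffn) in enumerate(layer_wise_config(4, 256, 64)):
        print(f"layer {layer}: heads={heads}, ffn_dim={ffn}")

In this scheme, early layers get narrower attention and feed-forward blocks while later layers get wider ones, redistributing the parameter budget across depth rather than keeping every layer identical.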
Read at InfoQ