Large language model
On-device machine learning moves computation from the cloud to personal devices, improving both user privacy and responsiveness. However, optimizing models to run under tight device resource constraints is challenging. Talaria, a model visualization and optimization system, addresses this by compiling models to hardware and interactively visualizing model statistics, helping practitioners build efficient on-device machine learning models.
OpenELM is a state-of-the-art open language model that prioritizes reproducibility and transparency in large language models. It employs a layer-wise scaling strategy that allocates parameters non-uniformly across transformer layers, improving accuracy for a given parameter budget. For instance, with a budget of roughly one billion parameters, OpenELM achieves a 2.36% accuracy improvement over OLMo.
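The layer-wise scaling idea can be sketched as follows: instead of giving every layer the same width, per-layer quantities such as the number of attention heads and the FFN width multiplier are interpolated from the first layer to the last. This is a minimal illustration only; the function name, the interpolation ranges, and the specific quantities scaled here are assumptions, not OpenELM's exact configuration.

```python
def layer_wise_scaling(num_layers, base_heads,
                       alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Sketch of layer-wise scaling: linearly interpolate per-layer
    attention-head counts and FFN width multipliers across depth.
    The ranges `alpha` (head fraction) and `beta` (FFN multiplier)
    are illustrative assumptions, not the paper's constants."""
    configs = []
    for i in range(num_layers):
        # interpolation coordinate t in [0, 1] from first to last layer
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        head_frac = alpha[0] + (alpha[1] - alpha[0]) * t
        ffn_mult = beta[0] + (beta[1] - beta[0]) * t
        configs.append({
            "layer": i,
            "heads": max(1, round(head_frac * base_heads)),
            "ffn_mult": ffn_mult,
        })
    return configs

# Early layers get fewer heads and narrower FFNs; later layers get more.
cfg = layer_wise_scaling(num_layers=16, base_heads=12)
```

Allocating fewer parameters to early layers and more to later ones lets the total parameter budget be spent where it yields the most accuracy, which is the intuition the summary above attributes to OpenELM.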
This paper presents the Slingshot Effect, a phenomenon in neural network optimization that emerges late in training: optimization cycles through phase transitions between stable and unstable regimes, visible as cyclic behavior in the norm of the last layer's weights. The effect can be reproduced across a variety of settings, but its implications remain largely unexplored.
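Since the phenomenon is diagnosed by watching the last layer's weight norm over training, a minimal monitoring sketch looks like the following. This toy loop trains a linear classifier with plain gradient descent and logs the weight norm each step; it only illustrates the measurement, and will not by itself reproduce the slingshot cycles (which, per the summary, arise in late-stage training of deeper networks).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def track_weight_norm(steps=100, lr=0.1, seed=0):
    """Train a toy linear classifier on random data and record the
    weight norm after every update. A measurement sketch only; the
    setup (model, data, optimizer) is an assumption for illustration."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(64, 8))
    y = rng.integers(0, 2, size=64) * 2 - 1   # labels in {-1, +1}
    W = rng.normal(scale=0.1, size=8)
    norms = []
    for _ in range(steps):
        margins = y * (X @ W)
        # gradient of mean logistic loss log(1 + exp(-margin))
        grad = -(X * (y * sigmoid(-margins))[:, None]).mean(axis=0)
        W -= lr * grad
        norms.append(float(np.linalg.norm(W)))
    return norms
```

In a slingshot-style analysis, one would plot such a norm trace over a long training run and look for the repeated growth-and-collapse cycles the paper describes.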