Researchers from Princeton have introduced the Sheared-LLaMA models, smaller yet strong large language models (LLMs) created through structured pruning. The method, which combines targeted structured pruning with dynamic batch loading, substantially reduces the size of an LLM while preserving most of its performance. The Sheared-LLaMA models outperform other LLMs of similar size on a variety of tasks, and the approach can be applied to models of any scale.
Researchers from Princeton Introduce Sheared-LLaMA Models for Accelerating Language Model Pre-Training via Structured Pruning
Introduction
Large Language Models (LLMs) have gained popularity due to their exceptional capabilities on natural language tasks, but training them from scratch requires massive computational resources. To address this, the researchers derive more compact yet effective LLMs from existing ones through structured pruning. Their approach rests on two components: targeted structured pruning and dynamic batch loading.
Targeted Structured Pruning
Targeted structured pruning systematically removes layers, attention heads, and hidden and intermediate dimensions from a larger language model until it matches a specified target architecture. Formulating pruning as an optimization toward that target shape preserves the model’s coherence and functionality while improving efficiency. A simplified sketch of the idea follows.
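The snippet below is a minimal sketch, not the authors’ implementation: the paper learns pruning masks jointly with the model under constraints that enforce the target shape, whereas this example substitutes a simple magnitude-based importance score to show how structured units (here, attention heads) can be ranked and pruned down to a target count. All shapes, counts, and names are illustrative assumptions.

```python
# Illustrative sketch of structured pruning toward a target configuration.
# Magnitude-based scoring stands in for the paper's learned pruning masks.
import torch

def head_importance(attn_out_proj: torch.Tensor, n_heads: int) -> torch.Tensor:
    """Score each attention head by the L2 norm of its slice of the output projection."""
    d_model = attn_out_proj.shape[0]
    head_dim = d_model // n_heads
    # Reshape [d_model, d_model] -> [n_heads, head_dim, d_model], then norm per head.
    per_head = attn_out_proj.view(n_heads, head_dim, -1)
    return per_head.flatten(1).norm(dim=1)

def prune_to_target(scores: torch.Tensor, target_count: int) -> torch.Tensor:
    """Return (sorted) indices of the units to KEEP for the target configuration."""
    keep = torch.topk(scores, k=target_count).indices
    return torch.sort(keep).values

# Example: shrink one layer from 32 heads toward a smaller target shape.
w_o = torch.randn(4096, 4096)                 # stand-in for an output projection
scores = head_importance(w_o, n_heads=32)
kept_heads = prune_to_target(scores, target_count=20)
print(f"keeping heads: {kept_heads.tolist()}")
```

The same ranking-and-keeping step would be applied analogously to layers and to hidden or intermediate dimensions until the full target architecture is reached.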
Dynamic Batch Loading
Dynamic batch loading adjusts the composition of training data within each batch based on the model’s loss in each data domain. By sampling more heavily from domains where the model lags furthest behind its reference performance, training concentrates on the areas that need the most improvement, which improves data efficiency. The sketch below illustrates one such weight update.
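A minimal sketch, assuming per-domain reference losses (e.g., from the source model) are available; the multiplicative-exponential update shown here mirrors the spirit of the paper’s scheme, but the domain losses, learning rate, and helper names are illustrative assumptions.

```python
# Illustrative sketch of dynamic batch loading: domains where the pruned
# model most exceeds its reference loss are up-weighted for the next batch.
import numpy as np

def update_domain_weights(weights, current_losses, reference_losses, lr=1.0):
    """Multiplicatively up-weight domains with the largest excess loss."""
    excess = np.maximum(current_losses - reference_losses, 0.0)
    new_w = weights * np.exp(lr * excess)
    return new_w / new_w.sum()          # renormalize to a sampling distribution

domains = ["CommonCrawl", "C4", "GitHub", "Books", "ArXiv", "Wikipedia", "StackExchange"]
weights = np.full(len(domains), 1.0 / len(domains))   # start uniform
cur = np.array([2.10, 2.30, 1.10, 2.40, 1.60, 1.90, 1.70])  # hypothetical losses
ref = np.array([2.00, 2.25, 1.05, 2.20, 1.55, 1.85, 1.65])  # hypothetical references

weights = update_domain_weights(weights, cur, ref)
for d, w in zip(domains, weights):
    print(f"{d:>14}: {w:.3f}")
```

In this toy run, Books (the largest loss gap) receives the biggest weight increase, so the next batch draws more Books data.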
Sheared-LLaMA Models
Sheared-LLaMA-1.3B and Sheared-LLaMA-2.7B are smaller LLMs created by pruning a LLaMA2-7B model. Despite being trained on only 5% of the usual training data, these models outperform other well-known LLMs of equivalent size on a range of downstream tasks, including open-ended generation, reading comprehension, commonsense reasoning, and world knowledge.
Benefits and Future Potential
Further training on more tokens may yield even greater gains. While the study focuses on models of up to 7 billion parameters, the LLM-shearing technique can in principle be applied to language models of any size. The approach offers a cost-effective way to develop smaller yet powerful LLMs for a wide range of applications.
Practical AI Solutions
To evolve your company with AI and stay competitive, consider using Sheared-LLaMA models to accelerate language model pre-training. Identify automation opportunities, define measurable KPIs, select customized AI solutions, and implement gradually. For AI KPI management advice, connect with us at hello@itinai.com.
AI Sales Bot
Explore the AI Sales Bot from itinai.com/aisalesbot, designed to automate customer engagement and manage interactions across all stages of the customer journey. Discover how AI can redefine your sales processes and customer engagement.
For more AI research news and insights, join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter. Stay updated on the latest AI advancements and cool AI projects.
If you’re interested in our work, subscribe to our newsletter and join our AI Channel on WhatsApp for continuous insights into leveraging AI.