Chinese AGI Startup ‘StepFun’ Developed ‘Step-2’: A New Trillion-Parameter MoE Architecture Model Ranking 5th on Livebench

Understanding the Challenges of AI Language Models

Building language models that approach human-level understanding remains a hard problem in AI. A central challenge is balancing computational efficiency against breadth of capability: as models grow larger to become more capable, compute costs rise sharply. General-purpose language models also tend to perform unevenly across tasks, which is a barrier to achieving advanced artificial general intelligence (AGI).

Introducing Step-2: A Trillion-Parameter MoE Model

StepFun, an AI startup based in Shanghai, has launched Step-2, a trillion-parameter Mixture of Experts (MoE) language model. The model ranks 5th on Livebench, a benchmark that evaluates AI models across a range of tasks. Step-2 is the first trillion-parameter MoE model developed by a Chinese company.

Efficient Architecture

Step-2 uses an MoE architecture, which spends compute more efficiently than a dense model of the same size. Rather than running every parameter on every input, a router activates only a small subset of expert sub-networks for each token, so the total parameter count can grow without a proportional increase in computation per token. The added capacity is intended to improve language understanding, instruction following, and reasoning. Step-2 also handles contexts of up to 16,000 tokens, making it suitable for tasks such as document analysis and long, multi-turn conversations.
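
To make the routing idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration of the MoE technique, not StepFun's implementation: the article does not describe Step-2's router, expert count, or how many experts fire per token, so `DIM`, `NUM_EXPERTS`, `TOP_K`, and every function name below are assumptions chosen for readability.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def apply_expert(weights, token):
    """Hypothetical expert: a single square linear map (no bias, no activation)."""
    return [dot(row, token) for row in weights]


def moe_forward(token, router, experts, top_k):
    """Route one token to its top_k experts and mix their outputs.

    router holds one weight vector per expert; a higher logit means the
    router prefers that expert for this token.
    """
    logits = [dot(row, token) for row in router]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    chosen = sorted(range(len(experts)), key=lambda e: logits[e], reverse=True)[:top_k]
    # Renormalise the gate weights over the chosen experts only.
    gates = softmax([logits[e] for e in chosen])
    output = [0.0] * len(token)
    for gate, e in zip(gates, chosen):
        expert_out = apply_expert(experts[e], token)
        output = [o + gate * v for o, v in zip(output, expert_out)]
    return output, chosen


# Illustrative sizes only; Step-2's real configuration is not public in this article.
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
rng = random.Random(0)
router = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_EXPERTS)
]

token = [0.5, -1.0, 0.25, 2.0]
output, used = moe_forward(token, router, experts, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS
print(f"experts used: {used}")
print(f"fraction of expert parameters active for this token: {active_fraction:.2f}")
```

In this toy setup only 2 of 8 experts run per token, so a quarter of the expert parameters do work on any given input while the full set still contributes capacity across different tokens. Production MoE layers add details this sketch omits, such as load-balancing losses and batched dispatch.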

Performance Highlights

Step-2 posts strong scores of 86.57 in Instruction Following and 58.67 in reasoning. Coding and mathematics are weaker, at 46.87 and 48.88 respectively, and remain areas for improvement. Despite these gaps, the model balances its large scale with task-specific efficiency, and StepFun continues to focus on research and development to improve reliability.

Significance and Accessibility

Step-2’s significance lies in its scale and competitive ranking as the first trillion-parameter model from a Chinese startup. StepFun has made this model accessible through its API platform for developers and researchers. Additionally, it is integrated into the consumer application “Yuewen,” allowing the public to interact with this advanced language model. This achievement indicates that Chinese startups can produce high-quality AI systems, fostering a more diverse AI landscape.

Conclusion

Step-2 marks a significant advancement for StepFun and the Chinese AI community: it performs well in instruction following and reasoning while leaving room for improvement in coding and mathematics. Its MoE architecture shows how a trillion-parameter model can remain computationally efficient, and its availability through the API platform and the Yuewen app reflects StepFun’s commitment to making advanced technology available to users worldwide. As AI continues to evolve, Step-2 positions StepFun as a notable player in the industry and a contributor to future developments in AGI.
