Notus, a new language model, builds on Zephyr’s success by fine-tuning data curation, prioritizing high-quality data from UltraFeedback and emphasizing user preference alignment. Implementing a meticulous curation process, Notus aims to elevate language model performance by reiterating response generation and AI ranking stages. These efforts have resulted in competitive performance and a commitment to open-source contribution. [Word count: 65]
“`html
Meet Notus: Enhancing Language Models with Data-Driven Fine-Tuning
In the pursuit of refining language models to align more closely with user intent and elevate response quality, a new iteration emerges – Notus. Stemming from Zephyr’s foundations, Notus, a fine-tuned version of Data Preference Optimization (DPO), emphasizes high-quality data curation for a more refined response generation process.
Zephyr 7B Beta and Notus
Zephyr 7B Beta, released recently, marked a significant stride in creating a more compact yet intent-aligned Language Model (LLM). Their methodology involved distilled Supervised Fine-Tuning (dSFT) followed by distilled Direct Preference Optimization (dDPO) using AI Feedback (AIF) datasets like UltraFeedback.
Recognizing the benefits of applying DPO after SFT, Zephyr 7B Beta surpassed other models, outperforming larger counterparts like Llama 2 Chat 70B. Notus builds upon this success, taking a different approach to data curation for enhanced model fine-tuning.
Notus’ Approach
The foundation for Notus lies in leveraging the same data source as Zephyr – openbmb/UltraFeedback. However, Notus pivots towards prioritizing high-quality data through meticulous curation. UltraFeedback contains responses evaluated using GPT-4, each assigned scores across preference areas (instruction-following, truthfulness, honesty, and helpfulness), alongside rationales and an overall critique score.
To curate a dataset conducive to DPO, Notus computed the average of preference ratings and selected the response with the highest average as the chosen one, ensuring its superiority over a randomly selected rejected response.
Notus’ Efficacy and Future Plans
The results spoke volumes about Notus’ efficacy. It nearly matched Zephyr on MT-Bench while outperforming Zephyr, Claude 2, and Cohere Command on AlpacaEval, solidifying its position among the most competitive 7B commercial models.
Looking ahead, Notus and its developers at Argilla remain steadfast in their commitment to a data-first approach. They are actively crafting an AI Feedback (AIF) framework to collect LLM-generated feedback, aspiring to create high-quality synthetic labeled datasets akin to UltraFeedback.
In conclusion, Notus emerges as a testament to the power of meticulous data curation in fine-tuning language models, setting a new benchmark for intent-aligned, high-quality responses in AI-driven language generation.
Practical AI Solution: AI Sales Bot
Consider the AI Sales Bot from itinai.com/aisalesbot designed to automate customer engagement 24/7 and manage interactions across all customer journey stages.
Discover how AI can redefine your sales processes and customer engagement. Explore solutions at itinai.com.
“`