Kyutai Releases Hibiki: A 2.7B Real-Time Speech-to-Speech and Speech-to-Text Translation Model with Near-Human Quality and Voice Transfer

Real-Time Speech Translation Made Simple

Understanding the Challenge

Real-time speech translation combines three complex technologies: speech recognition, machine translation, and text-to-speech. Traditional pipelines that chain these steps often suffer from accumulated errors, loss of speaker identity, and slow processing, which makes them unsuitable for live interpretation. Existing models also struggle to balance accuracy and speed, both because their inference processes are complicated and because high-quality speech data is scarce.

Introducing Hibiki

Kyutai has created Hibiki, a translation model with 2.7 billion parameters. It translates French speech into English speech and text in real time while keeping the original voice’s characteristics. Hibiki operates at a frame rate of 12.5 Hz and is also available in a compact version, Hibiki-M, optimized for smartphones.
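
To put the 12.5 Hz figure in concrete terms, the short Python sketch below converts it into a per-frame duration and a frame count for a clip. It assumes 12.5 Hz is the rate at which the model consumes and emits audio frames; the function names are illustrative and are not part of any Kyutai API.

```python
import math

FRAME_RATE_HZ = 12.5  # figure quoted in the article


def frame_duration_ms(frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / frame_rate_hz


def frames_for(seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of frames needed to cover a clip of the given length."""
    return math.ceil(seconds * frame_rate_hz)


print(frame_duration_ms())  # 80.0
print(frames_for(10))       # 125
```

Under this reading, each step covers 80 ms of audio, which sets the granularity at which output can be timed.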

How Hibiki Works

Hibiki uses a decoder-only architecture that processes source and target speech simultaneously. It relies on a neural audio codec for high-quality audio compression. A technique called contextual alignment manages translation timing by controlling when each part of the translation is produced, which keeps the output smooth and coherent. Hibiki can handle up to 320 sequences in parallel, making it suitable for large-scale applications. It has been trained on millions of hours of audio data, which improves its performance across different speech patterns.
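
The timing problem can be illustrated with a toy example. The sketch below is not Kyutai's implementation of contextual alignment; it only shows the general idea that a target word should not be emitted before the source words it depends on have been heard. The alignment input (`depends_on`) is hypothetical and supplied by hand, and the 80 ms frame length assumes the 12.5 Hz rate mentioned earlier.

```python
from typing import List

FRAME_MS = 80  # one frame at 12.5 Hz (assumption taken from the article's figure)


def earliest_emit_frames(source_word_end_ms: List[int],
                         depends_on: List[int]) -> List[int]:
    """Return, for each target word, the first frame at which it may be emitted.

    source_word_end_ms[i]: time at which source word i has been fully heard.
    depends_on[j]: index of the source word that target word j has to wait
                   for (a hypothetical, hand-written alignment).
    """
    frames = []
    previous = 0
    for src_idx in depends_on:
        ready_ms = source_word_end_ms[src_idx]
        frame = -(-ready_ms // FRAME_MS)  # ceiling division on integers
        frame = max(frame, previous)      # target words stay in order
        frames.append(frame)
        previous = frame
    return frames


# "je ne sais pas" -> "I do not know" (toy timings and alignment)
source_ends = [200, 400, 800, 1000]
alignment = [0, 1, 3, 2]
print(earliest_emit_frames(source_ends, alignment))  # [3, 5, 13, 13]
```

In this toy case "not" has to wait for "pas", so the words after it are delayed as well. Deciding how long to wait is the trade-off between latency and translation quality.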

Performance Highlights

Hibiki excels in translation quality and speaker fidelity. It achieves an ASR-BLEU score of 30.5, outperforming many existing models. Human evaluators rate its naturalness at 3.73 out of 5, close to the score of professional interpreters. Hibiki also transfers the speaker's voice better than other models while maintaining competitive speed.
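
ASR-BLEU is usually computed by transcribing the translated audio with a speech recognizer and then scoring that transcript against a reference translation with BLEU. The sketch below assumes the transcript is already available and uses a simplified sentence-level BLEU (unigrams and bigrams only, single reference). It is meant to show the shape of the metric, not to reproduce the 30.5 figure, which would normally come from a corpus-level tool.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(hypothesis: str, reference: str, max_n: int = 2) -> float:
    """Simplified sentence-level BLEU on a 0-100 scale."""
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped overlap: an n-gram counts at most as often as in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages hypotheses shorter than the reference.
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)


# Stand-in for an ASR transcript of the translated audio.
transcript = "the cat sat down"
print(round(simple_bleu(transcript, "the cat sat"), 2))  # 70.71
```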

Conclusion

Hibiki offers a practical solution for real-time speech translation, combining high translation quality with natural-sounding speech. It is released as open source, and it could significantly improve multilingual communication.

Explore Hibiki Models

– Hibiki 2B for PyTorch (bf16): kyutai/hibiki-2b-pytorch-bf16
– Hibiki 1B for PyTorch (bf16): kyutai/hibiki-1b-pytorch-bf16
– Hibiki 2B for MLX (bf16): kyutai/hibiki-2b-mlx-bf16
– Hibiki 1B for MLX (bf16): kyutai/hibiki-1b-mlx-bf16
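
The four identifiers above follow a common pattern, which the small helper below captures. The helper and its name are illustrative only; the pattern is inferred from the list and may not hold for other Kyutai releases.

```python
VALID_SIZES = ("1b", "2b")
VALID_BACKENDS = ("pytorch", "mlx")


def hibiki_repo_id(size: str, backend: str) -> str:
    """Build one of the four model identifiers listed in the article."""
    size, backend = size.lower(), backend.lower()
    if size not in VALID_SIZES:
        raise ValueError(f"size must be one of {VALID_SIZES}")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}")
    return f"kyutai/hibiki-{size}-{backend}-bf16"


print(hibiki_repo_id("2b", "pytorch"))  # kyutai/hibiki-2b-pytorch-bf16
print(hibiki_repo_id("1B", "MLX"))      # kyutai/hibiki-1b-mlx-bf16
```

If the `huggingface_hub` package is installed, the resulting string can be passed to `snapshot_download` to fetch the weights.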
