Hume Introduces Octave TTS: A New Text-to-Speech Model that Creates Custom AI Voices with Tailored Emotions

Challenges in Traditional Text-to-Speech Systems

Traditional text-to-speech (TTS) systems often struggle to convey human emotion and nuance, producing speech in a flat tone. This limitation affects developers and content creators who want their messages to truly resonate with audiences. There is a clear need for TTS systems that interpret context and emotion rather than simply voicing words.

Introducing Hume’s Octave TTS

Hume’s Octave TTS is a significant advancement in text-to-speech technology. Unlike previous models that generate speech mechanically, Octave understands the context of the text. It conveys subtle meanings, emotions, and styles. Whether the text requires sarcasm, a gentle tone, or a strong declaration, Octave adapts its output to reflect the intended message. This capability allows for custom AI voices suited to various scenarios, from narration to character-driven storytelling.

Technical Features

Octave TTS is based on a state-of-the-art large language model (LLM) specifically trained for speech synthesis. This advanced foundation enables the system to predict not only the words but also how they should be delivered, considering rhythm, timbre, and cadence. A distinguishing feature of Octave is its “Voice Design” function, allowing users to create voices based on simple scripts or prompts, adapting to different roles or characters.

Additionally, the “Acting Instructions” feature lets users fine-tune the emotional delivery of speech segments. A single line can be delivered in various styles—whispered, calm, or even disdainful—based on user instructions. This versatility makes Octave applicable in diverse fields such as education, entertainment, and customer service. The upcoming Voice Cloning feature will further enhance its capabilities by replicating voices from brief audio samples.
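The idea of delivering one line in several styles can be sketched as a small request builder. This is only an illustration: the field names `utterances`, `text`, and `description` are assumptions made for this example, not details confirmed by the article, and no network call is made. Check Hume's own API reference before relying on any of them.

```python
from typing import Dict, List


def build_utterances(text: str, instructions: List[str]) -> List[Dict[str, str]]:
    """Pair one line of text with each delivery instruction.

    The keys "text" and "description" are hypothetical field names
    chosen for this sketch; they are not taken from the article.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return [{"text": text, "description": style} for style in instructions]


# Same sentence, three delivery styles mentioned in the article.
payload = {
    "utterances": build_utterances(
        "I never said you could leave.",
        ["whispered", "calm", "disdainful"],
    )
}

for utterance in payload["utterances"]:
    print(utterance["description"], "->", utterance["text"])
```

Keeping the text and the instruction as separate fields means the wording stays fixed while only the requested delivery changes, which is the behavior the Acting Instructions feature is described as offering.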

Performance Insights

Development and evaluation of Octave TTS prioritized both technical performance and practical application. An internal study involved 180 human raters evaluating Octave against a leading competitor. The study showed that Octave was preferred for audio quality (71.6%), naturalness (51.7%), and fidelity to descriptions (57.7%).
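One way to read these preference rates is to ask how far each sits from an even 50% split. The sketch below computes a normal-approximation 95% interval for each reported rate. The article gives the number of raters (180) but not the number of judgments behind each percentage, so `ASSUMED_JUDGMENTS = 180` is purely an assumption for illustration; the real study design may differ.

```python
import math
from typing import Tuple


def preference_interval(rate: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation confidence interval for a preference rate."""
    if n <= 0 or not 0.0 <= rate <= 1.0:
        raise ValueError("need n > 0 and 0 <= rate <= 1")
    margin = z * math.sqrt(rate * (1.0 - rate) / n)
    return max(0.0, rate - margin), min(1.0, rate + margin)


# ASSUMPTION: one independent judgment per rater per metric.
# The article does not state how many comparisons were collected.
ASSUMED_JUDGMENTS = 180

# Rates as reported in the article.
reported = {
    "audio quality": 0.716,
    "naturalness": 0.517,
    "fidelity to descriptions": 0.577,
}

for metric, rate in reported.items():
    low, high = preference_interval(rate, ASSUMED_JUDGMENTS)
    verdict = "above 50%" if low > 0.5 else "overlaps 50%"
    print(f"{metric}: {low:.3f} to {high:.3f} ({verdict})")
```

Under that assumed count, the interval for naturalness straddles 50% while the other two stay above it. A larger number of judgments would narrow every interval, so this is a sanity check on how to read the figures rather than a reanalysis of the study.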

These results suggest that Octave delivers clearer audio and matches requested voice descriptions more closely than the competitor, while the naturalness result (51.7%) was close to an even split. Separately, the Expressive TTS Arena invites the public to evaluate and compare different TTS systems, providing ongoing feedback for improving Octave.

Conclusion

Hume’s Octave TTS improves on conventional TTS systems by focusing on context, emotion, and voice flexibility. Its ability to interpret and express emotional nuance makes for a more engaging listening experience across a range of applications. With an LLM-based foundation and ongoing evaluation, Octave aims to set a new standard in expressive TTS, with an emphasis on practical gains for developers and users.

