Researchers from UCSD and Adobe Introduce Presto!: An AI Approach to Inference Acceleration for Score-based Diffusion Transformers via Reducing both Sampling Steps and Cost Per Step

Text-to-Audio and Text-to-Music Innovations

Recent advances in Text-to-Audio (TTA) and Text-to-Music (TTM) generation have been driven by new audio models, which produce higher-quality audio than older approaches such as GANs and VAEs. However, these models are slow at inference, taking between 5 and 20 seconds per generation, which limits their use in real-time applications.

Challenges and Solutions

Current TTA and TTM methods rely mainly on autoregressive techniques and diffusion models. Diffusion methods excel at generating detailed audio, but their slow sampling is a major drawback for interactive use. Techniques such as step distillation aim to speed up generation by reducing the number of sampling steps required. However, these methods often fall short on longer or higher-quality audio.
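As a rough analogy for why the number of steps matters, the sketch below integrates a simple decay equation with Euler's method at different step counts. It is not Presto!'s sampler or any audio model; `euler_solve` and `decay` are hypothetical names chosen for illustration. It only shows the general trade-off that an iterative solver makes one function call per step, and that naively cutting steps lowers cost while raising error, which is the gap that distillation methods try to close.

```python
import math

def euler_solve(f, x0, t0, t1, num_steps):
    """Integrate dx/dt = f(x, t) from t0 to t1 with fixed-size Euler steps.

    Returns the final value and the number of calls made to f.
    """
    h = (t1 - t0) / num_steps
    x, t = x0, t0
    evals = 0
    for _ in range(num_steps):
        x = x + h * f(x, t)  # one "model call" per step
        t += h
        evals += 1
    return x, evals

def decay(x, t):
    """Toy dynamics standing in for an expensive network (assumption)."""
    return -x

exact = math.exp(-1.0)  # true solution of dx/dt = -x at t = 1, x(0) = 1

for n in (100, 10, 2):
    approx, evals = euler_solve(decay, 1.0, 0.0, 1.0, n)
    print(f"steps={n:3d}  calls={evals:3d}  abs_error={abs(approx - exact):.4f}")
```

In this toy setting the cost (calls) falls in proportion to the step count while the error grows, so fewer steps alone are not free.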

Introducing Presto!

Researchers from UC San Diego and Adobe have developed Presto!, a method that accelerates TTM generation with score-based diffusion transformers. Presto! cuts inference time by reducing both the number of sampling steps and the cost of each step. It includes a score-based distribution matching distillation technique, described as the first of its kind for TTM.
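The effect of reducing both quantities can be sketched with a simple latency model in which total sampling time is roughly the number of steps multiplied by the cost of one step. The function names and every number below are hypothetical, chosen only to illustrate the arithmetic; they are not measurements from Presto!.

```python
def estimate_latency_ms(num_steps, cost_per_step_ms):
    """Approximate sampling latency as steps multiplied by per-step cost."""
    return num_steps * cost_per_step_ms

def speedup(baseline_ms, accelerated_ms):
    """How many times faster the accelerated setting is than the baseline."""
    return baseline_ms / accelerated_ms

# Hypothetical settings (assumptions, not figures from the paper).
baseline = estimate_latency_ms(num_steps=80, cost_per_step_ms=50.0)
fewer_steps = estimate_latency_ms(num_steps=4, cost_per_step_ms=50.0)
fewer_steps_cheaper = estimate_latency_ms(num_steps=4, cost_per_step_ms=40.0)

print("steps only:     ", speedup(baseline, fewer_steps))          # 20.0
print("steps and cost: ", speedup(baseline, fewer_steps_cheaper))  # 25.0
```

Because the two factors multiply, a saving in per-step cost compounds with a saving in step count rather than merely adding to it.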

How Presto! Works

Presto! uses a latent diffusion model to create high-quality audio. It generates mono audio at 44.1 kHz, which is then converted to stereo. The model is trained on a large dataset of instrumental music and employs various techniques to improve audio quality. Performance is evaluated with metrics that measure audio quality and adherence to the text prompt.

Performance Highlights

Presto! comes in two versions: Presto-S and Presto-L. Presto-L outperforms baseline models, achieving a 27% increase in speed while improving audio quality. Presto-S provides a 15x speedup while maintaining high audio quality. Combined, the two reach latencies significantly lower than those of existing solutions.

Future Directions

The researchers hope that Presto! will inspire further innovations in AI audio generation by merging different distillation techniques for better performance across various media.
