Amazon Transcribe is a speech recognition service that now supports more than 100 languages. It is powered by a speech foundation model trained on millions of hours of audio data, which delivers a significant accuracy improvement. Companies like Carbyne use Amazon Transcribe to improve emergency response for non-English speakers. The service provides features such as automatic punctuation, custom vocabulary, and speaker diarization. To get started, users upload media files to an Amazon S3 bucket. Transcripts are returned as JSON that contains both a plain-text and an itemized form. Overall, Amazon Transcribe helps enterprises unlock insights from audio content and improve content accessibility.
Introducing Amazon Transcribe’s New Speech Foundation Model-Powered ASR System
Amazon Transcribe is an automatic speech recognition (ASR) service that lets you add speech-to-text capabilities to your applications. We are excited to announce a next-generation system, powered by a multi-billion-parameter speech foundation model, that expands automatic speech recognition to over 100 languages.
Benefits of the New ASR System
The new ASR system offers several key benefits:
- Accuracy improvement of between 20% and 50% across most languages
- Accuracy improvement of between 30% and 70% on telephony speech
- Improved readability with more accurate punctuation and capitalization
- Support for over 100 languages
- Features such as automatic punctuation, custom vocabulary, automatic language identification, speaker diarization, word-level confidence scores, and custom vocabulary filters
- Expanded support for different accents, noise environments, and acoustic conditions
Real-World Use Case: Carbyne
Carbyne, a software company that develops cloud-based contact center solutions for emergency call responders, uses Amazon Transcribe to improve emergency response for non-English speakers. With the new multilingual foundation model-powered ASR, Carbyne aims to make life-saving emergency services accessible to callers whatever language they speak.
How to Get Started
To get started with Amazon Transcribe, you can use the AWS Command Line Interface (AWS CLI), the AWS Management Console, or one of the AWS SDKs. Upload your media files to an Amazon S3 bucket, then choose whether to save the transcript to your own bucket or to a secure default bucket. The speech foundation model-powered recognition is available without any change to the API endpoint or input parameters.
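As a minimal sketch of the SDK route, the following Python example assembles a request for the `StartTranscriptionJob` operation and submits it with boto3. The helper names (`build_job_request`, `start_job`), the job name, the bucket names, and the region are illustrative assumptions, not part of the announcement. Submitting the job assumes that boto3 is installed and that AWS credentials with access to Amazon Transcribe and the S3 buckets are configured.

```python
def build_job_request(job_name, media_uri, language_code=None,
                      output_bucket=None, max_speakers=None):
    """Assemble keyword arguments for StartTranscriptionJob (hypothetical helper)."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
    }
    if language_code:
        # Language is known in advance, e.g. "en-US".
        request["LanguageCode"] = language_code
    else:
        # Otherwise ask the service to identify the language automatically.
        request["IdentifyLanguage"] = True
    if output_bucket:
        # Save the transcript to your own bucket; if omitted, the service
        # stores it in its default bucket.
        request["OutputBucketName"] = output_bucket
    if max_speakers:
        # Speaker diarization: label up to `max_speakers` distinct speakers.
        request["Settings"] = {
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        }
    return request


def start_job(request, region="us-east-1"):
    """Submit the job. Requires boto3 and configured AWS credentials."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("transcribe", region_name=region)
    return client.start_transcription_job(**request)


if __name__ == "__main__":
    # Placeholder names: replace with your own job name and buckets.
    req = build_job_request(
        "demo-job",
        "s3://my-input-bucket/call.wav",
        output_bucket="my-output-bucket",
        max_speakers=2,
    )
    print(req)
    # start_job(req)  # uncomment once credentials and buckets exist
```

Separating the request builder from the call that submits it keeps the parameters easy to inspect and test without touching AWS.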
Transcription Output
Amazon Transcribe provides transcription output in JSON format. The output includes the transcript in both text and itemized formats, as well as additional metadata such as speaker labels, channel labels, items, and segments.
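To illustrate how the text and itemized forms relate, here is a short Python sketch that reads a transcript document and lists words with low confidence scores. The embedded sample is a hand-written, heavily trimmed approximation of the output shape (a `results` object with `transcripts` and `items`), not real service output, and the helper names and the 0.9 threshold are assumptions chosen for the example.

```python
import json

# Hand-written, trimmed sample; real output carries more fields.
SAMPLE = """
{
  "jobName": "demo-job",
  "results": {
    "transcripts": [{"transcript": "Hello wurld."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
       "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.4", "end_time": "0.9",
       "alternatives": [{"confidence": "0.62", "content": "wurld"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  },
  "status": "COMPLETED"
}
"""


def full_text(doc):
    """Return the plain-text transcript."""
    return " ".join(t["transcript"] for t in doc["results"]["transcripts"])


def low_confidence_words(doc, threshold=0.9):
    """Return (word, confidence) pairs that score below the threshold."""
    flagged = []
    for item in doc["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items have no meaningful confidence
        best = item["alternatives"][0]
        confidence = float(best["confidence"])  # stored as a string
        if confidence < threshold:
            flagged.append((best["content"], confidence))
    return flagged


if __name__ == "__main__":
    doc = json.loads(SAMPLE)
    print(full_text(doc))
    print(low_confidence_words(doc))
```

Flagging low-confidence words this way is one possible use of the word-level confidence scores, for example to route uncertain passages to human review.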
Conclusion
With the expanded language support in Amazon Transcribe, businesses can serve users from diverse linguistic backgrounds, improve accessibility, and support global communication and information exchange. To learn more about the features and benefits of Amazon Transcribe, visit the Amazon Transcribe features page and the what’s new post.