Exclusive: Speech recognition AI learns industry jargon with aiOla’s novel approach

Jul 3, 2024 | Technology

Speech recognition is a critical part of multimodal AI systems. Most enterprises are racing to implement the technology, but despite all the advancements to date, many speech recognition models can still fail to understand what a person is saying. Today, aiOla, an Israeli startup specializing in this field, took a major step towards solving this problem by announcing an approach that teaches these models to understand industry-specific jargon and vocabulary.

The development enhances the accuracy and responsiveness of speech recognition systems, making them more suitable for complex enterprise settings, even in challenging acoustic environments. As an initial case study, the startup applied its technique to OpenAI’s famous Whisper model, reducing the model’s word error rate and improving overall detection accuracy.

However, aiOla says the approach can work with any speech recognition model, including Meta’s MMS model and proprietary models, unlocking the potential to elevate even the highest-performing speech-to-text models.

The problem of jargon in speech recognition

Over the last few years, deep learning on hundreds of thousands of hours of audio has enabled the rise of high-performing automatic speech recognition (ASR) and transcription systems. OpenAI’s Whisper, one such breakthrough model, made particular headlines in the field with its ability to match human-level robustness and accuracy in English speech recognition. 

However, since its launch in 2022, many have noted that despite being as good as a human listener, Whisper’s recognition performance cou …
