Speechmatics outperforms all tech giants in tackling AI bias

Speech recognition company Speechmatics has launched its Autonomous Speech Recognition software, which outperforms Amazon, Apple, Google and Microsoft

This new software by Speechmatics uses the latest techniques in deep learning and contains the company’s breakthrough self-supervised models. The launch of this software marks Speechmatics’ latest step towards its mission to understand all voices.

Based on datasets used in Stanford’s ‘Racial Disparities in Automated Speech Recognition’ study, Speechmatics recorded an overall accuracy of 82.8% for African American voices compared to Google (68.7%) and Amazon (68.6%). 

This equates to a 45% reduction in speech errors, or three words in an average sentence. Speechmatics’ Autonomous Speech Recognition delivers similar improvements in accuracy across accents, dialects, age, and other sociodemographic characteristics.
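
The 45% figure appears to be a relative reduction in error rate rather than a gap in accuracy points. A minimal Python sketch reproduces it from the accuracy figures quoted above, assuming the error rate is simply 100% minus the reported accuracy. The function name is illustrative and does not come from Speechmatics or the Stanford study.

```python
def relative_error_reduction(baseline_accuracy: float, improved_accuracy: float) -> float:
    """Return the fraction of the baseline's errors removed by the improved system.

    Both arguments are accuracies in percent (0-100). The error rate is
    assumed to be 100 minus accuracy, which is an assumption of this sketch.
    """
    baseline_error = 100.0 - baseline_accuracy
    improved_error = 100.0 - improved_accuracy
    if baseline_error <= 0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - improved_error) / baseline_error


# Accuracy figures quoted in the article for African American voices.
SPEECHMATICS = 82.8
COMPETITORS = {"Google": 68.7, "Amazon": 68.6}

for name, accuracy in COMPETITORS.items():
    reduction = relative_error_reduction(accuracy, SPEECHMATICS)
    print(f"vs {name}: {reduction:.0%} fewer errors")
```

Under that assumption, both comparisons round to roughly 45%, which matches the reduction quoted in the article.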

“We are on a mission to deliver the next generation of machine learning capabilities, and through that offer more inclusive and accessible speech technology. This announcement today is a huge step towards achieving that mission,” said Katy Wigdahl, CEO of Speechmatics. 

“Our focus in tackling artificial intelligence (AI) bias has led to this monumental leap forward in the speech recognition industry and the ripple effect will lead to changes in a multitude of different scenarios. Think of the incorrect captions we see on social media, court hearings where words are mistranscribed and eLearning platforms that have struggled with children’s voices throughout the pandemic. Errors people have had to accept until now can have a tangible impact on their daily lives,” she added. 

Speechmatics: overcoming challenges in speech recognition bias

Misunderstanding in speech recognition has historically been commonplace due to the limited amount of labelled data available to train on. This is because labelled data needs to be manually classified by humans.

As a result, the need for manual labelling limits not only the amount of data available for training but also the range of voices represented in it.

Trained on a huge amount of unlabelled data direct from the internet, such as social media content and podcasts, Speechmatics’ technology overcomes this challenge and delivers a far more comprehensive representation of all voices.

With this comprehensive representation, the technology dramatically reduces AI bias and errors in speech recognition.

Allison Koenecke, lead author of the Stanford study on speech recognition, commented on the importance of eradicating this bias: “It's critical to study and improve fairness in speech-to-text systems given the potential for disparate harm to individuals through downstream sectors ranging from healthcare to criminal justice.”

Speechmatics also outperforms competitors on children’s voices: the company recorded 91.8% accuracy compared to Google (83.4%) and Deepgram (82.3%), based on the open-source project Common Voice

Recognising children’s voices is notoriously challenging for legacy speech recognition technology.

