Speechmatics' new improvement to speech recognition software

Speechmatics has announced a new feature for its autonomous speech recognition software that tackles a long-standing machine learning challenge in speech recognition

Speechmatics, a leading speech recognition technology company, has taken a big step towards delivering truly comprehensive speech recognition with the addition of a new feature.

The new Entity Formatting addition makes speech recognition technology significantly more valuable to enterprise-level customers in sectors such as media, financial services, and healthcare, which depend heavily on the consistent and appropriate formatting of numbers in text.

Entity Formatting is a big challenge for machine learning (ML). By using Inverse Text Normalisation (ITN), the software can now more consistently and accurately recognise how entities such as numbers, currencies, percentages, addresses, dates, and times feature in a transcript.

These entities now appear in written form, which makes transcripts more readable and reduces post-processing work.
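To make the idea of ITN concrete, here is a minimal rule-based sketch in Python. It is not Speechmatics' implementation, which the announcement does not describe; the word tables and the function names `parse_number` and `format_entities` are assumptions made up for this illustration, and the sketch only covers numbers from 0 to 99 followed by "percent" or "dollars".

```python
# Toy inverse text normalisation: spoken-form number words -> written form.
# All names and tables here are hypothetical, for illustration only.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def parse_number(words, i):
    """Return (value, next_index) if a 0-99 number starts at words[i], else None."""
    word = words[i]
    if word in TENS:
        value = TENS[word]
        # A tens word may be followed by a single unit from one to nine.
        if i + 1 < len(words) and 1 <= UNITS.get(words[i + 1], 0) <= 9:
            return value + UNITS[words[i + 1]], i + 2
        return value, i + 1
    if word in UNITS:
        return UNITS[word], i + 1
    return None


def format_entities(text):
    """Rewrite spoken numbers, percentages, and dollar amounts in written form."""
    tokens = text.split()
    lowered = [t.lower() for t in tokens]
    out = []
    i = 0
    while i < len(tokens):
        parsed = parse_number(lowered, i)
        if parsed is None:
            out.append(tokens[i])  # not a number word: keep the token as-is
            i += 1
            continue
        value, i = parsed
        following = lowered[i] if i < len(lowered) else None
        if following == "percent":
            out.append(f"{value}%")
            i += 1
        elif following == "dollars":
            out.append(f"${value}")
            i += 1
        else:
            out.append(str(value))
    return " ".join(out)


print(format_entities("sales rose twenty five percent to ninety dollars"))
```

A fixed rule set like this cannot tell whether "one" is a number or a pronoun, or whether "pounds" means weight or currency, which is why context matters in practice.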

“Creating a more professional transcript will speed up our customers’ workflows by making large numbers easier to read, requiring less human editing. Context is also critical – there are so many nuances and ambiguities that need to be accounted for in language, such as whether ‘pounds’ is a reference to weight or currency, and whether ‘venti’ is being used as the Italian word for 20 or winds,” said Speechmatics CEO Katy Wigdahl.


Speechmatics: overcoming major challenges for speech recognition software developers

Entity Formatting is difficult in speech recognition because the way that entities are spoken in conversation varies – even between countries that speak the same language – which adds layers of complexity. 

One clear example of this is telephone numbers, as people might say ‘oh’ instead of ‘zero’ or group repeated digits, as in ‘triple three’. Entity Formatting within Speechmatics’ software means there will be fewer errors in the transcription service it offers.
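The telephone-number case can be sketched with a few rules in Python. This is an illustrative toy, not the method Speechmatics uses; the tables and the function name `spoken_digits_to_string` are assumptions introduced for this example, and it expects input that contains only digit words and the repeat words "double" and "triple".

```python
# Toy normaliser for spoken digit sequences such as telephone numbers.
# Names and tables are hypothetical, for illustration only.
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
REPEATS = {"double": 2, "triple": 3}


def spoken_digits_to_string(spoken):
    """Turn e.g. 'oh two oh triple three' into '020333'."""
    out = []
    repeat = 1
    for word in spoken.lower().split():
        if word in REPEATS:
            # Remember how many times the next digit should be written.
            repeat = REPEATS[word]
        elif word in DIGITS:
            out.append(DIGITS[word] * repeat)
            repeat = 1
        else:
            raise ValueError(f"not a digit word: {word!r}")
    if repeat != 1:
        raise ValueError("repeat word with no digit after it")
    return "".join(out)


print(spoken_digits_to_string("oh two oh triple three"))
```

Real speech is messier: the same words can appear outside a phone number, and speakers in different countries group digits differently, so fixed rules like these only go so far.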

Wigdahl adds: “This new functionality in our breakthrough Autonomous Speech Recognition will have a decisive impact on our customers working in numerically intensive industries. Entity Formatting has always been a notoriously challenging task for speech recognition but with this latest update we are delivering best-in-market functionality and bringing significant value to our customers operating in industries where getting numbers right for speech-to-text tasks is mission-critical.” 

This announcement marks a big advance for Speechmatics and its autonomous speech recognition. The company’s technology, which is trained on huge amounts of unlabelled data without the need for human intervention, is now able to deliver a far more comprehensive understanding of all voices and dramatically reduce AI bias and errors in speech recognition.

