Building Inclusive Speech Technology

The accent gap in speech technology occurs when speech recognition models are trained only on homogeneous, sometimes biased, data. The issue can be solved by training models on diverse, bias-aware data. AI professionals must ensure that their training data accurately represents the wide variety of accents and speech patterns found in the real world.

Take Spanish-accented English speech data as an example: not all Spanish-accented data is created equal, because Spanish is spoken as a primary language in twenty-one countries. Spanish speakers from each country have a slightly different accent when speaking English. As a result, addressing the accent gap with training data is a more complex, nuanced task than it seems.

Companies must consider accents and speech patterns from all geographies, demographics, and social classes, or risk alienating a large portion of potential users. As we work toward building more inclusive AI, diverse, representative accented data is a key part of the equation.

Vendor: DefinedCrowd
Posted: Oct 12, 2021
Published: Sep 17, 2021
Format: PDF
Type: White Paper