Deep neural networks take time — and lots of data — to train, and that’s particularly true of speech recognition systems: conventional models tap corpora comprising thousands of hours of transcribed voice snippets. It’s not exactly surprising, then, that scientists at Amazon’s Alexa division are investigating ways to expedite the process, and they today report that they’ve made substantial headway.

In a blog post and accompanying paper, Minhua Wu, an applied scientist in the Alexa Speech group, and colleagues describe a speech recognizer that identifies data patterns in a semi-supervised fashion, learning from unlabeled samples in addition to labeled ones. The system is trained on 800 hours of annotated data plus 7,200 hours of unannotated data whose “soft” labels are supplied by a second speech system.
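The soft-label idea described above is a form of teacher-student training: a second, already-trained model assigns probability distributions (rather than hard transcriptions) to unlabeled audio, and the student is trained against a mix of hard and soft targets. Below is a minimal sketch of that objective; the function names, the mixing weight `alpha`, and the toy dimensions are illustrative assumptions, not details from the Amazon paper.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over class logits.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(targets, probs, eps=1e-12):
    # Mean cross-entropy between target distributions and predictions.
    return -np.mean(np.sum(targets * np.log(probs + eps), axis=-1))

rng = np.random.default_rng(0)
num_classes = 4  # toy stand-in for acoustic output units

# Labeled portion: hard one-hot targets from human transcriptions.
student_logits_lab = rng.normal(size=(8, num_classes))
hard_labels = np.eye(num_classes)[rng.integers(0, num_classes, 8)]

# Unlabeled portion: "soft" target distributions from a teacher model.
teacher_logits = rng.normal(size=(32, num_classes))
soft_targets = softmax(teacher_logits)
student_logits_unlab = rng.normal(size=(32, num_classes))

# Semi-supervised objective: supervised loss plus a weighted soft-label
# loss. alpha is a hypothetical mixing weight; the paper's exact
# weighting scheme may differ.
alpha = 0.5
loss = (cross_entropy(hard_labels, softmax(student_logits_lab))
        + alpha * cross_entropy(soft_targets, softmax(student_logits_unlab)))
print(loss)
```

Because soft targets carry the teacher's full distribution over classes, the student receives a training signal even on utterances no human has transcribed, which is what lets the 7,200 unannotated hours contribute alongside the 800 annotated ones.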
