Amazon’s Alexa assistant recently learned to speak three new languages: Hindi, U.S. Spanish, and Brazilian Portuguese. Synthetic data aided substantially in this, explained Amazon senior manager for research science Janet Slifka in a post on the Alexa blog this morning, but it wasn’t the end-all-be-all solution. The new languages also required new bootstrapping tools.
One of the tools in question was developed by Amazon’s Alexa AI Applied Modeling and Data Science group, and it uses a technique called grammar induction to analyze so-called golden utterances (i.e., canonical examples of customer requests proposed by Alexa feature teams) and produce a series of expressions that can generate similar sentences. The other — guided resampling — creates novel sentences by recombining words and phrases from examples in the available data, with an emphasis