Personal assistants like Apple’s Siri accomplish tasks through natural language commands. However, their underlying components often rely on supervised machine learning algorithms that require large amounts of hand-annotated training data. To reduce the time and effort needed to collect this data, researchers at Apple developed a framework that leverages user engagement signals to automatically create data-augmenting labels. They report that when incorporated using strategies like multi-task learning and validation with an external knowledge base, the annotated data significantly improve accuracy in a production deep learning system.
“We believe this is the first use of user engagement signals to help generate training data for a sequence labeling task on a large scale, and can be applied in practical settings to speed up new feature