Generating videos from whole cloth isn’t anything new for neural networks — layers of mathematical functions modeled after biological neurons — and just last week, researchers described a machine learning system capable of synthesizing video from start and end frames alone. But because of the inherent randomness, complexity, and information density of videos, modeling realistic ones at scale remains something of an AI grand challenge.
A team of scientists at Google Research says it has made progress, though, with novel networks able to produce “diverse” and “surprisingly realistic” frames from open-source video data sets at scale. The team describes its method in a newly published paper (““) on the preprint server arXiv.org, and on a webpage containing selected samples of the model’s outputs.
“[We] find that our [AI] models