The first important generative models for images used an approach to artificial intelligence called a neural network — a program composed of many layers of computational units called artificial neurons. But even as the quality of their images got better, the models proved unreliable and hard to train. Meanwhile, a powerful generative model — created by a postdoctoral researcher with a passion for physics — lay dormant, until two graduate students made technical breakthroughs that brought the beast to life.
DALL·E 2 is such a beast. The key insight that makes DALL·E 2’s images possible — as well as those of its competitors Stable Diffusion and Imagen — comes from the world of physics. The system that underpins them, known as a diffusion model, is heavily inspired by nonequilibrium thermodynamics, which governs phenomena like the spread of fluids and gases. “There are a lot of techniques that were initially invented by physicists and now are very important in machine learning,” said Yang Song, a machine learning researcher at OpenAI.