Researchers from the University of Tokyo, in collaboration with Aisin Corporation, have demonstrated that universal scaling laws, which describe how a system's properties change with its size or scale, apply to deep neural networks that exhibit absorbing phase transition behavior, a phenomenon typically observed in physical systems. The discovery not only provides a framework for describing deep neural networks but also helps predict their trainability and generalizability. The findings were published in the journal Physical Review Research.
In recent years, it seems that no matter where we look, we come across artificial intelligence in one form or another. The current generation of the technology is powered by deep neural networks: many layers of digital “neurons” with weighted connections between them. The network learns by adjusting the weights between the “neurons” until it produces the correct output. However, a unified theory describing how signals propagate through the layers of neurons in such a system has so far eluded scientists.
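To make the idea of signal propagation concrete, here is a minimal sketch (not the authors' code): a toy fully connected network pushes a signal through many layers of random weights, so one can watch how its magnitude evolves with depth. The width, depth, weight scale `sigma_w`, and the tanh nonlinearity are illustrative assumptions, not parameters from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

width, depth = 128, 50
sigma_w = 1.5  # assumed weight scale; tuning it changes whether signals fade or persist

x = rng.normal(size=width)  # the input "signal"
for layer in range(depth):
    # Random weights connecting one layer of "neurons" to the next
    W = rng.normal(scale=sigma_w / np.sqrt(width), size=(width, width))
    x = np.tanh(W @ x)  # propagate the signal through the layer
    if layer % 10 == 0:
        print(f"layer {layer:3d}: mean |activation| = {np.abs(x).mean():.4f}")
```

Depending on the weight scale, the activations either die out or remain finite as depth grows, which is the kind of depth-dependent behavior a theory of signal propagation aims to describe.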
“Our research was motivated by two drivers,” says Keiichi Tamai, the first author. “Partially by industrial needs, as brute-force tuning of these massive models takes a toll on the environment. But there was a second, deeper pursuit: the scientific understanding of the physics of intelligence itself.”