
A UC San Diego team has uncovered a method to decipher neural networks’ learning process, using a statistical formula to clarify how features are learned, a breakthrough that promises more understandable and efficient AI systems. Credit: SciTechDaily.com.

Neural networks have been powering breakthroughs in artificial intelligence, including the large language models that are now being used in a wide range of applications, from finance to human resources to healthcare. But these networks remain a black box whose inner workings engineers and scientists struggle to understand. Now, a team led by data and computer scientists at the University of California San Diego has given neural networks the equivalent of an X-ray to uncover how they actually learn.

The researchers found that a formula used in statistical analysis provides a streamlined mathematical description of how neural networks, such as GPT-2, a precursor to ChatGPT, learn relevant patterns in data, known as features. This formula also explains how neural networks use these relevant patterns to make predictions.
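The article does not reproduce the formula itself, but work in this line centers on quantities such as the average gradient outer product (AGOP): averaging the outer product of the network's input gradients over the data, whose top eigenvectors pick out the input directions the model is most sensitive to. The toy sketch below (random weights and all names are hypothetical stand-ins, not the paper's exact procedure) shows how such a quantity is computed:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy two-layer network f(x) = w2 . tanh(W1 x); the weights are random
# stand-ins for a trained model (illustrative only).
W1 = rng.normal(size=(8, 5))
w2 = rng.normal(size=8)

def f(x):
    return w2 @ np.tanh(W1 @ x)

def grad_f(x):
    # Analytic gradient of f with respect to the input x.
    h = W1 @ x
    return W1.T @ (w2 * (1.0 - np.tanh(h) ** 2))

# Average gradient outer product over a batch of inputs:
#   AGOP = (1/n) * sum_i grad f(x_i) grad f(x_i)^T
X = rng.normal(size=(100, 5))
agop = np.mean([np.outer(grad_f(x), grad_f(x)) for x in X], axis=0)

# Input directions with large eigenvalues are the "features" the
# network's output is most sensitive to.
eigvals = np.linalg.eigvalsh(agop)
print(agop.shape, eigvals[-1] > 0)  # (5, 5) True
```

The matrix is symmetric and positive semidefinite by construction, so its eigendecomposition gives a ranked list of learned feature directions.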

We’ve all been there: staring at a math test with a problem that seems impossible to solve. What if finding the solution to a problem took almost a century? For mathematicians who dabble in Ramsey theory, this is very much the case. In fact, little progress had been made in solving Ramsey problems since the 1930s.

Now, University of California San Diego researchers Jacques Verstraete and Sam Mattheus have found the answer to r(4,t), a longstanding Ramsey problem that has perplexed the math world for decades.
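For readers new to the notation: the Ramsey number r(s, t) is the smallest n such that every red/blue coloring of the edges of the complete graph K_n contains a red clique of size s or a blue clique of size t. The classic fact r(3,3) = 6 can be verified by brute force; the sketch below (illustrative only, and hopelessly slow for anything like r(4,t)) checks it directly:

```python
from itertools import combinations, product

def has_mono_clique(n, coloring, s, color):
    # coloring maps each edge (i, j) with i < j to 0 (red) or 1 (blue).
    for clique in combinations(range(n), s):
        if all(coloring[e] == color for e in combinations(clique, 2)):
            return True
    return False

def arrows(n, s, t):
    # True iff EVERY 2-coloring of K_n has a red K_s or a blue K_t.
    edges = list(combinations(range(n), 2))
    for colors in product((0, 1), repeat=len(edges)):
        coloring = dict(zip(edges, colors))
        if not (has_mono_clique(n, coloring, s, 0)
                or has_mono_clique(n, coloring, t, 1)):
            return False
    return True

# r(3,3) = 6: some coloring of K_5 avoids both monochromatic triangles,
# but no coloring of K_6 does.
print(arrows(5, 3, 3), arrows(6, 3, 3))  # False True
```

The search space doubles with every added edge, which is exactly why exact Ramsey numbers beyond tiny cases have resisted computation and require the kind of theoretical breakthrough described here.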

“We found to our great surprise that this substrate is very much active, jiving and responding in completely surprising ways as the film switches from an insulator to a metal and back when the electrical pulses arrive,” Gopalan said. “This is like watching the tail wagging the dog, which stumped us for a long while. This surprising and previously overlooked observation completely changes how we need to view this technology.”

To explain these findings, the theory and simulation effort — led by Long-Qing Chen, Hamer Professor of Materials Science and Engineering, professor of engineering science and mechanics and of mathematics at Penn State — developed a theoretical framework for the entire process of the film and the substrate bulging instead of shrinking. When their model incorporated two types of naturally occurring oxygen vacancies (missing oxygen atoms) in the material, charged and uncharged, it satisfactorily explained the experimental results.

“These neutral oxygen vacancies hold a charge of two electrons, which they can release when the material switches from an insulator to a metal,” Gopalan said. “The oxygen vacancy left behind is now charged and the crystal swells up, leading to the observed surprising bulging in the device. This response can also happen in the substrate. All of these physical processes are beautifully captured in the phase-field theory and modeling performed in this work for the first time by the postdoc Yin Shi in Prof. Chen’s group.”

From the University of Tübingen and the University of Cambridge:

Wu’s Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry https://arxiv.org/abs/2404.



Our AI system surpasses the state-of-the-art approach for geometry problems, advancing AI reasoning in mathematics.

Reflecting the Olympic spirit of ancient Greece, the International Mathematical Olympiad is a modern-day arena for the world’s brightest high-school mathematicians. The competition not only showcases young talent, but has emerged as a testing ground for advanced AI systems in math and reasoning.

In a paper published today in Nature, we introduce AlphaGeometry, an AI system that solves complex geometry problems at a level approaching a human Olympiad gold-medalist — a breakthrough in AI performance. In a benchmarking test of 30 Olympiad geometry problems, AlphaGeometry solved 25 within the standard Olympiad time limit. For comparison, the previous state-of-the-art system solved 10 of these geometry problems, and the average human gold medalist solved 25.9 problems.

Nothing makes a mess of quantum physics quite like those space-warping, matter-gulping abominations known as black holes. If you want to turn Schrödinger’s eggs into an information omelet, just find an event horizon and let ‘em drop.

According to theoretical physicists and chemists from Rice University and the University of Illinois Urbana-Champaign in the US, basic chemistry is capable of scrambling quantum information almost as effectively.

The team used a mathematical tool developed more than half a century ago to bridge a gap between known semiclassical physics and quantum effects in superconductivity. They found that the delicate quantum states of reacting particles become scrambled with surprising speed and efficiency, coming close to matching the might of a black hole.

Machine learning techniques may appear ill-suited for fields that prioritize rigor and deep understanding; however, they have recently found unexpected uses in theoretical physics and pure mathematics. In this Perspective, Gukov, Halverson and Ruehle discuss rigorous applications of machine learning to theoretical physics and pure mathematics.

“Gravity pulls matter together, so that when we throw a ball in the air, the Earth’s gravity pulls it down toward the planet,” Mustapha Ishak-Boushaki, a professor of physics in the School of Natural Sciences and Mathematics (NSM) at UT Dallas, and member of the DESI collaboration, said in a statement. “But at the largest scales, the universe acts differently. It’s acting like there is something repulsive pushing the universe apart and accelerating its expansion. This is a big mystery, and we are investigating it on several fronts. Is it an unknown dark energy in the universe, or is it a modification of Albert Einstein’s theory of gravity at cosmological scales?”

DESI’s data, however, shows that the universe may have evolved in a way that isn’t quite consistent with the Lambda CDM model, indicating that the effects of dark energy on the universe may have changed since the early days of the cosmos.

“Our results show some interesting deviations from the standard model of the universe that could indicate that dark energy is evolving over time,” Ishak-Boushaki said. “The more data we collect, the better equipped we will be to determine whether this finding holds. With more data, we might identify different explanations for the result we observe or confirm it. If it persists, such a result will shed some light on what is causing cosmic acceleration and provide a huge step in understanding the evolution of our universe.”

The traveling salesman problem is considered a prime example of a combinatorial optimization problem. Now a Berlin team led by theoretical physicist Prof. Dr. Jens Eisert of Freie Universität Berlin and HZB has shown that a certain class of such problems can actually be solved better and much faster with quantum computers than with conventional methods.
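To see why the traveling salesman problem is the poster child of combinatorial optimization: an exact solution over n cities requires examining on the order of (n-1)! tours, which explodes factorially. A minimal brute-force sketch (the distance matrix is made-up illustrative data) makes the structure of the problem concrete:

```python
from itertools import permutations

# Symmetric distance matrix for 5 cities (illustrative values only).
D = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]

def tour_length(tour):
    # Sum of edge lengths, closing the cycle back to the starting city.
    return sum(D[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def brute_force_tsp(n):
    # Fix city 0 as the start and try all (n-1)! orderings of the rest.
    best = min(permutations(range(1, n)), key=lambda p: tour_length([0, *p]))
    return [0, *best], tour_length([0, *best])

tour, length = brute_force_tsp(5)
print(tour, length)  # [0, 1, 3, 2, 4] 26
```

Five cities means only 24 candidate tours; at 20 cities the same approach would need roughly 1.2 × 10^17, which is why heuristics, approximation schemes, and now quantum approaches are studied for this class of problems.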

Quantum computers use so-called qubits which, unlike conventional bits that are strictly zero or one, can exist in superpositions of both states at once. These qubits are realized by highly cooled atoms, ions, or superconducting circuits, and it is still physically very complex to build a quantum computer with many qubits. However, mathematical methods can already be used to explore what fault-tolerant quantum computers could achieve in the future.
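A superposition is just a pair of complex amplitudes rather than a single bit value. The sketch below (a plain statevector calculation, not tied to any particular quantum-computing library) shows the standard example of a Hadamard gate putting a qubit into an equal superposition:

```python
import numpy as np

# A qubit state is a pair of complex amplitudes (a, b) with |a|^2 + |b|^2 = 1;
# measurement yields 0 with probability |a|^2 and 1 with probability |b|^2.
zero = np.array([1.0, 0.0], dtype=complex)

# The Hadamard gate maps |0> to an equal superposition of |0> and |1>.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

state = H @ zero
probs = np.abs(state) ** 2
print(np.round(probs, 3))  # [0.5 0.5]
```

The cost of simulating this classically doubles with every added qubit (2^n amplitudes), which is the basic reason mathematical analysis, rather than direct simulation, is used to study what large fault-tolerant machines could do.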

“There are a lot of myths about it, and sometimes a certain amount of hot air and hype. But we have approached the issue rigorously, using mathematical methods, and delivered solid results on the subject. Above all, we have clarified in what sense there can be any advantages at all,” says Prof. Dr. Jens Eisert, who heads a joint research group at Freie Universität Berlin and Helmholtz-Zentrum Berlin.

Despite being almost a year old, this blog by Chip Huyen is still a great read for getting into fine-tuning LLMs.

This article covers everything you need to know about Reinforcement Learning from Human Feedback (RLHF).

#AI #ReinforcementLearning


A narrative that is often glossed over in the demo frenzy is about the incredible technical creativity that went into making models like ChatGPT work. One such cool idea is RLHF: incorporating reinforcement learning and human feedback into NLP. This post discusses the three phases of training ChatGPT and where RLHF fits in. For each phase of ChatGPT development, I’ll discuss the goal for that phase, the intuition for why this phase is needed, and the corresponding mathematical formulation for those who want to see more technical detail.
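At the core of the RLHF phase is a reward model trained on human preference pairs, commonly with a Bradley-Terry-style pairwise loss: the model should score the human-preferred response higher than the rejected one. The sketch below (scalar rewards and hypothetical numbers for illustration; the post covers the full formulation) shows that loss:

```python
import numpy as np

def pairwise_preference_loss(r_chosen, r_rejected):
    # Bradley-Terry style objective used to train RLHF reward models:
    #   loss = -log sigmoid(r_chosen - r_rejected)
    # It is minimized when the reward model scores the human-preferred
    # response well above the rejected one.
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

# Reward scores for a preferred vs. a rejected response (made-up values).
print(round(pairwise_preference_loss(2.0, 0.5), 4))  # 0.2014
```

Note the loss depends only on the score *gap*: widening the margin between chosen and rejected responses drives it toward zero, while ranking them the wrong way makes it large.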