
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers. While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood. Current evaluations primarily focus on established mathematical and coding benchmarks, emphasizing final answer accuracy. However, this evaluation paradigm often suffers from data contamination and does not provide insights into the reasoning traces’ structure and quality. In this work, we systematically investigate these gaps with the help of controllable puzzle environments that allow precise manipulation of compositional complexity while maintaining consistent logical structures. This setup enables the analysis of not only final answers but also the internal reasoning traces, offering insights into how LRMs “think”. Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counterintuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: low-complexity tasks where standard models surprisingly outperform LRMs, medium-complexity tasks where additional thinking in LRMs demonstrates advantage, and high-complexity tasks where both models experience complete collapse. We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles. We also investigate the reasoning traces in more depth, studying the patterns of explored solutions and analyzing the models’ computational behavior, shedding light on their strengths, limitations, and ultimately raising crucial questions about their true reasoning capabilities.
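The abstract does not reproduce the authors' evaluation harness, but a minimal sketch conveys the idea of a controllable puzzle environment. Assuming Tower of Hanoi as a representative puzzle (the choice of puzzle and all function names below are illustrative, not the paper's code), a single parameter scales compositional complexity while the rules and the verifier stay fixed:

```python
# Illustrative sketch of a controllable puzzle environment: Tower of Hanoi.
# One knob, n_disks, scales complexity (the optimal solution grows as 2^n - 1
# moves) while the logical structure of the task never changes.

def initial_state(n_disks: int):
    """All disks start on peg 0, largest at the bottom."""
    return [list(range(n_disks, 0, -1)), [], []]

def is_valid_move(state, src: int, dst: int) -> bool:
    """Legal if the source peg is non-empty and its top disk is smaller
    than the top disk of the destination peg (or the destination is empty)."""
    if not state[src]:
        return False
    return not state[dst] or state[src][-1] < state[dst][-1]

def apply_moves(n_disks: int, moves):
    """Replay a proposed move sequence; return (solved, final_state)."""
    state = initial_state(n_disks)
    for src, dst in moves:
        if not is_valid_move(state, src, dst):
            return False, state
        state[dst].append(state[src].pop())
    solved = state[2] == list(range(n_disks, 0, -1))
    return solved, state

def optimal_solution(n_disks: int, src=0, aux=1, dst=2):
    """Reference solver: the standard recursive algorithm, 2^n - 1 moves."""
    if n_disks == 0:
        return []
    return (optimal_solution(n_disks - 1, src, dst, aux)
            + [(src, dst)]
            + optimal_solution(n_disks - 1, aux, src, dst))

if __name__ == "__main__":
    for n in range(3, 8):  # sweep the complexity knob
        moves = optimal_solution(n)
        solved, _ = apply_moves(n, moves)
        print(f"n_disks={n}: {len(moves)} moves, solved={solved}")
```

A harness of this kind can replay and score any move sequence a model proposes, at any complexity level, which is what makes both the final answer and every intermediate step in the reasoning trace checkable.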


Everything in nature has a geometric pattern, from the tiger’s stripes and the spirals in flowers to the unique fingerprints of each human being. While these patterns are sometimes symmetrical, most of them lack symmetry, which leaves us with one major question: how do such asymmetric patterns emerge in nature?

Studies report that water evaporation in drying environments can lead to the formation of asymmetric patterns during biological growth through a phenomenon called “symmetry breaking.” Although the phenomenon has been described in mathematical studies, those studies lack physico-chemical experiments that replicate it.

A recent study at the Japan Advanced Institute of Science and Technology (JAIST), led by Associate Professor Kosuke Okeyoshi and doctoral student Thi Kim Loc Nguyen, uncovers the mechanisms behind symmetry breaking during a process called meniscus splitting in evaporating polymer solutions. The findings of the study were published in Advanced Science on June 3, 2025.

For decades, we’ve thought the control center of life lies in DNA. But a new scientific framework is emerging that challenges that idea, and suggests that vast portions of the genome are immaterial and lie outside the physical world. Today, physicist Dr. Brian Miller shares his perspective on the cutting-edge, potentially revolutionary research of mathematical biologist Dr. Richard Sternberg on the immaterial aspects of the genome. In this exchange, Dr. Miller shares several examples of the immaterial nature of life. These ideas point towards the earliest stages of the next great scientific revolution and have significant implications for the intelligent design debate.

The universe speaks in mathematics, yet we experience it in poetry. This fundamental paradox — that objective quantities somehow give rise to subjective qualities — represents perhaps the most profound mystery in the architecture of consciousness. At the precise intersection where measurable physical magnitudes transform into felt experience lies perception itself, functioning as the universe’s most elegant translation device, converting the quantitative substrate of reality into the qualitative texture of conscious life.

Consider the photon, that discrete packet of electromagnetic energy with a wavelength of precisely 550 nanometers. Physics describes it with mathematical precision: wavelength, frequency, amplitude — pure quantity divorced from any subjective dimension. Yet when this photon encounters the rhodopsin molecules within our retinal cells, something extraordinary occurs. The quantitative description remains accurate but suddenly insufficient. The same electromagnetic radiation that physics measures as a 550 nm wavelength becomes, through the alchemy of perception, the irreducible experience we call “green.” This transformation represents not merely a change in descriptive language but a fundamental ontological shift — the emergence of an entirely new category of being.
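For concreteness, the quantitative side of that description reduces to a couple of textbook relations; the numbers below simply follow from a 550 nm wavelength and are not drawn from the passage itself:

$$\nu = \frac{c}{\lambda} = \frac{3.00\times10^{8}\ \mathrm{m/s}}{550\times10^{-9}\ \mathrm{m}} \approx 5.45\times10^{14}\ \mathrm{Hz}, \qquad E = h\nu \approx 2.25\ \mathrm{eV}.$$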

Maurice Merleau-Ponty recognized this threshold when he observed that “the body is our general medium for having a world” (Merleau-Ponty, 1945/2012, p. 147). The lived body serves as the crucial mediator between the quantitative realm that physics describes and the qualitative realm that consciousness inhabits. Through our sensorimotor engagement with the world, objective magnitudes undergo a metamorphosis into subjective meanings. The body is not merely a receiver of information but an active participant in the creation of experiential reality itself.

A fresh study suggests that the way a person’s pupils change while they concentrate hints at how well the mind’s scratchpad, working memory, is performing.

Working memory does more than hold stray reminders; it stitches together phone digits until they are dialed, keeps track of a spoken sentence until the meaning lands, and buffers half-finished ideas during problem-solving.

Unlike long-term memory, it works on a tight clock measured in seconds. Because the capacity is finite – typically three to seven items at once – small differences in efficiency can ripple through reading, mathematics, and decision-making.

Statistical mechanics is one of the pillars of modern physics. Ludwig Boltzmann (1844–1906) and Josiah Willard Gibbs (1839–1903) were its primary formulators. They both worked to establish a bridge between macroscopic physics, which is described by thermodynamics, and microscopic physics, which is based on the behavior of atoms and molecules.

The Austrian physicist Boltzmann explained the second law of thermodynamics in statistical terms. He defined the entropy of a system based on the number of possible microstates it could assume.
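In modern notation (not quoted in the passage), the relation he arrived at, famously engraved on his tombstone, is

$$S = k_{\mathrm{B}} \ln W,$$

where $W$ is the number of microstates compatible with a given macrostate and $k_{\mathrm{B}} \approx 1.38\times10^{-23}\ \mathrm{J/K}$ is Boltzmann’s constant.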

Unlike Boltzmann, who focused more on the physics of gases and the distribution of particles in equilibrium, the American Gibbs developed a general mathematical formalism that could be extended to more complex systems. Together, their contributions formed the basis of a physics capable of modeling a wide variety of phenomena.
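Gibbs’ more general expression assigns a probability $p_i$ to each microstate and reduces to Boltzmann’s formula when all $W$ microstates are equally likely:

$$S = -k_{\mathrm{B}} \sum_i p_i \ln p_i.$$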

What If Math uses a relatively new concept to enhance the way math is taught so that kids are given more relevant skills for today’s digital world.

The company says that the way math, and algebra specifically, is taught today is based on a concept developed by Leonardo of Pisa in 1202 as a way to help traders. This approach, it argues, has been made redundant by digital tools such as spreadsheets that now handle that kind of calculation.