ChatGPT Burns Millions Every Day. Can Computer Scientists Make AI One Million Times More Efficient?

Running ChatGPT costs millions of dollars a day, which is why OpenAI, the company behind the viral natural-language processing artificial intelligence has started ChatGPT Plus, a $20/month subscription plan. But our brains are a million times more efficient than the GPUs, CPUs, and memory that make up ChatGPT’s cloud hardware. And neuromorphic computing researchers are working hard to make the miracles that big server farms in the clouds can do today much simpler and cheaper, bringing them down to the small devices in our hands, our homes, our hospitals, and our workplaces.

One of the keys: modeling computing hardware after the computing wetware in human brains.

“Inference costs far exceed training costs when deploying a model at any reasonable scale,” say Dylan Patel and Afzal Ahmad in SemiAnalysis. “In fact, the costs to inference ChatGPT exceed the training costs on a weekly basis. If ChatGPT-like LLMs are deployed into search, that represents a direct transfer of $30 billion of Google’s profit into the hands of the picks and shovels of the computing industry.”

If you run the numbers like they have, the implications are staggering.

“Deploying current ChatGPT into every search done by Google would require 512,820 A100 HGX servers with a total of 4,102,568 A100 GPUs,” they write. “The total cost of these servers and networking exceeds $100 billion of Capex alone, of which Nvidia would receive a large portion.”

Blog