Why downsizing large language models is the future of generative AI

Smaller language models can be based on a billion parameters or less—still pretty large, but much smaller than foundational LLMs like ChatGPT and Bard. They are pre-trained to understand vocabulary and human speech, so the incremental cost to customize them using corporate and industry-specific data is vastly lower. There are several options for these pre-trained LLMs that can be customized internally, including AI21 and Reka, as well as open source LLMs like Alpaca and Vicuna.

Smaller language models aren’t just more cost-efficient, they’re often far more accurate, because instead of training them on all publicly available data—the good and the bad—they are trained and optimized on carefully vetted data that addresses the exact use cases a business cares about.

That doesn’t mean they’re limited to internal corporate data. Smaller language models can incorporate third-party data about the economy, commodities pricing, the weather, or whatever data sets are needed, and combine them with their proprietary data sets. These data sources are widely available from data service providers who ensure the information is current, accurate, and clean.

Blog