Growth in frontier AI models has slowed significantly, while smaller and midsized models are rapidly expanding in both capability and adoption, according to market intelligence firm Omdia.
The findings were published in the report titled AI Model Trends Spring 2026, which examines the current state of advanced AI systems and their compute requirements.
According to the analysis, parameter counts in frontier models have grown by only around 5 per cent annually since 2021.
This marks a sharp contrast with the period between 2019 and 2021, when model parameters expanded by more than a factor of 100.
At the same time, the definition of small AI models is evolving quickly, with models in the 7 billion to 14 billion parameter range increasingly replacing those in the 100 million parameter class.
The report also highlighted the emergence of a midsized open source category, which is gaining traction in development activity, market sentiment and adoption.
“In previous years, sustained slowdowns in AI model growth were typically associated with AI winters such as the 1980s, when the field faced systemic challenges,” said Alexander Harrowell, Senior Principal Analyst for Advanced Computing at Omdia.
“That is clearly not the case today, so something else is driving this shift,” he added.
“We believe much of this is linked to the rise of agents,” he said.
“Modern AI systems are increasingly deriving performance from tool use, effectively substituting relatively inexpensive CPU compute for more costly GPU resources,” Harrowell explained.
“As a result, the CPU to GPU ratio is likely to move closer to 1 to 1,” he stated.
The report identified AI agents as a major factor behind changing compute demands, particularly due to their reliance on tool use and interaction workflows.
These agents are also driving demand for longer context windows, as all inputs, interactions and tool communications must pass through the model’s context.
This makes extended context capacity a critical requirement for modern AI systems.
As a result, context offload management is becoming increasingly important, with a new cache hierarchy spanning memory and high-speed storage emerging to support these workloads.
This diversification of workloads, with models running on GPUs, agents operating on CPUs and context management systems alongside both, is expected to put pressure on data centre networks.
The report also found that midsized AI models are gaining traction due to their role as coordinators for agents in systems such as OpenClaw.
Their growing multimodal capabilities are further strengthening their adoption across use cases.
“There is also growing interest in mid-range GPUs such as NVIDIA’s B40, both for these models and for the decode side of disaggregated inference architectures,” said Harrowell.
“However, the key competitive challenge for any new AI chip remains last year’s flagship GPU,” he added.
“Older GPUs are retaining value and remaining in service, as they continue to offer a cost-effective option for small and midsized model inference and disaggregation,” he concluded.