How to achieve large-scale AI deployment within the capacity limits of the power grid
2025-02-05
AI
Image: Unsplash/Joshua Sortino
Rodrigo Liang
SambaNova
Projections through 2030 point to a steep rise in the electricity that AI data centers will demand. Clearly, we need more electricity. Some of the biggest companies, such as Microsoft, Amazon, and Google, are acquiring or building their own nuclear power plants to secure future energy supplies. While this is a smart move for organizations with the necessary resources, most businesses simply don't have access to such assets. For the rest of us, corporate stakeholders and government policymakers must work together to identify the best ways to boost electricity capacity and modernize our aging grid infrastructure.

Driven by agentic AI, these deployments will significantly boost productivity. Agentic AI accomplishes tasks through autonomous agents that collaborate with one another, requiring minimal or no human intervention. This type of AI is set to reshape the balance of AI workloads: in 2023, training still accounted for the majority of AI workloads, outweighing inference by roughly 2:1. There is growing consensus that in 2025, as intelligent agents become more widespread, inference workloads will begin to match training workloads, and that by 2026 inference will surpass training to become the dominant AI workload, by a widening margin.

Electricity demand will outpace supply

The computational power and electricity required for this transformation will far surpass anything seen before. In many regions around the globe, demand will outpace the current grid's capacity for both power generation and transmission. Recent estimates from the United States, Europe, and Asia-Pacific, particularly in areas with a high concentration of data centers such as Northern Virginia in the US, indicate that electricity demand could exceed supply within just two to three years unless substantial investments are made in upgrading grids and boosting generation capacity. That scenario could force some data centers to operate below full capacity, delay the construction of others, and raise electricity bills for everyone.

Or could things play out differently? SambaNova integrates its self-developed chips into a hardware platform that delivers up to ten times the performance of GPU-based solutions while consuming just one-tenth the power, roughly a hundredfold improvement in performance per watt. More efficient chips and hardware designs like these are a valuable starting point for improving AI efficiency as the technology evolves.

To ensure AI succeeds, companies developing, deploying, and delivering AI solutions must use existing energy resources more intelligently and efficiently. That will require innovative new approaches across the entire AI ecosystem.

Ways to Enhance AI Efficiency

Over the past year, open-source models, such as Meta's Llama, have made it possible for companies to build customized models tailored to their specific needs. These pre-trained models can be downloaded for free, fine-tuned on an organization's existing data, and combined with smaller, specialized models. Continuous improvements mean they are quickly closing the gap with, and may soon match, the performance of proprietary foundation models.
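To make this open-source route concrete, here is a minimal sketch of parameter-efficient fine-tuning with LoRA, using the Hugging Face transformers and peft libraries. The checkpoint name, rank, and target modules below are illustrative assumptions, not details from this article:

```python
# Minimal sketch: adapt an open-weight model with LoRA instead of full training.
# Checkpoint, rank and target modules are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.1-8B"  # any open-weight checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

From here, a standard training loop would update only the small adapter matrices while the billions of base weights stay frozen, which is precisely why fine-tuning an open model costs a fraction of the compute, and the energy, of training one from scratch.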
By leveraging open-source options, organizations can bypass the costly, energy-intensive process of training large-scale models altogether, and instead fine-tune and deploy their own sustainable, continuously improved solutions.

Additionally, fully exploiting the hardware already in place is crucial. A robust ecosystem of companies, including a number of startups, is spearheading innovative, software-based optimization techniques that run AI workloads as efficiently as possible; a minimal sketch of one such technique, quantization, appears at the end of this piece. This approach not only reduces overall operating costs but also eases the strain on the public power grid.

Finally, we can combine all of these approaches: open-source models and more efficient hardware running in a fully optimized software environment. That would boost AI productivity while making the most of existing power resources.

Developing ultra-efficient AI is the latest instance of a recurring challenge the tech industry has met over the past few decades through multi-layered, continuous research and innovation. Achieving this goal requires a multifaceted approach, and together we can make it happen.
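As a concrete illustration of the software-based optimizations mentioned above, here is a minimal sketch using PyTorch's built-in post-training dynamic quantization. The toy model and its sizes are hypothetical and stand in for whichever optimization stack a given vendor actually ships:

```python
# Minimal sketch: post-training dynamic quantization in PyTorch.
# The toy model below is hypothetical; the article prescribes no specific tool.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
)

# Convert Linear weights from fp32 to int8; activations are quantized on the
# fly at inference time. Smaller weights mean less memory traffic per request,
# one way software squeezes more inference out of fixed hardware and power.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 4096)
print(quantized(x).shape)  # same interface, lower-precision arithmetic
```

The design point is that the optimized model is a drop-in replacement: the workload's interface is unchanged, but each inference consumes less memory bandwidth and energy, which compounds across a data center.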
The above content represents the author's personal views. This article is translated from the World Economic Forum's Agenda blog; the translated version is provided for reference only.
Translated by: Di Chenjing | Edited by: Wang Can
The World Economic Forum is an independent and neutral platform dedicated to bringing together diverse perspectives to discuss critical global, regional, and industry-specific issues.