
By David Erlich, Consulting Director, Sofrecom

Between 2016 and the end of 2020, data center consumption trajectories seemed under control, stabilizing at around 200 terawatt-hours (TWh) per year (excluding Bitcoin). Since then, however, a turning point has occurred.

The International Energy Agency (IEA) estimates that consumption has accelerated again, reaching 415 TWh in 2024. By 2030, consumption, including cryptocurrencies, is projected to reach 945 TWh.

Although data centers were already developing quickly before the emergence of ChatGPT, generative AI (GenAI) has accelerated the trend. At the same time, cryptocurrencies regained strong momentum in 2024, accounting for approximately 20% of data center consumption.

Why is generative AI under scrutiny?

Energy-Intensive Generative AI

First, large language models (LLMs) require extensive "training" on vast amounts of data, and the creation of each model is extremely costly in energy terms: GPT-3 reportedly required 1.3 GWh of electricity, while GPT-4 needed over 50 GWh. Training cost is closely tied to model size (GPT-4 is estimated to be roughly ten times larger than GPT-3) and to the amount of data used.
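For a rough sense of how cost tracks size, the short sketch below puts the two reported training figures side by side. GPT-3's parameter count is public; GPT-4's is only an unconfirmed press estimate, used here purely for illustration.

```python
# Ratio of reported training energies (figures from the article).
gpt3_gwh = 1.3
gpt4_gwh = 50.0
print(f"Energy ratio GPT-4/GPT-3: {gpt4_gwh / gpt3_gwh:.0f}x")  # ~38x

# Parameter counts: GPT-3's is official; GPT-4's is an unconfirmed
# press estimate, shown only to illustrate the size/cost link.
gpt3_params = 175e9   # 175 billion parameters
gpt4_params = 1.8e12  # rumored ~1.8 trillion parameters (assumption)
print(f"Size ratio GPT-4/GPT-3: {gpt4_params / gpt3_params:.0f}x")  # ~10x
```

A roughly tenfold jump in size thus came with an even larger jump in training energy, since data volume and training time grew as well.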

A veritable model-building frenzy has taken hold of the tech sector. In the last quarter of 2024 alone, models were trained on nearly 80 trillion tokens. Since early 2024, the pace has been relatively constant: roughly 20 major models and 10,000 derived models trained per month (according to estimates from LifeArchitect.ai’s Models Table).

Nevertheless, a model like GPT-4 is estimated to have already absorbed a significant portion of the useful written knowledge available. For example, the Library of Congress could represent 9 trillion tokens, whereas the largest models are trained on over 20 trillion tokens. So, why continue training new models?

On the one hand, these new models incorporate audio and images (known as multimodal AI) and produce more complex reasoning. They thus become increasingly precise, and concrete use cases become more effective.

On the other hand, as it remains unclear exactly how these models function (they are "black boxes"), stakeholders adopt an empirical approach and continuously create new models, even to incorporate minor changes.

Today, numerous startups are attracting funding, fueling this training inflation in pursuit of "the best" model.

The second energy-consuming aspect of generative AI is querying the models, a process called inference. A textual query can require 3 Wh (already ten times more than a standard Google search), and in the future, one minute of video production (from a multimodal query) could demand over 100 Wh. A growing share of data center computational power will therefore be dedicated to model inference; various analyses estimate that 70-80% of the energy could ultimately be spent on it.
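To see how quickly inference adds up, here is a back-of-envelope estimate. The per-query energy values come from the figures above; the daily query volumes are purely hypothetical assumptions.

```python
# Back-of-envelope estimate of annual inference energy.
WH_PER_TEXT_QUERY = 3.0      # from the article: ~3 Wh per text query
WH_PER_VIDEO_MINUTE = 100.0  # from the article: >100 Wh per video minute

# Hypothetical daily volumes for a popular service (assumptions).
TEXT_QUERIES_PER_DAY = 1e9   # one billion text queries per day
VIDEO_MINUTES_PER_DAY = 1e7  # ten million minutes of video per day

daily_wh = (TEXT_QUERIES_PER_DAY * WH_PER_TEXT_QUERY
            + VIDEO_MINUTES_PER_DAY * WH_PER_VIDEO_MINUTE)
annual_twh = daily_wh * 365 / 1e12  # Wh -> TWh

print(f"Annual inference energy: {annual_twh:.2f} TWh")  # ~1.46 TWh
```

Even these modest hypothetical volumes yield around 1.5 TWh per year for a single service; multiplied across providers and heavier multimodal use, inference can plausibly come to dominate total consumption.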

Donald Trump's announcement of the Stargate project, which aims to build USD 500 billion worth of computational capacity in the United States, reflects the conviction that the leaders in generative AI will be those possessing the largest computational capacities, enabling more models (and control over access to knowledge) and more inference (benefiting from the most sophisticated uses). France, meanwhile, announced EUR 109 billion in investments ahead of the International AI Summit in Paris.

Growth-Limiting Factors

NVIDIA supplies most of the chips used for generative AI. The company experienced substantial growth from the beginning of 2023 (shortly after ChatGPT's launch) and is now doubling annually. It is estimated that, in 2023, NVIDIA produced over 3.7 million chips for data centers, including hundreds of thousands of its powerful H100 model. These chips would represent about 20 TWh of consumption, including manufacturing-related energy use, roughly double the previous year's figure.
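That ~20 TWh order of magnitude can be sanity-checked with a simple estimate. Only the chip count comes from the article; the per-chip power, utilization, and data center overhead (PUE) below are illustrative assumptions.

```python
# Sanity check of the ~20 TWh figure for NVIDIA's 2023 data center chips.
CHIPS = 3.7e6             # from the article: >3.7 million chips in 2023
AVG_CHIP_POWER_KW = 0.45  # assumption: average board power across the mix
UTILIZATION = 0.8         # assumption: fraction of time at that power
PUE = 1.4                 # assumption: overhead for cooling, power delivery

HOURS_PER_YEAR = 365 * 24
annual_twh = (CHIPS * AVG_CHIP_POWER_KW * UTILIZATION * PUE
              * HOURS_PER_YEAR / 1e9)  # kWh -> TWh

print(f"Estimated operating consumption: {annual_twh:.1f} TWh")  # ~16 TWh
```

Roughly 16 TWh of operating consumption under these assumptions; adding manufacturing-related energy brings the total into line with the article's ~20 TWh.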

Chip production was immediately identified as a bottleneck. It is also managed as a strategic asset amid the China-US rivalry: the versions sold in China are not the most advanced. NVIDIA's main clients in this segment are the hyperscalers (Amazon, Meta, Google), which aim to eventually develop their own chips, potentially easing supply tensions.

Electricity generation capacity risks becoming another limiting factor. According to a McKinsey study, data center demand is expected to account for 30-40% of new electrical capacity through 2030, within a context of growing demand for electricity, preferably decarbonized, driven by electrification (electric heating, electric vehicles (EVs), electrolysis).
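Combining the IEA figures cited above with a simple assumption about how new plants run gives a sense of the scale involved. The TWh figures come from the article; the average capacity factor of new generation is an assumption.

```python
# Implied new generation capacity for data centers by 2030.
TWH_2024 = 415.0        # from the article (IEA estimate)
TWH_2030 = 945.0        # from the article (IEA projection)
CAPACITY_FACTOR = 0.5   # assumption: blended mix of new generation

extra_twh = TWH_2030 - TWH_2024               # additional annual demand
avg_power_gw = extra_twh * 1e3 / 8760         # TWh/yr -> GWh/yr -> average GW
capacity_gw = avg_power_gw / CAPACITY_FACTOR  # nameplate capacity needed

print(f"Additional demand: {extra_twh:.0f} TWh/year")      # 530 TWh/year
print(f"Nameplate capacity needed: {capacity_gw:.0f} GW")  # ~120 GW
```

Roughly 120 GW of new capacity for data centers alone under these assumptions, on the order of the entire installed capacity of a large European country.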

Additionally, more localized problems may arise, such as the difficulty of building data centers (permits, availability of cooling equipment) or the feasibility of grid connections. In Ireland, data centers represent 21% of electricity consumption, leading EirGrid, the Irish grid operator, to suspend new data center connections near Dublin.

Data center operators are considering more autonomous power solutions, even contemplating small modular reactors (SMRs), compact nuclear reactors still in the design phase.

Data centers also consume a significant amount of water for cooling, which can be problematic. A medium-sized data center consumes one million liters of water per day, roughly the amount needed to irrigate approximately 70 hectares of crops.
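The 70-hectare comparison can be cross-checked as follows. The data center figure comes from the article; the crop water requirement is an assumed typical value for irrigated crops, and the comparison is taken on an annual basis.

```python
# Cross-check of the irrigation comparison.
LITERS_PER_DAY = 1e6         # from the article: ~1 million liters per day
M3_PER_HA_PER_YEAR = 5000.0  # assumption: ~5,000 m3/ha/year irrigation need

annual_m3 = LITERS_PER_DAY * 365 / 1000  # liters -> cubic meters
hectares = annual_m3 / M3_PER_HA_PER_YEAR

print(f"Equivalent irrigated area: {hectares:.0f} ha")  # ~73 ha
```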

An Uncertain Future

Data center development now competes with other vital infrastructures such as transportation, housing, and agriculture. The primary limitation on data center growth could thus stem from inadequate electrical infrastructure. SMRs, if realized, would not be operational for another 15 years.

Moreover, faced with surging costs and uncertainty around generative AI revenue sources, stakeholders are seeking more energy-efficient models. In particular, inference can be executed on less energy-intensive processors than those used for training, and smaller models mean lower inference costs. Some also challenge the race for ever-larger models, advocating more distributed architectures.

This pursuit of sobriety is also prevalent among Chinese actors, who lack access to NVIDIA's most powerful chips. An initial answer may have come in early 2025 from the Chinese company DeepSeek, which announced a conversational bot as effective as the leading American models, for a training cost of only USD 6 million, compared with hundreds of millions for American technology.

Thus, a combination of factors, including profitability, energy availability, and geostrategic considerations, might curb the frantic race for computational power. Ultimately, business rationality may encourage more sustainable energy practices, leading to smarter artificial intelligence (AI).
