
The Next Evolution of AI Infrastructure: Modular, Distributed, and Inference-First

24th November 2025

This article comes from our ‘AI Infrastructure as a Trojan Horse for Climate Infrastructure’ whitepaper, published October 2025. 

TL;DR

  • AI is shifting from training to inference, and by 2030 most workloads will need smaller, faster, distributed data centres close to users and energy.

  • Modular inference centres cut latency and environmental impact while enabling heat reuse, lower water use, and seamless integration with local renewables.

  • With intentional design, these centres become climate infrastructure, anchoring clean power, supporting low-carbon materials, and delivering real community benefits.

The Arrival of the Inference Era


For the past few years, “AI infrastructure” has mostly meant one thing: massive clusters built for training ever-larger models. These billion-dollar compute farms and trillion-parameter breakthroughs have dominated the headlines. But they’re only half the story — and increasingly, the smaller half.


By 2030, as much as 75% of AI workloads are expected to come from inference — the real-time, user-facing side of AI that powers everything from autonomous vehicles and robotics to healthcare and live video applications. This shift is already clear in trends like the rise of test-time scaling and edge AI. Training creates intelligence; inference makes it useful in the world.
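
To make test-time scaling concrete, here is a minimal sketch of one common pattern, best-of-n sampling, in which extra compute is spent at inference time rather than training time. The `generate` and `score` functions are hypothetical stand-ins for a real model and a real verifier, not any particular API.

```python
# Minimal sketch of test-time scaling via best-of-n sampling.
# `generate` and `score` are hypothetical stand-ins, not a real API.
import random

def generate(prompt: str, seed: int) -> str:
    # Stand-in for a model call; returns a dummy candidate answer.
    random.seed(seed)
    return f"{prompt} -> candidate-{random.randint(0, 999)}"

def score(answer: str) -> float:
    # Stand-in for a verifier or reward model scoring a candidate.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # More samples = more inference-time compute = (often) a better answer.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)

print(best_of_n("route this delivery fleet"))
```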

And critically: inference is reshaping the physical architecture of AI.


Unlike training, which benefits from scale and centralisation, inference demands something entirely different. It must be close to users, close to energy, and close to communities. It must respond in milliseconds, operate continuously rather than in scheduled cycles, and run inside applications where reliability is non-negotiable. These characteristics are explored in depth in this overview of the differences between training and inference. This shift is driving a quiet revolution: a move toward right-sized, modular, and distributed data centres.
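
A back-of-envelope latency budget shows why proximity matters so much. The sketch below assumes light travels through optical fibre at roughly two-thirds of c (about 200 km per millisecond) and ignores queueing, routing hops, and last-mile overhead; the distances and model compute time are illustrative, not measured.

```python
# Back-of-envelope latency budget for a single inference request,
# assuming ~200 km of fibre per millisecond (roughly two-thirds of c)
# and ignoring queueing, routing hops, and last-mile overhead.
FIBRE_KM_PER_MS = 200.0  # one-way propagation through fibre

def response_time_ms(distance_km: float, model_compute_ms: float) -> float:
    network_rtt_ms = 2 * distance_km / FIBRE_KM_PER_MS  # round trip
    return network_rtt_ms + model_compute_ms

# Hypothetical numbers: a 30 ms model served from a distant hyperscale
# region (2,000 km away) vs a nearby modular inference centre (50 km).
print(f"far:  {response_time_ms(2000, 30):.1f} ms")  # -> 50.0 ms
print(f"near: {response_time_ms(50, 30):.1f} ms")    # -> 30.5 ms
```

Even in this idealised model, serving the same 30 ms model from 2,000 km away adds 20 ms of pure propagation delay that no amount of hardware can remove.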


Deep Learning Explained | Source

Why Inference Wants to Be Distributed


Inference behaves differently from training: it’s latency-sensitive, continuous, and embedded directly in user-facing applications. These characteristics make it far better suited to a distributed model, as outlined in this comparison of the deep learning workflow.


1. Lower latency, higher resilience

Putting compute closer to users cuts response times and reduces the impact of outages. A distributed footprint also keeps more data local, strengthening privacy and sovereignty, as captured in this overview of edge computing.
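
As a rough sketch of the resilience point, a client with several nearby inference sites can simply fail over when the closest one is unreachable. The endpoint URLs below are hypothetical placeholders, not real services.

```python
# Sketch of client-side failover across distributed inference sites.
# Endpoint URLs are hypothetical placeholders.
import urllib.request

ENDPOINTS = [
    "https://inference.site-a.example/v1/predict",  # nearest site
    "https://inference.site-b.example/v1/predict",  # regional fallback
]

def infer(payload: bytes, timeout_s: float = 0.2) -> bytes:
    last_error = None
    for url in ENDPOINTS:  # try sites in order of proximity
        try:
            req = urllib.request.Request(url, data=payload, method="POST")
            with urllib.request.urlopen(req, timeout=timeout_s) as resp:
                return resp.read()
        except OSError as err:  # connection refused, timeout, DNS failure
            last_error = err
    raise RuntimeError("all inference sites unreachable") from last_error
```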


2. Turning externalities into community benefits

Modular centres can integrate circular systems from day one. Dry cooling cuts water use by up to 70%, highlighted in this analysis of circular cooling solutions, while heat reuse projects can feed pools, greenhouses, or district networks, as shown in examples of data-centre heat reuse.
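
The scale of these circular benefits is easy to estimate. In the rough sketch below, the 70% water saving comes from the analysis cited above; the 1 MW facility size, cooling-water baseline, heat-reuse fraction, and household heating demand are illustrative assumptions, not measured values.

```python
# Rough arithmetic behind the circular-systems claims. Only the 70%
# water saving comes from the article; all other figures are assumed.
IT_LOAD_KW = 1_000            # a 1 MW modular inference centre (assumed)
HOURS_PER_YEAR = 8_760
WUE_L_PER_KWH = 1.8           # assumed evaporative-cooling water baseline
DRY_COOLING_SAVING = 0.70     # "up to 70%" reduction (from the article)
HEAT_REUSE_FRACTION = 0.60    # assumed share of IT heat actually captured
HOME_HEAT_DEMAND_KW = 1.5     # assumed average household heating load

annual_kwh = IT_LOAD_KW * HOURS_PER_YEAR
water_saved_m3 = annual_kwh * WUE_L_PER_KWH * DRY_COOLING_SAVING / 1_000
homes_heated = IT_LOAD_KW * HEAT_REUSE_FRACTION / HOME_HEAT_DEMAND_KW

print(f"water saved:  ~{water_saved_m3:,.0f} m3/year")   # ~11,038 m3
print(f"homes heated: ~{homes_heated:,.0f} (continuous equivalent)")
```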


3. Built to partner with renewables

Smaller centres can colocate with geothermal, hydro, or curtailed solar and wind, improving both economics and grid flexibility. Research on data centres in future energy systems shows how they can align workloads with renewable availability.
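
A minimal sketch of what aligning workloads with renewable availability can mean in practice: latency-sensitive inference runs whenever it arrives, while deferrable work (batch embeddings, re-indexing, model refreshes) is shifted into the hours with the highest forecast renewable share. The hourly forecast below is a toy example, not data from the cited research.

```python
# Toy renewable-aware scheduler: place deferrable work in the hours
# with the highest forecast renewable share. All numbers illustrative.
renewable_share = [0.2, 0.1, 0.1, 0.3, 0.6, 0.9, 0.95, 0.8,
                   0.5, 0.3, 0.2, 0.15]  # hypothetical hourly forecast
deferrable_job_hours = 4  # hours of flexible work to place today

greenest_hours = sorted(range(len(renewable_share)),
                        key=lambda h: renewable_share[h],
                        reverse=True)[:deferrable_job_hours]

print("run deferrable work in hours:", sorted(greenest_hours))
# -> hours 4-7, when the grid is cleanest in this toy forecast
```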


4. Faster, cheaper, lower-risk deployment

Modular designs can be deployed in weeks rather than years, scaled incrementally, and financed with lower upfront capital. This reduces stranded-asset risk, explored in this piece on the environmental costs of unused infrastructure.
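
A toy model makes the stranded-asset point concrete: capacity added in small modules tracks demand closely, while a monolithic day-one build sits partly idle for years. Every figure below (demand curve, module size, build capacities) is hypothetical.

```python
# Toy comparison of modular vs monolithic build-out against demand.
# All figures are hypothetical; the point is that incremental capacity
# tracks demand and leaves less stranded (built-but-idle) infrastructure.
demand_mw = [2, 3, 5, 6, 8, 9, 11, 12]  # assumed demand per quarter

monolithic_capacity = 16                 # one big build, day one
modular_capacity, module_mw = 0, 2       # add 2 MW modules as needed

monolithic_idle = modular_idle = 0.0
for d in demand_mw:
    while modular_capacity < d:         # deploy another module
        modular_capacity += module_mw
    monolithic_idle += monolithic_capacity - d
    modular_idle += modular_capacity - d

print(f"idle MW-quarters, monolithic: {monolithic_idle:.0f}")  # 72
print(f"idle MW-quarters, modular:    {modular_idle:.0f}")     # 4
```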


5. A demographic and geopolitical imperative

High-growth regions like Africa and South Asia are seeing rapid population expansion, with projections from the UN on global demographic shifts and analyses of age distributions. Yet cloud infrastructure remains unevenly distributed, as shown in global population projection maps and data centre density comparisons. Distributed inference centres can help close this gap.

Top: UN population growth projections (2015-2100). Bottom: number of data and cloud centres, with countries that host more shaded a darker red. The geography of data and cloud centres is uneven, with significant heterogeneity across countries; together, the two maps make clear the gap between future demand growth and the current deployment of data centres. | Top Source | Bottom Source

Inference-First Data Centres as Climate Infrastructure


If training demands centralisation, inference opens the door to distribution. The move from training-first to inference-first architectures isn't just a technical shift; it's a generational opportunity. Inference unlocks a new model: smaller, distributed compute that can be designed as climate-aligned, community-aligned infrastructure.


With intentional design, such as embedding Opna's core pillars of climate-aligned infrastructure, these centres can:

  • underwrite new clean energy projects, such as the multi-billion-dollar clean energy partnerships for data centres;

  • drive demand for low-carbon materials, evidenced by growing investment in low-carbon cement production;

  • scale carbon removal and water replenishment through emerging solutions like mineralisation-based CO₂ storage and large-scale water stewardship projects;

  • embed circular heat reuse in industries and communities, demonstrated by initiatives that use data centre heat to warm greenhouses.


This is the real promise of the inference era: infrastructure that is not only technologically efficient but socially and ecologically productive.

Mind-map of the seven pillars of data centres as climate Trojan horses | Source: Opna

Read more about creating AI infrastructure as climate infrastructure in our latest whitepaper, or contact us directly at hi@opna.earth.


Keep reading…


Whitepaper: AI Infrastructure as Climate Infrastructure

© 2025 Salt Global UK Limited. All rights reserved.
