Chapter 15.1
Efficiency Metrics: PUE, WUE, ERF, REF & the Post-PUE Metric Stack
PUE was built to expose wasted overhead in an air-cooled hall; in a liquid-cooled AI factory it quietly rewards moving losses inside the IT boundary and ignores the water, carbon, and useful-work questions that now decide whether a campus gets permitted — so the metric you optimize is itself a decision with consequences.
What you'll decide here
- Which metric you commission your facility against — PUE alone (and inherit its liquid-cooling blind spot) versus a stack that pins PUE, WUE, ERF/ERE, REF, CUE, and a work-based denominator (DCeP / tokens-per-MWh) so no single number can be gamed.
- Where you draw the measurement boundary — site versus source — because WUE-site flatters an evaporative tower while WUE-source exposes the water embedded in the megawatts it never reports.
- Whether you report TUE/ITUE alongside PUE, since direct-to-chip liquid shifts fan and pump power across the IT line and makes a worse facility look more efficient on PUE alone.
- What you put in the denominator of every efficiency ratio — nameplate IT, measured IT, or useful work — because a half-idle cluster posts a beautiful PUE while doing almost nothing.
- Which numbers are contractual or regulatory (EED Article 12, ISO/IEC 30134, customer SLAs) versus which are marketing, and whether your instrumentation plan can survive an audit rather than a press release.
Power Usage Effectiveness is the most quoted number in the industry and the most quietly broken. It was a brilliant intervention in 2007: a single ratio — total facility energy divided by IT energy — that let an operator see, for the first time, how many watts were burned on cooling, conversion, and lighting for every watt that reached a server. It drove a fifteen-year efficiency campaign that took hyperscale halls from a PUE near 2.0 to design points around 1.1. But PUE was designed for an air-cooled, conversion-heavy, fixed-IT world, and the AI data center is none of those things. The denominator it trusts — "IT energy" — is no longer a stable line. Liquid cooling moves fans and pumps across it. Densification makes it swing with utilization. And the questions that now gate a project — how much water, how much carbon, how much useful work — live entirely outside its formula.
This chapter is the canonical home for the efficiency metric stack referenced from Chapter 0.3 and Chapter 5.1. Each metric is a choice of what to make visible, and every choice hides something. We define PUE precisely and show exactly where it breaks for AI; separate WUE-site from WUE-source and the embedded-water trap between them; walk ERF/ERE, REF, and CUE as the carbon-and-reuse layer; introduce the work-based metrics (TUE/ITUE, DCeP, tokens-per-joule) that put real output in the denominator; and close on how to build a measurement plan that survives an audit instead of producing a vanity number for a press release.
PUE, precisely — and why it breaks for AI
PUE is total facility energy divided by IT equipment energy, measured over a full year (an instantaneous or design-point PUE is a different, weaker claim — always ask which one a vendor is quoting). A PUE of 1.20 means that for every 1.0 kWh delivered to IT, another 0.20 kWh went to everything else: cooling plant, UPS and transformer losses, switchgear, lighting. The industry's weighted-average annualized PUE has sat at roughly 1.54 for six consecutive years (Uptime Institute Global Survey 2025) — not because nobody is improving, but because legacy stock, hot-climate sites, and part-load operation drag the global mean while new builds design to 1.1–1.3. The headline number has stalled even as the frontier has moved.
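As a minimal sketch of why the annualized figure and a best-interval snapshot are different claims, the following computes both from metered interval energy. The `Interval` record, the function names, and the two-season readings are illustrative assumptions, not measured data.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One metering interval; both fields are energy in kWh (hypothetical record)."""
    facility_kwh: float  # everything behind the utility meter
    it_kwh: float        # energy delivered to IT equipment


def annualized_pue(intervals):
    """Energy-weighted PUE: total facility energy divided by total IT energy."""
    total_facility = sum(i.facility_kwh for i in intervals)
    total_it = sum(i.it_kwh for i in intervals)
    if total_it <= 0:
        raise ValueError("IT energy must be positive")
    return total_facility / total_it


def best_interval_pue(intervals):
    """The most flattering single interval, i.e. the snapshot a brochure might quote."""
    return min(i.facility_kwh / i.it_kwh for i in intervals)


# Hypothetical two-season year: more IT load in the hot season.
readings = [
    Interval(facility_kwh=1100.0, it_kwh=1000.0),  # cool season, ratio 1.10
    Interval(facility_kwh=1950.0, it_kwh=1500.0),  # hot season, ratio 1.30
]

print(round(annualized_pue(readings), 3))
print(round(best_interval_pue(readings), 3))
```

Summing energy before dividing weights each interval by its load, so a busy hot season counts for more than a quiet cool one; averaging the interval ratios would understate the result here.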
Three structural problems make PUE actively misleading for AI facilities, and each is a fork an operator must decide how to handle.
1. The liquid-cooling boundary problem. PUE's denominator is "power that crosses into the IT." In an air-cooled hall, server fans sit inside that boundary and CRAH fans sit outside it — clean. Direct-to-chip liquid blurs the line: cold plates, in-rack manifolds, and the pumps that move coolant can be metered as IT or as facility depending on where the CDU sits and how the rack is wired. Move a fan's power from the facility side to the IT side and PUE improves while nothing physical got more efficient — you simply re-labeled a loss. Worse, a genuinely efficient warm-water loop with a big chiller plant can post a higher PUE than a cheaper, hotter air hall, because the liquid plant's parasitic load is honestly accounted while the air hall's server-fan burn hides inside IT. PUE was never meant to compare cooling architectures, and using it to do so is the single most common metric error in 2026 procurement.
2. The utilization problem. PUE says nothing about whether the IT is doing useful work. A cluster running at 15% utilization can still post a gorgeous PUE, because the denominator counts energy drawn by the IT, not work done by it: idle accelerators and their hosts keep drawing power, every one of those watts counts as IT energy, and the ratio stays flattering while almost nothing is produced. An operator optimizing PUE in isolation is rewarded for keeping under-used hardware powered, which is the opposite of what the capital wants. This is why goodput-aware operators (see Chapter 12.2) refuse to manage on PUE alone.
3. The everything-else-is-invisible problem. PUE is an energy-overhead ratio. It is silent on water (a 1.10-PUE evaporative hall can drink millions of liters a day), silent on carbon (1.10 PUE on coal is dirtier than 1.4 PUE on hydro), and silent on whether heat is wasted or reused. The metrics below exist precisely to make those silences audible.
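The boundary problem in item 1 reduces to arithmetic. In this sketch, with purely hypothetical loads, the same 50 kW of pump and fan power is metered first as facility overhead and then as IT. Total consumption never changes, but the reported PUE does.

```python
def pue(total_kw, it_kw):
    """PUE as total facility power over power counted as IT."""
    return total_kw / it_kw


# Hypothetical loads in kW; none of these are measured values.
compute_kw = 1000.0        # servers, excluding the movable cooling load
movable_kw = 50.0          # CDU pumps / fans that can be metered on either side
other_overhead_kw = 150.0  # heat rejection, UPS and transformer losses, lighting

total_kw = compute_kw + movable_kw + other_overhead_kw  # identical in both cases

# Case A: the movable load is metered on the facility side.
pue_facility_side = pue(total_kw, compute_kw)

# Case B: the same load is metered inside the IT boundary.
pue_it_side = pue(total_kw, compute_kw + movable_kw)

print(round(pue_facility_side, 3))  # movable load counted as overhead
print(round(pue_it_side, 3))        # same watts, relabeled as IT
```

The improvement in case B is purely a metering artifact, which is why a stated boundary has to travel with any quoted PUE.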
WUE: site versus source, and the embedded-water trap
Water Usage Effectiveness is liters of water consumed per kWh of IT energy. It is the metric that has moved fastest from obscure to board-level, because water is now a siting gate (Chapter 3.7) and a social-license risk, not just an operating cost. The fork that matters is WUE-site versus WUE-source, and operators routinely report the flattering one.
WUE-site counts only the water consumed on the property — evaporative tower make-up, adiabatic pre-cooling, humidification. Industry averages run ~1.8–1.9 L/kWh for evaporative designs; best-in-class is 0.3–0.7 L/kWh; closed-loop and air-cooled designs approach zero. Microsoft's reported FY2025 fleet WUE-site is ~0.30 L/kWh, with next-generation closed-loop designs at essentially zero. WUE-site is the number that lets an arid, power-rich site become viable by designing water out — and it is the number that lets a thirsty evaporative hall look responsible if you stop counting at the fence.
WUE-source adds the water embedded in the electricity the facility consumes — the water evaporated at the power plant to make the megawatts. This is where the trap springs. The US-average energy water-intensity factor (EWIF) is roughly 5.1 L/kWh, and ranges from near zero (wind, solar, gas combined-cycle on dry cooling) to ~85 L/kWh (some thermoelectric plants on once-through or evaporative cooling). On a typical grid, far more water is evaporated to generate a data center's power (~7.6 L/kWh embedded) than to cool its chips (~1.8 L/kWh on-site). A closed-loop hall with WUE-site near zero can still carry a large WUE-source if it draws coal- or nuclear-heavy thermoelectric power. The decision: a site that goes water-free on cooling but sits on a thirsty thermal grid has not eliminated its water footprint — it has relocated it upstream and out of its own report. Disclose both, or disclose nothing honest.
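The relocation effect can be sketched numerically. This example assumes the embedded term is computed as EWIF multiplied by total facility energy (that is, EWIF × PUE per kWh of IT energy). The 5.1 L/kWh EWIF and 1.8 L/kWh on-site figures come from the text; the PUE of 1.2 and the 0.5 L/kWh low-water grid are hypothetical.

```python
def wue_site(site_water_l, it_kwh):
    """On-property water consumed per kWh of IT energy (L/kWh)."""
    return site_water_l / it_kwh


def wue_source(site_water_l, it_kwh, pue, ewif_l_per_kwh):
    """Site water plus water embedded in purchased electricity, per kWh of IT energy.

    Assumption: embedded water = EWIF x total facility energy,
    with total facility energy = PUE x IT energy.
    """
    embedded_l = ewif_l_per_kwh * pue * it_kwh
    return (site_water_l + embedded_l) / it_kwh


IT_KWH = 1_000_000.0  # hypothetical annual IT energy

# Closed-loop hall (no on-site water) on a US-average-EWIF grid.
closed_site = wue_site(0.0, IT_KWH)
closed_source = wue_source(0.0, IT_KWH, pue=1.2, ewif_l_per_kwh=5.1)

# Evaporative hall (1.8 L/kWh on site) on a hypothetical low-water grid.
evap_site = wue_site(1.8 * IT_KWH, IT_KWH)
evap_source = wue_source(1.8 * IT_KWH, IT_KWH, pue=1.2, ewif_l_per_kwh=0.5)

print(round(closed_site, 2), round(closed_source, 2))
print(round(evap_site, 2), round(evap_source, 2))
```

Under these assumptions the hall that reports zero at the fence carries the larger total footprint, which is the case for disclosing both numbers side by side.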
| Metric | Formula (denominator) | Makes visible | Boundary choice | Gamed by | Standard |
|---|---|---|---|---|---|
| PUE | Total facility / IT energy | Energy overhead (cooling + conversion) | IT-meter line | Moving losses inside IT; quoting design not annualized; reporting at low load | ISO/IEC 30134-2 |
| TUE / ITUE | ITUE = IT energy / compute energy (CPU/GPU/mem); TUE = ITUE × PUE | Losses inside the IT (fans, PSU, VRM) | The chip, not the chassis | Hard to game, but rarely reported; needs IT-internal metering | LBNL / EE-HPC WG |
| WUE-site | On-site water / IT energy (L/kWh) | On-property water (tower make-up) | Property fence | Counting only on-site; ignoring embedded grid water | ISO/IEC 30134-9 |
| WUE-source | On-site + embedded grid water / IT energy | Total water incl. power-gen evaporation | Generation source | Choosing a low-EWIF grid on paper; market-based hand-waving | Green Grid WP#35 |
| ERF / ERE | Reused energy / total (ERF); ERE = (1 − ERF) × PUE | Waste heat actually reused off-site | Reuse offtake meter | Claiming reuse with no real offtaker; counting intent not delivery | ISO/IEC 30134-6 |
| REF | Renewable energy / total energy | Share of consumption from renewables | Energy-attribute boundary | Unbundled RECs / annual matching vs 24/7 CFE | ISO/IEC 30134-3 |
| CUE | Total operational CO2e / IT energy = CEF × PUE (kgCO2e/kWh) | Carbon intensity of the work | Location- vs market-based | Market-based accounting that ignores the actual grid burned | ISO/IEC 30134-8 |
| DCeP | Useful work / total facility energy | Whether the energy bought anything | Operator-defined 'useful work' | Defining 'work' to flatter the result; not comparable across operators | Green Grid |
Each metric covers another's silence. PUE is silent on water; WUE fills that gap but is silent on carbon; CUE fills that but is silent on whether the energy did anything; DCeP fills that but is operator-defined and not comparable. No single metric is sufficient, and any single metric reported alone is an invitation to game the boundary. The honest move is to report a stack — PUE + WUE(site and source) + ERF + REF + CUE + a work-based denominator — so that improving one at the expense of another shows up immediately. An operator who lowers PUE by raising water use, or lowers WUE-site by ignoring WUE-source, is caught the moment both lines sit on the same page.
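One way to make "caught the moment both lines sit on the same page" concrete is to compare two reporting periods across the stack and list every metric that moved the wrong way. This is a sketch only: the metric keys, the direction sets, and the before/after values are hypothetical.

```python
# Direction of "better" for each metric key (keys are illustrative).
LOWER_IS_BETTER = {"pue", "wue_site", "wue_source", "cue"}
HIGHER_IS_BETTER = {"erf", "ref", "work_per_mwh"}


def regressions(before, after):
    """Return the metrics, present in both reports, that got worse."""
    worse = []
    for name in sorted(before.keys() & after.keys()):
        if name in LOWER_IS_BETTER and after[name] > before[name]:
            worse.append(name)
        elif name in HIGHER_IS_BETTER and after[name] < before[name]:
            worse.append(name)
    return worse


# Hypothetical retrofit: PUE improves because evaporative cooling was added.
before = {"pue": 1.25, "wue_site": 0.4, "erf": 0.10}
after = {"pue": 1.15, "wue_site": 1.6, "erf": 0.10}

print(regressions(before, after))
```

A report carrying only PUE would show this retrofit as a clean win; the stack shows what was traded for it.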
The reuse-and-carbon layer: ERF/ERE, REF, CUE
ERF (Energy Reuse Factor) is the share of a facility's energy that is beneficially reused outside its boundary — almost always waste heat exported to a district-heating loop or industrial offtaker (engineered in Chapter 5.9, monetized and regulated in Chapter 15.5). Its sibling ERE (Energy Reuse Effectiveness) folds reuse into the PUE frame: ERE = (Total − Reused) / IT, equivalently ERE = (1 − ERF) × PUE. ERE can legitimately drop below 1.0 — the only efficiency number in the building that can — because exported heat is credited back. The consequence to watch: ERF is the metric most prone to intent inflation. A signed MOU with a municipality is not delivered heat; ISO/IEC 30134-6 and EN 50600-4-6 require measured offtake at the boundary, not a press release. In Germany, ERF is no longer optional: the Energy Efficiency Act (EnEfG) mandates a minimum 10% ERF from July 2026, 15% from 2027, 20% from 2028 for new facilities above a non-redundant power threshold — a hard regulatory floor, not a sustainability aspiration.
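The two ERE forms given above can be checked against each other with a small worked example. The energy figures are hypothetical; the function names are illustrative.

```python
def erf(reused_kwh, total_kwh):
    """Energy Reuse Factor: share of total facility energy reused outside the boundary."""
    return reused_kwh / total_kwh


def pue(total_kwh, it_kwh):
    """Power Usage Effectiveness: total facility energy over IT energy."""
    return total_kwh / it_kwh


def ere(total_kwh, reused_kwh, it_kwh):
    """Energy Reuse Effectiveness: (Total - Reused) / IT."""
    return (total_kwh - reused_kwh) / it_kwh


# Hypothetical annual energies in kWh.
total_kwh = 1200.0
it_kwh = 1000.0
reused_kwh = 300.0  # measured heat delivered to an offtaker, not an MOU

erf_value = erf(reused_kwh, total_kwh)
ere_direct = ere(total_kwh, reused_kwh, it_kwh)
ere_from_pue = (1.0 - erf_value) * pue(total_kwh, it_kwh)

print(round(erf_value, 3))
print(round(ere_direct, 3), round(ere_from_pue, 3))
```

With a quarter of the facility's energy exported, ERE lands below 1.0 even though PUE is 1.2, and both formulations agree.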
REF (Renewable Energy Factor) is the renewable share of consumption (ISO/IEC 30134-3). The decision buried inside it is the accounting boundary: an unbundled-REC annual-matching REF of 100% can coexist with a facility that runs on a fossil grid every night, while a 24/7 carbon-free-energy (CFE) score tells the truth hour by hour. The gap between annual matching and 24/7 CFE is the entire subject of Chapter 15.3; here the point is only that REF without a temporal qualifier is a number that hides its own assumptions.
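The gap between annual matching and hourly matching shows up even in a four-interval toy year. The sketch below uses a simplified hourly-matched share as a stand-in for a 24/7 CFE score (a real CFE methodology may also credit carbon-free energy already on the grid, which is ignored here). The load and supply profiles are hypothetical.

```python
def annual_matched_ref(load_kwh, renewable_kwh):
    """Volume matching: total renewable purchases over total load, capped at 100%."""
    return min(1.0, sum(renewable_kwh) / sum(load_kwh))


def hourly_matched_share(load_kwh, renewable_kwh):
    """Share of load covered by renewable supply in the same interval.

    Surplus in one interval cannot be carried over to cover another.
    """
    matched = sum(min(load, supply) for load, supply in zip(load_kwh, renewable_kwh))
    return matched / sum(load_kwh)


# Hypothetical profile: flat load, solar-only contract (two day and two night intervals).
load = [100.0, 100.0, 100.0, 100.0]
solar = [200.0, 200.0, 0.0, 0.0]

print(annual_matched_ref(load, solar))
print(hourly_matched_share(load, solar))
```

The same contract produces a 100% annual figure and a 50% hourly figure, so the temporal qualifier decides which story the number tells.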
CUE (Carbon Usage Effectiveness) is total operational CO2e per unit of IT energy (ISO/IEC 30134-8) — equivalently CUE = CEF × PUE, where CEF is the grid carbon emission factor in kgCO2e/kWh and PUE scales it by facility overhead. It is the metric that finally makes a low-PUE-on-dirty-power facility look as bad as it is. The fork is location-based versus market-based carbon: location-based reflects the physical grid you actually burned; market-based credits contracts and instruments. Both are legitimate under the GHG Protocol, but they answer different questions, and an operator reporting only the flattering one is choosing what to hide. CUE is also silent on embodied carbon — the concrete, steel, and silicon — which is now the dominant lifecycle term and is treated separately in Chapter 15.6.
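A short sketch of CUE = CEF × PUE makes the dirty-power case explicit. The emission factors below are round illustrative assumptions for a coal-heavy and a hydro-heavy supply, not published grid figures, and the location-based and market-based factors in the second part are likewise hypothetical.

```python
def cue(cef_kg_per_kwh, pue):
    """Carbon Usage Effectiveness in kgCO2e per kWh of IT energy."""
    return cef_kg_per_kwh * pue


# Illustrative emission factors (assumptions, kgCO2e/kWh).
CEF_COAL_HEAVY = 0.90
CEF_HYDRO_HEAVY = 0.02

low_pue_dirty = cue(CEF_COAL_HEAVY, pue=1.10)
high_pue_clean = cue(CEF_HYDRO_HEAVY, pue=1.40)


def cue_both_bases(cef_location, cef_market, pue):
    """Report CUE on both accounting bases rather than choosing one."""
    return {
        "location_based": cue(cef_location, pue),
        "market_based": cue(cef_market, pue),
    }


print(round(low_pue_dirty, 3), round(high_pue_clean, 3))
print(cue_both_bases(cef_location=0.40, cef_market=0.05, pue=1.20))
```

Under these assumptions the facility with the better PUE emits roughly 35 times more per IT kWh, and the two accounting bases for a single site can differ by nearly an order of magnitude.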
Work-based metrics: putting output in the denominator
Every metric above is an infrastructure-efficiency ratio: it asks how cleanly you delivered energy to IT, never whether the IT produced anything. For AI, where a cluster's value is tokens generated or models trained, this is the deepest blind spot — and the one that separates a vanity number from an operating discipline.
TUE and ITUE (proposed by Patterson at Intel with the LBNL Energy-Efficient HPC Working Group) push the boundary inward. ITUE is PUE's logic applied inside the server: total IT energy divided by the energy that actually reaches compute silicon (CPU, GPU, memory, storage), separating out server fans, PSU conversion, and VRM losses. TUE = ITUE × PUE is then the only number that honestly compares an air-cooled and a liquid-cooled facility, because it refuses to let the boundary shift hide a loss. Its cost is instrumentation: you need per-server or per-component telemetry, which is why adoption lags PUE by a decade despite TUE being the more correct number.
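The following sketch compares a hypothetical air-cooled hall with a hypothetical liquid-cooled one that delivers the same compute power. All loads are invented for illustration; the point is only that the PUE ranking and the TUE ranking can disagree.

```python
def pue(total_kw, it_kw):
    """Facility total over power counted as IT."""
    return total_kw / it_kw


def itue(it_kw, compute_kw):
    """IT power over power that reaches compute silicon."""
    return it_kw / compute_kw


def tue(total_kw, compute_kw):
    """Facility total over compute power; equals ITUE x PUE."""
    return total_kw / compute_kw


# Hypothetical loads in kW. Both halls deliver 850 kW to compute silicon.
air = {"compute": 850.0, "it": 1000.0, "total": 1300.0}     # server fans sit inside IT
liquid = {"compute": 850.0, "it": 900.0, "total": 1200.0}   # pumps metered as facility

for name, hall in (("air", air), ("liquid", liquid)):
    print(
        name,
        round(pue(hall["total"], hall["it"]), 3),
        round(itue(hall["it"], hall["compute"]), 3),
        round(tue(hall["total"], hall["compute"]), 3),
    )
```

Here the liquid hall posts the higher PUE yet the lower TUE, because removing server fans shrinks the IT denominator faster than it shrinks the total.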
DCeP (Data Center Energy Productivity) is the boldest move: useful work divided by total facility energy. The Green Grid deliberately left "useful work" operator-defined: searches served, transactions cleared, or, for AI, tokens generated or training FLOPs delivered, expressed per MWh. DCeP's strength is that it cannot be gamed by under-utilization: a half-idle cluster posts a terrible DCeP precisely because it produced little. Its weakness is that operator-defined work is not comparable across operators, so DCeP is an internal optimization metric, not a benchmark. For 2026 AI operators the practical instantiation is tokens per joule (equivalently, tokens per second per watt) or a cost form such as $ per million tokens at fixed carbon: a single figure that fuses model efficiency, hardware efficiency, and facility efficiency into one number the business actually cares about. This is the metric that aligns the efficiency team with the goodput team, because both are now optimizing useful output per unit of scarce power, which in a power-bound era is the only ratio that matters.
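A minimal sketch of a token-based DCeP, assuming "useful work" is defined as tokens generated. The two clusters, their energy draw, and their token counts are hypothetical; both are given the same PUE so that only the work-based figure separates them.

```python
JOULES_PER_KWH = 3.6e6


def tokens_per_mwh(tokens, facility_kwh):
    """DCeP with tokens as the unit of useful work, per MWh of facility energy."""
    return tokens / (facility_kwh / 1000.0)


def tokens_per_joule(tokens, facility_kwh):
    """The same productivity expressed per joule of facility energy."""
    return tokens / (facility_kwh * JOULES_PER_KWH)


PUE = 1.2  # identical for both clusters (assumption)

# Hypothetical reporting period: idle hardware still draws power.
busy = {"it_kwh": 1000.0, "tokens": 6.0e9}
half_idle = {"it_kwh": 700.0, "tokens": 0.9e9}

busy_dcep = tokens_per_mwh(busy["tokens"], busy["it_kwh"] * PUE)
idle_dcep = tokens_per_mwh(half_idle["tokens"], half_idle["it_kwh"] * PUE)

print(f"{busy_dcep:.3e} tokens/MWh")
print(f"{idle_dcep:.3e} tokens/MWh")
print(f"{tokens_per_joule(busy['tokens'], busy['it_kwh'] * PUE):.3f} tokens/J")
```

Both clusters would report a PUE of 1.2; only the work-based figure shows that one of them is producing several times more per unit of energy.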
The regulatory floor: you no longer choose whether to measure
Until recently the metric stack was a matter of operator discipline and customer pressure. In the EU it is now law. The recast Energy Efficiency Directive, Article 12, requires every data center with installed IT power demand at or above 500 kW to report annually — by 15 May, covering the prior calendar year — into the European database on data centers. The four mandatory indicators are exactly PUE, WUE, ERF, and REF, with energy and water consumption, floor area, and traffic also disclosed. Germany's EnEfG layers the ERF reuse mandate on top; national implementations across the bloc add their own teeth. The standards backbone underneath is the ISO/IEC 30134 KPI series and the EN 50600 family (now migrating into the ISO/IEC 22237 series), which fix the definitions so a regulator and an operator mean the same thing by "PUE." The disclosure landscape — CSRD/ESRS, ISSB, and the standards stack — is the subject of Chapter 15.7; the consequence for this chapter is simpler: the boundary games above are increasingly auditable, and a number that flattered a press release will not survive a regulator's methodology check.
Deep dive: building a measurement plan that survives an audit (and avoids vanity numbers)
The gap between a credible efficiency program and a vanity number is almost never the headline figure — it is the measurement plan behind it. A defensible plan answers five questions before it publishes a single ratio, and each answer is a decision with a downstream consequence.
1. Where is the boundary? Fix, in writing, the exact meter points for "IT energy," "facility energy," and (for liquid) where the CDU and rack pumps are counted. Re-deriving PUE after a liquid retrofit without re-stating the boundary produces a phantom improvement. Publish the boundary diagram alongside the number.
2. Instantaneous, peak, or annualized? Only annualized PUE/WUE means anything — it captures part-load, seasonal economization, and the bad days. A design-point or best-hour figure is a marketing claim. State the averaging window and the data resolution (sub-hourly is the modern bar).
3. Site or source — and which carbon basis? Commit to reporting WUE-site and WUE-source, and CUE on both location-based and market-based bases. Reporting only the flattering side of each pair is the most common audit failure, because the omission is itself a choice the auditor can see.
4. What is in the denominator? Decide whether efficiency ratios are normalized by nameplate IT, measured IT, or useful work — and disclose utilization alongside PUE so a low ratio earned by idling is visible. Pair every infrastructure metric with at least one work-based metric (tokens-per-MWh or DCeP) so the energy is accountable to output.
5. Who attests, and against which standard? Tie each number to its ISO/IEC 30134 part or EN 50600 clause, name the measurement equipment and calibration cadence, and put a named owner on the attestation. A metric without a standard reference and an owner is a marketing asset, not a measurement. This is the difference between a number that helps you operate and a number that exists to be quoted.
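The five questions can be encoded as a simple completeness check that runs before any number is published. This is a sketch under stated assumptions: the `MetricDeclaration` record, its field names, the metric labels, and the example values are all hypothetical, and a real plan would carry more detail (meter IDs, calibration records, data resolution).

```python
from dataclasses import dataclass


@dataclass
class MetricDeclaration:
    """What must travel with a published metric (hypothetical schema)."""
    name: str
    value: float
    boundary: str = ""   # where the meters sit, in words or a diagram reference
    window: str = ""     # e.g. "annualized", "design-point", "best-hour"
    standard: str = ""   # e.g. an ISO/IEC 30134 part
    owner: str = ""      # named person or role attesting the number


# Metrics that should only be reported together.
REQUIRED_PAIRS = [("WUE-site", "WUE-source"), ("CUE-location", "CUE-market")]


def audit_gaps(decl):
    """List what an auditor would find missing from one declaration."""
    gaps = []
    if not decl.boundary:
        gaps.append("boundary not stated")
    if decl.window != "annualized":
        gaps.append("window is not annualized")
    if not decl.standard:
        gaps.append("no standard reference")
    if not decl.owner:
        gaps.append("no named owner")
    return gaps


def missing_pairs(reported_names):
    """Return the counterpart of any metric reported without its pair."""
    missing = []
    for first, second in REQUIRED_PAIRS:
        if first in reported_names and second not in reported_names:
            missing.append(second)
        elif second in reported_names and first not in reported_names:
            missing.append(first)
    return missing


press_release = MetricDeclaration("PUE", 1.08, window="design-point")
audited = MetricDeclaration(
    "PUE", 1.21,
    boundary="utility meter to rack PDU output; CDU pumps counted as facility",
    window="annualized",
    standard="ISO/IEC 30134-2",
    owner="site energy manager",
)

print(audit_gaps(press_release))
print(audit_gaps(audited))
print(missing_pairs({"PUE", "WUE-site", "CUE-location", "CUE-market"}))
```

The check says nothing about whether a value is good; it only enforces that the context needed to interpret it is present.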
Deep dive: why AI inverts the PUE-WUE tradeoff every operator thought they understood
For a decade the efficiency playbook had a stable tradeoff: evaporative cooling buys you a lower PUE (less compressor work) at the cost of a higher WUE (more water evaporated). Operators slid along that curve to taste. AI densification breaks the curve in two ways, and the break is a decision an operator must now make consciously.
First, at 130 kW per rack and beyond, air is off the table (Chapter 5.1): the choice is no longer air versus evaporative but which liquid architecture. A warm-water, closed-loop direct-to-chip design that rejects heat through dry coolers can deliver both a low PUE and a near-zero WUE-site, collapsing the old tradeoff. The price is more capital and, in hot climates, a higher PUE, because dry cooling needs compressor help on the worst days. The operator who used to trade PUE against water now trades PUE against capex and climate resilience.
Second, the embedded-water term dominates. Once WUE-site is engineered toward zero, WUE-source — the water in the megawatts — becomes the whole footprint. At that point the highest-leverage water decision is not the cooling tower at all; it is the grid carbon-and-water intensity of the site, which is a siting and procurement decision (Chapter 3.7, Chapter 15.3), not a mechanical one. AI thus pushes the binding efficiency decision upstream, out of the plant room and into site selection — which is exactly where a power-bound industry should expect its hardest constraints to live.
The decision, stated plainly
The metric you optimize is a statement about what you are willing to make invisible. Optimize PUE alone and you will be rewarded for moving losses inside the IT boundary, for under-utilizing capacity, and for ignoring every liter of water and gram of carbon the formula cannot see. The 2026-current discipline is to refuse the single number and commission against a stack: PUE for energy overhead, TUE when cooling modality is in question, WUE-site and WUE-source together, ERF/ERE for reuse, REF and CUE for carbon (with a temporal and location-vs-market qualifier on each), and a work-based denominator — tokens-per-MWh or DCeP — so the whole apparatus is accountable to output. Each metric covers another's silence; reported together, none can be gamed without the gaming showing up on the same page. In a power-bound era, the only ratio the business ultimately cares about is useful work per unit of scarce energy and water — and that ratio is invisible to PUE by construction.