Chapter 5.2
Air Cooling at the Limit
Air cooling did not die at the density wall — it was pushed to a hard, well-defined economic ceiling around 40–50 kW per rack, and the engineering decision is no longer whether to use it but where it still wins outright and how far you can responsibly push it before the fan-power curve, not the chip, makes the choice for you.
What you'll decide here
- Raised floor versus slab — whether you inherit a plenum you must manage or commit to overhead distribution and containment, and what that costs you in retrofit optionality.
- Which containment scheme (cold-aisle, hot-aisle, chimney) and which heat-transfer plant (CRAC vs CRAH vs in-row vs fan-wall) you standardize on, and the PUE and serviceability consequences of each.
- How warm you dare run supply air under ASHRAE A1–A4 — the free-cooling hours and PUE you buy against the reliability and acoustic margin you spend.
- The honest per-rack ceiling for your hall — the ~40–50 kW where fan power, approach temperature, and recirculation make air uneconomic before it becomes impossible.
- Which loads stay on air permanently (storage, networking, modest-density inference, edge) versus which are merely waiting for the liquid retrofit — because designing those two as one hall is the recurring mistake.
Almost every AI operator is migrating off air, and almost none has left it entirely. A 2026 AI campus that runs GB200 NVL72 racks on direct-to-chip liquid still rejects 10–15% of every rack's heat to air, still cools its entire networking spine and storage tier with moving air, and, if it is honest about its fleet, still runs the majority of its floor area, if not its megawatts, on air-cooled CPU and modest-density inference. The air-cooling chapter is therefore not a historical footnote. It covers the load that liquid does not take, the brownfield you cannot re-pour, and the regime where the cheapest correct answer is still a fan and a coil.
The forks here are mechanical and the consequences are measured in PUE points, fan watts, and stranded floor. We work through the four real decisions: raised floor versus slab (the distribution substrate), containment and the air-handling plant (CRAC vs CRAH vs in-row vs fan-wall), how warm you run the supply air (the ASHRAE envelope as an economic dial), and how far you can actually push a rack before air loses on cost rather than physics. We close on the load that stays on air for good. The density wall itself — the heat-flux physics and the cooling hierarchy — is established in Chapter 5.1; this chapter is what you do on the air side of that wall.
Raised floor vs slab: the substrate decision
The first fork predates the AI era and constrains everything after it: do you distribute cold air under the IT or over it? The raised access floor (perforated tiles over a pressurized plenum, the iconic 600 mm void) was the industry default for thirty years because it hid cabling and let a CRAC unit pressurize the whole room. It is also a recirculation trap and a structural ceiling. A loaded 48U liquid-ready rack runs 1,200–2,300 kg (and a wet NVL72 approaches ~1,360 kg / 3,000 lb before you add the CDU and coolant); raised-floor systems are rated for point and rolling loads that high-density AI racks routinely exceed, which is why the dense AI hall has, quietly, gone back to the slab.
Slab-on-grade with overhead air distribution — supply ducted down from above or pushed by perimeter/in-row units, return through a ceiling plenum — removes the structural limit, removes the under-floor obstruction that wrecks plenum pressure once you fill it with liquid pipe and busway, and is the only sane substrate for racks heavy enough to need liquid. The consequence you accept is that you lose the plenum as a free distribution path: every kilogram of air now has to be moved by a unit you can see, which makes airflow management an active engineering problem rather than a passive room-pressure one. For a greenfield AI hall this is the right trade — you were going to plumb the floor for liquid distribution anyway, and a raised floor full of coolant manifolds is a leak you cannot see. For a brownfield air hall, the existing raised floor is exactly the constraint that caps your retrofit, a thread picked up in Chapter 5.10.
Containment and the air-handling plant
The single highest-return air-side intervention is the cheapest: stop hot and cold air from mixing. Uncontained, a room's supply and return blend, you over-cool to compensate, and your fans and chillers work against your own bypass. Containment — sealing either the cold aisle (a roof and end-doors over the supply aisle) or the hot aisle (capturing the exhaust and ducting it back) — is the difference between a PUE in the 1.5s and one approaching the low 1.2s on the same plant. The fork between the two is operational: cold-aisle containment keeps the working aisle comfortable and the room hot, is cheaper to retrofit, but bakes the rest of the floor (and the people in it) in return air; hot-aisle containment keeps the room at supply temperature and pipes the hot exhaust away, which is better for warm-supply operation and mixed liquid/air halls but costs more and complicates the ceiling. For a hall that will host rear-door heat exchangers or a liquid retrofit, hot-aisle (or chimney) containment is the forward-compatible choice, because it keeps the room cold while the racks run hot — see Chapter 5.3.
The second fork is the plant that actually moves and cools the air, and the choice is fundamentally about how close the cooling sits to the load. The terminology trips people up, so be precise: a CRAC (computer-room air conditioner) has a compressor and refrigerant in the unit — it is a packaged DX air conditioner; a CRAH (computer-room air handler) has only a chilled-water coil and fans, with the refrigeration done remotely by a chiller plant. CRAH is the hyperscale default because chilled water decouples heat rejection from the room and rides the economizer; CRAC survives where there is no chilled-water plant (smaller rooms, edge, some retrofits). As density climbs, perimeter units of either kind lose: the air has too far to travel, recirculation grows, and you spend fan power fighting your own room. In-row coolers sit between the racks and shorten the air path; fan-wall (close-coupled) units replace the wall of perimeter CRAHs with a high-efficiency fan array right at the aisle. The closer the cooling, the higher the density it supports and the lower the fan energy per kilowatt — at the cost of putting water (CRAH/in-row) nearer the IT.
| Plant type | Refrigeration | Proximity to load | Practical rack density | Where it wins | Main cost |
|---|---|---|---|---|---|
| Perimeter CRAC | DX in the unit | Room edge | Up to ~10–15 kW | No chilled-water plant; small/edge rooms | Fan energy + recirculation at density |
| Perimeter CRAH | Chilled water (remote chiller) | Room edge | ~15–25 kW with containment | Hyperscale default; rides economizer | Needs a chilled-water plant + plumbing |
| In-row cooler | Chilled water or DX | Between racks | ~30–50 kW (active, contained) | High-density air pockets; mixed halls | Water in the white space; tile/floor cost |
| Fan-wall / close-coupled | Chilled water | At the aisle wall | ~40–50 kW (contained) | Dense air at the limit; low fan energy | Capex + dedicating wall space to cooling |
Airflow management: the unglamorous half of the win
Containment and plant selection set the ceiling; airflow management decides how much of that ceiling you actually reach. The failure modes are mundane and they compound: bypass (cold air that returns to the cooler without doing work — leaking tiles, gaps under racks, mis-placed perforated tiles in a hot aisle), recirculation (hot exhaust looping back into the inlet over the top or around the ends of a rack), and open U-space (a missing blanking panel that lets the two streams short-circuit straight through the rack). Each one forces you to drop supply temperature to protect the worst inlet, which spends chiller energy and erases the free-cooling hours you were trying to bank. Blanking panels, brush grommets, floor-leakage sealing, and correct tile placement are the cheapest PUE money in the building, and the first thing an audit finds missing.
The metric that captures this is the approach temperature: the gap between what the cooler supplies and what the server inlet actually sees. A well-managed contained aisle holds that gap to a few degrees; a leaky uncontained room can lose 5–10 °C to mixing, which is 5–10 °C you have to claw back at the chiller. The engineering point is that air cooling's ceiling is set not only by the coil but also by how cleanly you can deliver the cold air you already made. The thermal-metric vocabulary (approach, NTU/effectiveness) is established in Chapter 5.1; live-floor instrumentation of inlet temperatures, ΔT, and bypass belongs to operations in Chapter 14.2.
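As a rough illustration of how mixing opens that gap, the sketch below treats the server inlet as an ideal blend of supply air and recirculated exhaust. The function names, the single `recirc_fraction` parameter, and the example temperatures are assumptions made for illustration, not values from a measured hall; the exhaust temperature is also held fixed, which a real rack would not do.

```python
def inlet_temperature(supply_c: float, exhaust_c: float, recirc_fraction: float) -> float:
    """Server-inlet temperature for an ideal mix of supply air and recirculated exhaust.

    recirc_fraction is the share of inlet air that is recirculated exhaust
    (0 = perfect containment, 1 = the rack breathes only its own exhaust).
    Assumes both streams have the same density and specific heat.
    """
    if not 0.0 <= recirc_fraction <= 1.0:
        raise ValueError("recirc_fraction must be between 0 and 1")
    return (1.0 - recirc_fraction) * supply_c + recirc_fraction * exhaust_c


def mixing_penalty(supply_c: float, exhaust_c: float, recirc_fraction: float) -> float:
    """Degrees lost between the cooler's supply and the server inlet."""
    return inlet_temperature(supply_c, exhaust_c, recirc_fraction) - supply_c


# Hypothetical aisle: 22 °C supply, 37 °C exhaust.
for fraction in (0.1, 0.3, 0.5):
    gap = mixing_penalty(22.0, 37.0, fraction)
    print(f"recirculation {fraction:.0%}: inlet gap {gap:.1f} °C")
```

In this simplified model the gap is just the recirculated fraction times the supply-to-exhaust temperature difference, which is why sealing the short-circuit paths pays back directly in supply temperature.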
The ASHRAE envelope as an economic dial
Here is the decision that most operators under-exploit. ASHRAE TC 9.9's thermal guidelines define a recommended band (18–27 °C inlet, all classes) and progressively wider allowable envelopes: A1 15–32 °C, A2 10–35 °C, A3 5–40 °C, A4 5–45 °C. Running warmer supply air is not a compromise you tolerate — it is a lever you pull, because every degree you raise the supply (and therefore the chilled-water and condenser temperatures) buys more hours of free / economizer cooling and lowers compressor energy. A hall designed and operated to A2–A3 in a temperate climate can run hundreds to thousands more economizer hours per year than one pinned to the conservative recommended band, dragging mechanical PUE down materially.
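The dry-bulb limits above can be captured in a small lookup, sketched here in Python. It checks temperature only; the ASHRAE classes also constrain humidity and dew point, which this sketch ignores, and the function name `tightest_envelope` is an assumption for illustration.

```python
from typing import Optional

RECOMMENDED_C = (18.0, 27.0)  # recommended inlet band, all classes

# Allowable dry-bulb inlet ranges in °C, narrowest class first.
ALLOWABLE_C = {
    "A1": (15.0, 32.0),
    "A2": (10.0, 35.0),
    "A3": (5.0, 40.0),
    "A4": (5.0, 45.0),
}


def tightest_envelope(inlet_c: float) -> Optional[str]:
    """Return the narrowest envelope containing inlet_c, or None if outside A4."""
    low, high = RECOMMENDED_C
    if low <= inlet_c <= high:
        return "recommended"
    for name, (low, high) in ALLOWABLE_C.items():
        if low <= inlet_c <= high:
            return name
    return None


for temp in (24.0, 30.0, 34.0, 42.0, 50.0):
    print(f"{temp:.0f} °C inlet -> {tightest_envelope(temp)}")
```

A check like this is only a guard rail for a chosen setpoint; it says nothing about whether running at that temperature is economic.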
The consequence you pay for warmth comes in three currencies. First, server fan power: above roughly 25–27 °C inlet, on-board fans ramp non-linearly, and past a point the rack's own fans eat the chiller savings, so the warm-air optimum is a curve with a minimum, not a monotonic win. Second, reliability margin: warmer silicon means a higher failure rate and less thermal headroom for a cooling excursion, which matters more for synchronous training (a thermal trip that throttles or drops a node restarts the job) than for stateless inference. Third, acoustics and the human floor: ramped fans are loud, and a hot aisle in an A3 hall is a workplace-safety question. The right answer is climate- and workload-specific: a cool-climate inference hall should run as warm as the fan-power curve allows; a tropical training hall has less free-cooling headroom to harvest and tighter reliability stakes. The downstream effect of supply temperature on plant selection and economizer mode lives in Chapter 5.8; how PUE and ITUE are defined and reported is canonical in Chapter 15.1.
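The shape of that trade-off can be shown with a toy model. Every coefficient below is an invented placeholder, not measured data: plant power is assumed to fall linearly as supply air warms, and server fan power is assumed flat up to a 26 °C knee and cubic in fan speed above it. Only the shape of the result is meaningful.

```python
def plant_kw(supply_c: float) -> float:
    """Placeholder cooling-plant power: falls linearly as supply air warms."""
    return max(20.0, 120.0 - 4.0 * (supply_c - 18.0))


def server_fan_kw(supply_c: float, knee_c: float = 26.0) -> float:
    """Placeholder server-fan power: flat below the knee, cubic in fan speed above it."""
    speed_ratio = 1.0 + 0.025 * max(0.0, supply_c - knee_c)
    return 50.0 * speed_ratio ** 3


def total_kw(supply_c: float) -> float:
    """Combined plant and server-fan power at a given supply temperature."""
    return plant_kw(supply_c) + server_fan_kw(supply_c)


candidates = [18.0 + 0.5 * i for i in range(45)]  # 18.0 to 40.0 °C in 0.5 °C steps
best = min(candidates, key=total_kw)
print(f"lowest combined power at {best:.1f} °C supply: {total_kw(best):.1f} kW")
```

With these placeholder numbers the minimum lands a little above the knee rather than at either end of the range. With site-specific plant and fan curves substituted, the same search would locate the setpoint for a real hall.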
Pushing air to 40–50 kW: where physics ends and economics begins
The number that matters for the scoping decision is the honest per-rack ceiling, and it has two values. The often-quoted ~41 kW figure (ASHRAE TC 9.9 / SemiAnalysis) is the practical air ceiling for a conventional contained hall — beyond it, a perimeter-cooled raised-floor room cannot deliver enough cold air to enough inlets without unacceptable recirculation. With aggressive close-coupling — fan-wall or active in-row units right at the aisle, full containment, and disciplined airflow management — vendors and operators credibly run air-cooled racks to ~50 kW, and in narrow cases higher. But the ceiling that decides projects is economic, not physical: somewhere in the 40–50 kW band the fan energy per kilowatt removed climbs steeply enough that the marginal kilowatt of cooling costs more in fan power and floor space than the same kilowatt would cost on a rear-door heat exchanger or a direct-to-chip cold plate. Air does not become impossible at 50 kW; it becomes the wrong answer.
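The steepening can be sketched from two textbook relations: the sensible-heat balance that sets how much air a rack needs, and the fan affinity law, under which power for a fixed fan and air path rises with the cube of flow. The air properties are standard near-sea-level values; the 12 K air temperature rise and the 25 kW reference rack are assumptions chosen for illustration.

```python
AIR_DENSITY_KG_M3 = 1.2      # approximate, near sea level at about 20 °C
AIR_CP_J_PER_KG_K = 1005.0   # specific heat of dry air


def airflow_m3_per_s(heat_kw: float, delta_t_k: float) -> float:
    """Volumetric airflow needed to carry heat_kw at an air temperature rise of delta_t_k."""
    return heat_kw * 1000.0 / (AIR_DENSITY_KG_M3 * AIR_CP_J_PER_KG_K * delta_t_k)


def relative_fan_power_per_kw(heat_kw: float, ref_kw: float) -> float:
    """Fan power per kW removed, relative to a reference rack on the same fans and air path.

    Fan power scales with flow cubed (affinity law) and flow scales with heat at
    a fixed temperature rise, so power per kW scales with the flow ratio squared.
    """
    flow_ratio = heat_kw / ref_kw
    return flow_ratio ** 3 / flow_ratio


for kw in (15, 25, 41, 50):
    flow = airflow_m3_per_s(kw, delta_t_k=12.0)
    penalty = relative_fan_power_per_kw(kw, ref_kw=25.0)
    print(f"{kw:>3} kW: {flow:.2f} m^3/s, fan power per kW x{penalty:.2f} vs 25 kW")
```

Real halls soften the cube by adding fans, widening the air path, or raising the temperature rise, so treat the squared per-kilowatt penalty as the behaviour of a fixed installation, not a prediction. Even so, doubling a rack from 25 kW to 50 kW on the same air path quadruples the fan power spent per kilowatt removed, which is the economic ceiling in miniature.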
This is why the fork at the density wall is not "air or liquid" as a binary but a band: below ~25–30 kW air wins on cost and simplicity in nearly every climate; from ~30–50 kW you are in the close-coupled-air-versus-rear-door contest, where the deciding factors are whether you have facility water at the rack and whether the density is going to keep climbing; above ~50 kW air has lost and the only questions are which liquid path and how much residual air load remains. A GB200 NVL72 at ~132 kW is not a close call — it is ~90 kW past the ceiling, which is why Chapter 5.4 treats direct-to-chip liquid as the 2026 default rather than an option. The mistake is treating the 40–50 kW band as a place to live rather than a place you pass through on a density ramp; a hall designed to sit at 45 kW air is one generation refresh from being stranded.
| Rack density | Air-cooling status | Right answer | Why the fork falls here |
|---|---|---|---|
| ≤ ~15 kW | Comfortable | Perimeter CRAH/CRAC + containment | Legacy density; air is cheapest and simplest |
| ~15–30 kW | Air still wins | Contained air, often in-row assist | Modest-density inference, storage, networking sit here |
| ~30–50 kW | Air at the limit | Close-coupled air OR rear-door HX | Fan-power curve steepens; depends on water availability + density trajectory |
| ~50–100 kW | Air has lost | Rear-door HX / air-assisted liquid | Bridge tech; rear door needs water at the rack, air-assisted liquid does not (→ 5.3) |
| > ~100 kW | Not a question | Direct-to-chip liquid | GB200 NVL72 ~132 kW; ~90 kW past the air ceiling (→ 5.4) |
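For scoping scripts, the table can be encoded as a simple lookup. This is a minimal sketch: the boundaries in the table are approximate ("~"), so the hard cut-offs used here are a simplification, and the function name `classify_rack` is an assumption for illustration.

```python
# (upper bound in kW, air-cooling status, right answer), taken from the table above.
BANDS = [
    (15.0, "Comfortable", "Perimeter CRAH/CRAC + containment"),
    (30.0, "Air still wins", "Contained air, often in-row assist"),
    (50.0, "Air at the limit", "Close-coupled air OR rear-door HX"),
    (100.0, "Air has lost", "Rear-door HX / air-assisted liquid"),
    (float("inf"), "Not a question", "Direct-to-chip liquid"),
]


def classify_rack(rack_kw: float) -> tuple:
    """Return (status, right_answer) for a rack of the given power draw in kW."""
    if rack_kw < 0:
        raise ValueError("rack_kw must be non-negative")
    for upper_kw, status, answer in BANDS:
        if rack_kw <= upper_kw:
            return status, answer
    raise ValueError("rack_kw must be a real number")  # reached only for NaN


for kw in (12, 28, 45, 80, 132):
    status, answer = classify_rack(kw)
    print(f"{kw:>3} kW: {status} -> {answer}")
```

A rack that sits within a few kilowatts of a boundary deserves the judgment described in the text, not the lookup.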
When air still wins outright
The most important air-cooling decision in 2026 is recognizing the loads that should never leave it. Air does not win these by inertia — it wins them on the merits, because the density never crosses the band where liquid pays.
- Networking and the storage tier. Spine and leaf switches, the front-end Ethernet, and the storage fleet draw a fraction of a GPU rack's power and sit comfortably under 15 kW. Plumbing them for liquid is pure cost with no thermal benefit; they are the permanent air load even inside an all-liquid GPU hall. This is most of the "30" in the 70/30 split.
- Modest-density inference. An HGX B200-class inference rack (~35–50 kW, air or liquid) and most CPU-served and edge-model inference live inside the air band. For a latency-first, geo-distributed inference business (many small sites near users), air keeps capex and complexity down and avoids dragging a facility-water loop to every metro. The archetype logic is laid out in Chapter 1.1.
- Edge and micro-sites. Telco/MEC nodes, Tier-2 metro colos, and on-prem appliances are power-, space-, and operations-constrained, run lights-out, and cannot justify a CDU and a secondary loop for a few tens of kilowatts. Sealed/modular air, ambient-limited, is the correct and often the only answer.
- Batch and curtailable workloads on existing air halls. Throughput-bound, interruption-tolerant work has no reason to demand a liquid retrofit; it is the natural tenant for the air-cooled stock you already own.
The recurring mistake is the inverse: designing one hall for both the residual air load and a density ramp it cannot absorb, then discovering at the first GPU refresh that the room has the wrong floor, the wrong plant, and no water. Decide which racks are permanently on air and which are merely not-yet-liquid, and design them as two thermal zones — because they are.
Deep dive: the residual air load inside an all-liquid rack (why air never fully leaves)
Even the densest direct-to-chip rack is a hybrid. A GB200 NVL72 rejects roughly 115 kW to its liquid cold plates and ~17 kW to air — about 13% of the rack's heat that the cold plates do not touch. The air-cooled residue is not an afterthought; it is the GPUs' and CPUs' immediate neighbors: the DIMMs (memory modules sit outside the cold-plate footprint), the NICs and optics (the network adapters and pluggable transceivers run hot and stay on air), the PSUs / power shelves, the NVSwitch and management gear, and the voltage regulators. Every one of those still needs a moving air stream and a path to a coil — which means even an all-liquid GPU hall has a real, designed-in air-cooling subsystem with its own containment, its own fans, and its own share of the cooling plant.
The engineering consequences are concrete. The CDU and facility-water sizing in Chapter 5.4 must be paired with an air-handling design for the residual ~13% load, or the room cooks the optics and DIMMs while the GPUs sit happily on liquid. Containment in a hybrid rack is harder than in a pure-air or pure-liquid one, because you are managing a small but non-trivial hot-air stream around a rack whose dominant heat path is invisible. And acoustics get worse, not better: the air that does move in a liquid rack moves fast through a small cross-section. The lesson for the air chapter is that air cooling is the technology you keep, shrunk to the load the cold plate cannot reach.
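The size of that residual subsystem is simple arithmetic, sketched below with the per-rack figures quoted above (roughly 115 kW to liquid, ~17 kW to air). The 100-rack hall is a hypothetical example, and the function names are assumptions for illustration.

```python
def air_fraction(liquid_kw: float, air_kw: float) -> float:
    """Share of a rack's heat that is rejected to air rather than to the cold plates."""
    return air_kw / (liquid_kw + air_kw)


def hall_residual_air_kw(rack_count: int, air_kw_per_rack: float) -> float:
    """Total air-side heat load the hall's air handlers must still remove."""
    return rack_count * air_kw_per_rack


share = air_fraction(liquid_kw=115.0, air_kw=17.0)
print(f"air share per rack: {share:.1%}")

# Hypothetical hall of 100 direct-to-chip racks.
print(f"residual air load: {hall_residual_air_kw(100, 17.0):.0f} kW")
```

At that hypothetical scale the residual is 1.7 MW of air load, which is the output of an entire legacy air hall sitting inside a nominally all-liquid room.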
Deep dive: why CRAH beat CRAC for scale, and where DX still belongs
The CRAC-vs-CRAH choice looks like a units question and is actually a heat-rejection-topology question. A CRAC carries its own compressor and condenser circuit, so each unit is an independent refrigeration machine; that is wonderful for a small room with no central plant and terrible at scale, because you now have dozens of compressors to maintain, no shared economizer, and refrigerant management spread across the floor. A CRAH is just a coil and a fan: it hands its heat to a central chilled-water plant, which can run a water-side or air-side economizer and reject heat once, efficiently, for the whole campus. That single shared rejection path — and the ability to push chilled-water temperature up to chase free cooling — is why every hyperscale air hall is CRAH-and-chiller, and it is the on-ramp to the warm-water loops that liquid cooling then inherits (→ Chapter 5.8).
DX-based CRAC has not disappeared; it has retreated to where its independence is an asset rather than a liability: edge and micro-sites with no room for a chiller plant, small enterprise rooms, and brownfield retrofits where adding chilled-water infrastructure is uneconomic. There is also a pumped-refrigerant middle ground (and CO₂/economized DX variants) for sites that want compressor-free hours without a water loop. The decision rule: if the campus has — or will have — a central chilled-water plant (and any liquid-cooled hall implies one), standardize on CRAH and let the residual air load ride the same plant the cold plates use; reach for CRAC only where the central plant genuinely does not exist and never will.
Anti-patterns
Three air-side mis-scopes recur, each from reasoning about the room instead of the load:
- Designing to live in the 40–50 kW band. Building a close-coupled, fan-wall hall tuned to sit at ~45 kW air looks clever and is one GPU generation from stranded — the next refresh wants liquid, and the room has the wrong floor and no water. Pass through the band; do not homestead in it.
- One hall for the permanent-air and not-yet-liquid loads. Mixing the storage/networking residual (forever on air) with GPU racks on a density ramp in a single thermal zone guarantees that the zone is wrong for one of them. Design two zones.
- Chasing the widest ASHRAE envelope as a setpoint. Running A3/A4 to bank free-cooling hours backfires past the fan-power knee: server fan power eats the savings, and reliability margin and acoustics degrade with it. A4 is a survivability envelope for a cooling excursion, not a steady-state target.