The Definitive Guide to AI Data Centers

Chapter 15.8

Grid Impact, Energy-Systems Integration & Grid Services

A gigawatt of AI load can be treated as an unconditional firm draw that waits years in the interconnection queue, or as a flexible, dispatchable resource that the grid will energize in months and pay you to operate — and which posture you choose is an engineering decision made at scoping time, not a contract negotiated after the slab is poured.

POWER-BOUND · GOODPUT

What you'll decide here

  1. Whether you interconnect as an unconditional firm load (maximum reliability, maximum queue wait, maximum stranded-megawatt risk) or as a flexible/curtailable load that trades a fraction of uptime for years of speed-to-power — the single fork that decides whether your campus energizes on the demand-growth timeline or the generation-build timeline.
  2. Whether your behind-the-meter or co-located generation is a grid-bypass strategy (islanded, no export, minimal grid interaction) or a grid-integration strategy (synchronized, export-capable, providing services back) — a choice the December 2025 FERC PJM order just re-priced for every operator in the largest US market.
  3. How much flexibility you can credibly commit — magnitude, duration, frequency, notice — because that envelope, not nameplate megawatts, is what utilities now study, price, and queue you against (EPRI Flex MOSAIC).
  4. Which flexibility lever you actually pull when called: shed compute (lose goodput), shift compute (move batch in time/space), ride through on UPS/BESS, or fail over to on-site generation — each with a different cost-per-curtailment-hour and a different blast radius on the workload.
  5. Whether grid services are a financeable revenue stream you underwrite into the pro-forma, or a siting accelerant you give away to jump the queue — because the same flexibility cannot always be sold twice, and the value of speed usually dwarfs the value of the ancillary-service check.

The data center industry's relationship with the grid used to be a single transaction: buy firm power, pay the bill, never think about it again. The facility was a price-taking, always-on, inflexible block of demand, and the grid was sized to serve it. That arrangement broke down around 2024, when the size of a single AI campus crossed the threshold where one customer's interconnection request could move a balancing authority's entire load forecast. A 1 GW campus is no longer a customer the grid serves; it is a system-scale event the grid has to plan around. And once you are large enough to move the system, the system starts asking what you can do for it.

This chapter is the canonical home for flexibility — the property that converts an AI load from a passive liability on the interconnection queue into an active, dispatchable, sometimes revenue-generating grid resource. We trace four forks: the interconnection posture (firm vs flexible), the co-location / behind-the-meter question (bypass vs integration), the grid-services stack (what you can sell, and to whom), and the social-license / carbon dimension that increasingly gates whether a project is permitted at all. The through-line is the flexibility-unlocks-headroom thesis: the cheapest gigawatt is the one already on the grid, reachable not by building new generation but by agreeing to get out of the way for a handful of hours a year. The mechanics of how a facility rides through and supports the point of interconnection live in Chapter 4.10; the queue-and-speed-to-power story lives in Chapter 3.2; the macro load-growth narrative lives in Chapter 16.1. This chapter is where they meet as a strategy.

The master fork: firm load vs flexible load

Every other decision in this chapter descends from one choice made the day you file an interconnection request: do you ask the grid to serve you unconditionally, or do you offer to limit your draw under defined conditions? The unconditional path is the legacy default — firm load, served to the same reliability standard as a hospital, planned against the system peak. Its consequence is the interconnection wall: in the densest US hubs, large-load energization waits four to seven years and sometimes longer, because firm load forces the utility to plan new generation and transmission before it can say yes. You are queued behind the speed of steel and copper.

The flexible path inverts the logic. If you agree to curtail (to shed or shift load for a bounded number of hours when the system is tight), the utility no longer has to plan generation for your peak. It can interconnect you against the headroom that already exists between average and peak system conditions. The empirical size of that headroom is the single most important number in this chapter: a Duke University Nicholas Institute study found the existing US grid could absorb ~98 GW of new load at an average curtailment of just 0.5% of hours, or about 44 hours per year, rising to ~126 GW at 1.0% (Duke Nicholas Institute, 2025). That is roughly the entire announced AI build-out, available with no new power plants, in exchange for occasionally turning down. The headroom is not evenly distributed: PJM alone holds ~18 GW of it at 0.5%, MISO ~15 GW, ERCOT and SPP ~10 GW each, and Southern Company ~8 GW.

The consequence of choosing flexible is a trade you must be able to price: you exchange a small, bounded loss of uptime for a potentially multi-year gain in speed-to-power. At ~$10–12B of revenue per GW of AI capacity per year (SemiAnalysis, 2025), energizing a campus even six months early is worth billions — a sum that dwarfs the cost of the curtailment itself, provided the workload mix can actually absorb a few dozen interruption-hours a year without breaching its SLAs. That proviso is a workload question that this chapter forces back to scoping. → archetype-level interruption tolerance in Chapter 1.1.
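The trade can be put in rough numbers. The sketch below is a back-of-envelope calculation, not a pro-forma. It assumes revenue accrues uniformly across the year and that every curtailed hour forfeits full revenue, which is a worst case because shifting or ride-through would preserve some of it. The inputs are the chapter's own figures ($10–12B per GW-year, itself a single-source estimate, and 0.5% curtailment); the function names are illustrative.

```python
HOURS_PER_YEAR = 8760


def curtailment_hours(curtailment_fraction: float) -> float:
    """Hours per year implied by a curtailment rate given as a fraction of hours."""
    return curtailment_fraction * HOURS_PER_YEAR


def early_energization_value(revenue_per_gw_year: float, gw: float,
                             months_early: float) -> float:
    """Revenue pulled forward by energizing sooner (uniform accrual assumed)."""
    return revenue_per_gw_year * gw * months_early / 12.0


def worst_case_curtailment_cost(revenue_per_gw_year: float, gw: float,
                                curtailment_fraction: float) -> float:
    """Annual revenue forfeited if every curtailed hour is shed at full load."""
    return revenue_per_gw_year * gw * curtailment_fraction


if __name__ == "__main__":
    for rev in (10e9, 12e9):
        gain = early_energization_value(rev, gw=1.0, months_early=6)
        cost = worst_case_curtailment_cost(rev, gw=1.0, curtailment_fraction=0.005)
        print(f"${rev / 1e9:.0f}B per GW-year: "
              f"{curtailment_hours(0.005):.1f} h/yr curtailed, "
              f"six months early = ${gain / 1e9:.1f}B, "
              f"worst-case curtailment = ${cost / 1e6:.0f}M/yr")
```

Under these assumptions, six months of earlier energization on a 1 GW campus is worth roughly $5–6B, against a worst-case curtailment cost of roughly $50–60M per year: about two orders of magnitude apart.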

What flexibility actually means: the four levers

"Flexible load" is not one thing. When the grid calls, a data center has four distinct levers, and they differ enormously in cost, speed, and blast radius. Choosing which lever answers a given call is the operational core of grid-interactive operation — and getting the assignment wrong either breaks an SLA or leaves money and headroom on the table.

Shed (curtail compute). Pause or kill interruptible work — batch inference, evaluation sweeps, fine-tuning, internal jobs — and let the power draw fall. This is the purest form of demand response and the cheapest per hour if the displaced work is genuinely deferrable. Its cost is lost goodput: every shed hour is compute you did not sell. Google's fleet is the canonical proof point — it has signed ~1 GW of demand-response capacity across utilities (Indiana Michigan Power, TVA, Entergy Arkansas, Minnesota Power, DTE) by committing to shift or pause machine-learning workloads when grids are stressed (Google, 2025).

Shift (move compute in time or space). Rather than destroying the work, relocate it — defer batch jobs to off-peak hours, or migrate flexible workloads to a sister site in a region that is not constrained. This preserves goodput entirely; the only cost is latency-to-completion and the inter-site bandwidth to move state. It is the most valuable lever and the hardest to engineer, because it requires a scheduler that is power-aware and carbon-aware across a fleet. → power-aware orchestration in Chapter 16.3.

Ride through (UPS / BESS discharge). Hold compute draw constant and supply the curtailed grid energy from on-site storage — the UPS battery or a dedicated grid-scale BESS — for the duration of the event. The workload never notices. The cost is the storage capex and round-trip losses, and the constraint is duration: a ride-through UPS sized for seconds-to-minutes of outage cannot cover a multi-hour demand-response window without being grossly over-sized or paired with a true BESS. This is where the UPS stops being a reliability device and becomes a grid asset. → Chapter 4.5.

Fail over (on-site generation). Hold compute constant and replace grid energy with behind-the-meter generation — gas turbines, reciprocating engines, fuel cells, or in the most ambitious deals, dedicated nuclear. This converts a grid-curtailment event into a generation-dispatch event, at the cost of fuel, emissions, and the capex of a power plant you mostly do not run. It is the lever that blurs into the co-location question below.

The flexibility levers: cost, speed, and what breaks
| Lever | What moves | Response speed | Duration it covers | Cost per curtailment-hour | Goodput impact |
|---|---|---|---|---|---|
| Shed compute | Interruptible jobs pause/die | Seconds (scheduler-driven) | Hours to days | Lost revenue on displaced work | Direct loss: work is destroyed |
| Shift compute | Batch moves in time or to another site | Seconds–minutes | Hours (deferral) to indefinite (geo-shift) | Inter-site bandwidth + latency-to-completion | Preserved: work still runs |
| Ride through (UPS/BESS) | Energy source, not the load | Milliseconds (transfer) | Seconds–minutes (UPS); hours (true BESS) | Storage capex + round-trip loss (~10–15%) | None: workload unaffected |
| Fail over (on-site gen) | Energy source, not the load | Seconds–minutes (spin-up) | Hours to indefinite (fuel-limited) | Fuel + O&M + emissions + plant capex | None: workload unaffected |
Per-event cost and notice are practitioner ranges; the right lever for a given call depends on the workload mix and the depth/duration of the curtailment requested. Goodput = useful compute delivered; see Chapter 12.2.

The four levers form a stack: a well-engineered facility uses the cheapest lever that satisfies the call and reserves the expensive ones for deep, rare events. A shallow, frequent curtailment is answered by shedding batch; a deep, multi-hour event is answered by BESS ride-through or on-site fail-over so the revenue-bearing inference fleet never blinks. The mistake operators make is committing a single lever to the utility — "we will shed 100 MW" — when the value comes from committing an envelope the facility can satisfy with whichever lever is cheapest at the moment of the call.
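The "cheapest lever that satisfies the call" rule can be sketched as a small selection routine. This is a minimal illustration under stated assumptions, not a dispatch system: the `Lever` and `GridCall` types, the `cheapest_feasible` function, and every number in `LEVERS` are hypothetical placeholders, and the sketch assumes a single lever answers the whole call, whereas a real facility would blend levers.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Lever:
    name: str
    max_mw: float        # deepest curtailment the lever can cover
    max_hours: float     # longest event it can sustain
    response_s: float    # seconds needed before it takes effect
    cost_per_mwh: float  # hypothetical all-in cost of using it


@dataclass(frozen=True)
class GridCall:
    mw: float        # depth requested
    hours: float     # duration requested
    notice_s: float  # seconds of notice given


# Placeholder numbers for illustration only; real values are site-specific.
LEVERS = (
    Lever("shift batch", max_mw=40, max_hours=8, response_s=120, cost_per_mwh=40),
    Lever("shed batch", max_mw=60, max_hours=48, response_s=30, cost_per_mwh=150),
    Lever("BESS ride-through", max_mw=100, max_hours=4, response_s=1, cost_per_mwh=250),
    Lever("on-site generation", max_mw=200, max_hours=72, response_s=600, cost_per_mwh=400),
)


def cheapest_feasible(levers: Sequence[Lever], call: GridCall) -> Optional[Lever]:
    """Return the lowest-cost lever that meets depth, duration and notice, or None."""
    feasible = [
        lever for lever in levers
        if lever.max_mw >= call.mw
        and lever.max_hours >= call.hours
        and lever.response_s <= call.notice_s
    ]
    return min(feasible, key=lambda lever: lever.cost_per_mwh, default=None)


if __name__ == "__main__":
    calls = {
        "shallow": GridCall(mw=50, hours=2, notice_s=600),
        "deep, short notice": GridCall(mw=90, hours=3, notice_s=60),
        "beyond the envelope": GridCall(mw=150, hours=6, notice_s=60),
    }
    for label, call in calls.items():
        lever = cheapest_feasible(LEVERS, call)
        print(label, "->", lever.name if lever else "no single lever satisfies the call")
```

With these placeholder values the shallow call is answered by shedding batch and the deep, short-notice call by BESS ride-through. The third call returns nothing, which is the case worth testing for at scoping time: a call the facility cannot answer means the committed envelope was larger than the levers behind it.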

Co-location and behind-the-meter: bypass vs integration

The second fork is about generation. When the grid cannot energize you fast enough, you bring your own power — but the way you connect that power to the grid is itself a strategic choice with very different consequences. Behind-the-meter (BTM) bypass islands the generation: a gas plant or fuel-cell farm feeds the data center directly, the campus draws little or nothing from the grid, and there is no export and minimal grid interaction. This is the speed-to-power play — announced BTM gas capacity reached ~82–101 GW cumulatively by 2026 (Cleanview / SemiAnalysis), almost all of it islanded — and its appeal is that it sidesteps the interconnection queue entirely. Its cost is that you have built and must operate a merchant power plant, with the emissions, fuel exposure, and O&M that implies, and you forgo the reliability backstop of a grid connection.

Co-located integration is the opposite posture: the generation and the load both connect to the grid, synchronized and metered, so the campus can draw from the grid when that is cheaper, export when generation exceeds load, and provide services in both directions. This is harder to permit and slower to stand up, but it turns the on-site plant into a grid asset rather than a grid bypass — and it preserves the option to sell capacity and energy back. The choice between bypass and integration was, until recently, left to bilateral negotiation and a patchwork of utility tariffs. In December 2025 that changed for the largest US market.

Co-location / BTM posture → consequences
| Posture | Grid interaction | Speed-to-power | Grid-services revenue | Key risk |
|---|---|---|---|---|
| BTM bypass (islanded) | None / minimal; no export | Fastest: skips the queue | None (you are invisible to the market) | Stranded plant; emissions; no grid backstop |
| Co-located, non-firm | Synchronized; limits withdrawals on demand | Fast: flexibility unlocks headroom | Demand response + curtailment credits | Curtailment events hit goodput; SLA exposure |
| Co-located, firm + export | Full two-way; sells energy & capacity | Slowest: full interconnection study | Energy + capacity + ancillary stacking | Highest capex; merchant power exposure |
Posture is a spectrum, not three discrete bins; most large 2026 deals are hybrids. FERC PJM order (Dec 2025) reshapes the integration column for the largest US market.

The grid-services stack: what you can actually sell

Once a facility is flexible and grid-synchronized, it can monetize that flexibility through a layered stack of grid products. The stack matters because the products differ in value, in the response speed they demand, and crucially in whether they can be combined ("stacked") on the same megawatt or whether selling one forecloses another. The naive view is that grid services are a meaningful new revenue line for an AI campus; the disciplined view is that for most operators they are a siting accelerant first and a revenue stream a distant second — the value is in the headroom they unlock, not the ancillary-service check they cash.

Demand response and interruptible-load credits sit at the base: you are paid (or given a tariff discount) for committing to curtail when called. This is the most accessible product and the one most directly tied to the speed-to-power benefit. Capacity payments reward you for being available to curtail during the system's tightest hours; in PJM's capacity market, a load that can reliably shed during the annual peak earns a capacity credit much as a power plant does. Ancillary services (frequency regulation, spinning/non-spinning reserves, voltage support) are the fastest and most technically demanding products. Frequency regulation in particular means following the operator's signal within seconds, while contingency reserves typically allow minutes; the fast end of that range is most naturally provided by a BESS or by grid-interactive UPS, not by shedding compute. The mechanics of providing frequency response and reactive/voltage support toward the point of interconnection are engineered in Chapter 4.10; this chapter is where you decide whether to provide them.

The stacking question is where most pro-formas go wrong. The same 100 MW of UPS/BESS cannot simultaneously be held in reserve for facility ride-through, committed to a capacity-market curtailment, and bid into the frequency-regulation market at full depth — the commitments overlap and the market operator will not let you sell the same megawatt-hour twice. Real flexibility revenue comes from carefully partitioning the resource: a slice for reliability, a slice for capacity, a slice for fast ancillary services, sized so that a worst-case simultaneous call is still survivable. Over-promise the stack and a single coincident event forces you to default on a grid commitment, which is far more expensive than the revenue was ever worth.
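The survivability test described above reduces to a coincidence check: add up every commitment as if all were called at once and compare the total against the asset's power and energy ratings. The sketch below is one hedged way to express that; the `Commitment` type, the function names, the 100 MW / 200 MWh asset, and the slice sizes are all assumptions chosen for illustration, and real market rules on co-optimization are more nuanced than a simple sum.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Commitment:
    name: str
    mw: float     # power promised to this use
    hours: float  # worst-case duration of a call on this slice


def worst_case_coincident(commitments: Sequence[Commitment]) -> Tuple[float, float]:
    """Return (MW, MWh) needed if every commitment is called at the same time."""
    mw = sum(c.mw for c in commitments)
    mwh = sum(c.mw * c.hours for c in commitments)
    return mw, mwh


def stack_is_survivable(commitments: Sequence[Commitment],
                        power_mw: float, energy_mwh: float) -> bool:
    """True if a simultaneous call on every slice fits within the asset's ratings."""
    mw, mwh = worst_case_coincident(commitments)
    return mw <= power_mw and mwh <= energy_mwh


# Hypothetical stacks on a 100 MW / 200 MWh storage asset.
OVER_PROMISED = (
    Commitment("ride-through reserve", mw=100.0, hours=0.25),
    Commitment("capacity curtailment", mw=100.0, hours=4.0),
    Commitment("frequency regulation", mw=100.0, hours=1.0),
)
PARTITIONED = (
    Commitment("ride-through reserve", mw=40.0, hours=0.25),
    Commitment("capacity curtailment", mw=40.0, hours=4.0),
    Commitment("frequency regulation", mw=20.0, hours=1.0),
)

if __name__ == "__main__":
    for label, stack in (("over-promised", OVER_PROMISED), ("partitioned", PARTITIONED)):
        mw, mwh = worst_case_coincident(stack)
        ok = stack_is_survivable(stack, power_mw=100.0, energy_mwh=200.0)
        print(f"{label}: {mw:.0f} MW / {mwh:.0f} MWh coincident, survivable={ok}")
```

The over-promised stack needs 300 MW and 525 MWh in a coincident event, three times the asset's power rating. The partitioned stack needs exactly 100 MW and 190 MWh, so it fits, at the price of selling each product at a fraction of full depth.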

Deep dive: UPS and BESS as a grid resource — the device that pays for itself twice (if you let it)

Every large data center already owns a fleet of batteries sized to ride through the gap between a grid outage and generator start. Historically those batteries sat at full charge doing nothing 99.9% of the time — pure insurance. The grid-interactive thesis is that this idle asset can earn its keep by participating in the market when it is not needed for ride-through. EPRI's DCFlex initiative — a consortium of 65+ utilities, system operators, hyperscalers, and vendors, with five-to-ten flexibility "hubs" demonstrating from H1 2025 through 2027 — explicitly names power assets (UPS/BESS), compute assets, and balance-of-plant as the three sources of data-center flexibility, and treats the conversion of backup power into a grid resource as a primary objective.

The engineering catch is that a UPS optimized for ride-through is the wrong device for grid service. Ride-through wants high power for seconds-to-minutes; grid services (capacity, energy arbitrage, multi-hour demand response) want high energy for hours. Lithium-ion UPS strings can do short bursts of fast frequency regulation without compromising their reliability mission, but covering a four-hour demand-response window means installing a true grid-scale BESS alongside the UPS — a separate, larger, longer-duration asset. The decision is whether to (a) lightly monetize the existing UPS with fast ancillary services that do not deplete its ride-through reserve, or (b) invest in a dedicated BESS that can both deepen ride-through and play the multi-hour markets. Most 2026 operators are doing (a) as a no-regret move and piloting (b) at flexibility hubs. The reliability discipline is non-negotiable: any grid commitment must be subordinate to the ride-through reserve, with a hard partition the market dispatch cannot cross. → storage engineering in Chapter 4.5; grid-interactive control in Chapter 4.10.
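A quick energy calculation shows why the two missions call for different devices, and what a hard partition looks like in its simplest form. This is a sketch under illustrative assumptions: the 100 MW load and 5-minute ride-through window are placeholders rather than figures from any specific facility, and `marketable_energy_mwh` is a hypothetical helper, not a vendor API.

```python
def energy_mwh(power_mw: float, minutes: float) -> float:
    """Energy needed to hold a given power for a given number of minutes."""
    return power_mw * minutes / 60.0


def marketable_energy_mwh(stored_mwh: float, ride_through_reserve_mwh: float) -> float:
    """Energy the market dispatch may use; it can never dip into the reserve."""
    return max(0.0, stored_mwh - ride_through_reserve_mwh)


if __name__ == "__main__":
    load_mw = 100.0                                    # assumed critical load
    ride_through = energy_mwh(load_mw, minutes=5)      # bridge to generator start
    dr_window = energy_mwh(load_mw, minutes=4 * 60)    # four-hour demand-response event
    print(f"ride-through: {ride_through:.1f} MWh")
    print(f"4 h demand response: {dr_window:.0f} MWh "
          f"({dr_window / ride_through:.0f}x the ride-through energy)")

    # Hard partition: a UPS string holding 10 MWh with the reserve above
    # leaves only the surplus available to the market.
    print(f"marketable: {marketable_energy_mwh(10.0, ride_through):.2f} MWh")
```

At the same 100 MW, the four-hour window needs 400 MWh against about 8.3 MWh for five minutes of ride-through, a factor of 48. That gap is why option (a) is limited to short, shallow services and option (b) requires a separate asset.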

~98 GW: new load the existing US grid can absorb at 0.5% annual curtailment (~44 hr/yr); ~126 GW at 1.0% (Duke Nicholas Institute, 'Rethinking Load Growth', 2025)
18 / 15 / 10 GW: curtailment-enabled headroom at 0.5% in PJM / MISO / ERCOT; SPP also ~10 GW, Southern ~8 GW (Duke Nicholas Institute, via Utility Dive, 2025)
~1 GW: data-center demand-response capacity Google has contracted across US utilities through workload shift/shed (Google, 'Infrastructure & Cloud' blog, 2025)
Dec 18, 2025: FERC PJM co-location order, with 3 new transmission services and a BTM transition to Dec 18, 2028 (FERC; Baker Botts / Utility Dive synthesis, 2025)
~82–101 GW: behind-the-meter gas announced cumulatively, mostly islanded bypass, with ~7 GW under construction (Cleanview / SemiAnalysis, 2026)
65+: utilities, operators, hyperscalers & vendors in EPRI DCFlex; 5–10 flexibility hubs, demos H1 2025 through 2027 (EPRI DCFlex / Flex MOSAIC, 2025)
~$10–12B: revenue per GW of AI capacity per year, the prize that makes speed-to-power dominate ancillary revenue; contested, single-source (SemiAnalysis, onsite gas economics, 2025)
4–7+ yr: large-load interconnection wait for firm load in dense US hubs, the cost flexibility avoids (ERCOT / PJM filings synthesis, 2025)

Powering strategy and its grid/carbon footprint

The flexibility posture and the powering strategy are entangled with carbon, and the entanglement cuts both ways. A BTM-gas bypass that energizes a gigawatt two years early also adds a gigawatt of fossil generation that may run far more than the operator's clean-energy commitments imply — the speed-to-power that looks brilliant on the schedule can be a liability on the carbon ledger and in the permit hearing. Conversely, a grid-integrated flexible load is one of the most powerful tools for decarbonizing: a campus that can shift batch compute to hours of high renewable output, or curtail during fossil-heavy peaks, directly improves the carbon intensity of the energy it consumes. Flexibility is simultaneously a grid-access strategy and a carbon strategy, and the best designs exploit both. The 24/7 carbon-free-energy framing and the procurement instruments that make this real live in Chapter 15.3.
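The carbon side of the shift lever can be illustrated with a toy scheduler that places a deferrable job in the cleanest forecast hours. This is a minimal sketch, assuming the job is preemptible and can run in non-contiguous hours; the forecast values in `FORECAST` are invented for illustration, and the function names are hypothetical rather than part of any real orchestration API.

```python
from typing import List, Sequence


def pick_low_carbon_hours(intensity: Sequence[float], hours_needed: int) -> List[int]:
    """Return indices of the lowest-intensity hours, in chronological order."""
    if not 0 < hours_needed <= len(intensity):
        raise ValueError("hours_needed must be between 1 and the forecast length")
    # Rank hours by intensity; ties go to the earlier hour.
    ranked = sorted(range(len(intensity)), key=lambda h: (intensity[h], h))
    return sorted(ranked[:hours_needed])


def average_intensity(intensity: Sequence[float], hours: Sequence[int]) -> float:
    """Mean grid carbon intensity over the chosen hours."""
    return sum(intensity[h] for h in hours) / len(hours)


# Invented 8-hour forecast of grid carbon intensity in gCO2/kWh.
FORECAST = [520, 480, 300, 180, 150, 210, 430, 560]

if __name__ == "__main__":
    run_now = [0, 1, 2]
    shifted = pick_low_carbon_hours(FORECAST, hours_needed=3)
    print("run immediately:", run_now, f"{average_intensity(FORECAST, run_now):.0f} g/kWh")
    print("shifted:        ", shifted, f"{average_intensity(FORECAST, shifted):.0f} g/kWh")
```

On this invented forecast, deferring a three-hour job from hours 0–2 to hours 3–5 cuts the average intensity of the energy it consumes from about 433 to 180 gCO2/kWh. A production scheduler would also have to respect deadlines, contiguity, and the grid operator's own curtailment calls.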

The carbon and grid footprints also feed directly into the disclosure obligations that an operator now faces — emissions reporting, grid-impact assessments, and the load-flexibility commitments themselves are increasingly things you must report, not just things you do. → reporting frameworks in Chapter 15.7.

Community relations and social license

The final dimension is the one engineers most often underestimate and that most often kills a project: social license. A gigawatt campus is not just a grid event; it is a community event. It raises questions about who pays for the transmission upgrades (ratepayers or the data center), whether residential electricity bills rise to subsidize a hyperscaler's load, how much water the cooling plant draws from a stressed aquifer, and whether the jobs and tax base justify the strain. In 2025–2026 these questions hardened into organized opposition and into law: zoning reforms ending by-right data-center development, state large-load statutes mandating curtailability and cost-causation, and ballot-box fights over new substations.

Flexibility is, perhaps surprisingly, one of the strongest social-license arguments an operator has. A flexible load that explicitly agrees to curtail during system peaks is one that does not force new generation onto the rate base, does not raise residential bills to serve its peak, and can credibly claim to be a grid asset rather than a grid burden. "We will get out of the way ~44 hours a year so the lights stay on and your bill stays flat" is a far better story at a public hearing than "we need a gigawatt of new firm power and someone has to build it." The decision to be flexible, made for speed-to-power reasons, pays a second dividend in the permit hearing — which is why the most sophisticated operators are leading with flexibility commitments in their community engagement, not hiding them in the tariff. The cost-causation and ratepayer-equity mechanics that underlie these fights are detailed in Chapter 3.2.

Deep dive: the flexibility-unlocks-headroom thesis, and where it breaks

The headline of this chapter — ~100 GW of interconnection headroom available at ~0.5% curtailment — is genuinely transformative, but it carries three caveats that separate the thesis from a guaranteed outcome. First, the headroom is locational: the Duke study's 98 GW is concentrated in specific balancing authorities (PJM ~18 GW, MISO ~15 GW, ERCOT/SPP ~10 GW each), and a campus sited in a pocket with no local headroom gets none of it regardless of how flexible it is willing to be. Flexibility unlocks headroom only where headroom exists. Second, the result assumes the curtailment is actually deliverable — that the load can be turned down on the notice the system needs, to the depth it needs, for the duration it needs. A facility that promised 0.5% curtailability but whose workload mix is 90% SLA-bound real-time inference cannot deliver, and a non-deliverable commitment is worse than none. Third, the headroom is a shared, depleting resource: as more loads claim it, the easy curtailment-enabled headroom fills, the curtailment rate required to fit the next load rises, and the early movers capture a one-time advantage the late movers cannot.
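The second caveat, deliverability, lends itself to a simple pre-commitment check: compare the depth promised to the utility against what the workload mix and any firming assets can actually release. The sketch below assumes curtailment depth is measured in MW at the point of interconnection and that SLA-bound load cannot be shed at all. The 1 GW campus, the 250 MW commitment, and the 200 MW of firming are hypothetical numbers; only the 90% SLA-bound share comes from the example in the text.

```python
def sheddable_mw(campus_mw: float, sla_bound_share: float) -> float:
    """Load that can be turned down without touching SLA-bound work."""
    if not 0.0 <= sla_bound_share <= 1.0:
        raise ValueError("sla_bound_share must be between 0 and 1")
    return campus_mw * (1.0 - sla_bound_share)


def commitment_is_deliverable(campus_mw: float, sla_bound_share: float,
                              committed_mw: float, firming_mw: float = 0.0) -> bool:
    """True if deferrable load plus storage/on-site generation covers the commitment."""
    return sheddable_mw(campus_mw, sla_bound_share) + firming_mw >= committed_mw


if __name__ == "__main__":
    campus_mw = 1000.0    # hypothetical 1 GW campus
    committed_mw = 250.0  # hypothetical curtailment depth promised to the utility

    print("sheddable:", round(sheddable_mw(campus_mw, 0.90)), "MW")
    print("deliverable by shedding alone:",
          commitment_is_deliverable(campus_mw, 0.90, committed_mw))
    print("deliverable with 200 MW BESS or on-site generation:",
          commitment_is_deliverable(campus_mw, 0.90, committed_mw, firming_mw=200.0))
```

With 90% of the load SLA-bound, only about 100 MW can be shed, so a 250 MW commitment fails unless ride-through or fail-over capacity makes up the difference. That is the check to run before the commitment goes into the interconnection filing, not after.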

The engineering implication is that the flexibility thesis rewards operators who do three things at scoping time: (1) site into headroom, treating curtailment-enabled headroom as a first-class siting criterion alongside cost and latency; (2) design a workload mix with genuine deferrable slack — enough batch, training, and internal compute that a few dozen curtailment-hours a year cost goodput rather than SLA breaches; and (3) move early, because the headroom is a window that the build-out is racing to close. The grid-interactive engineering that makes a curtailment commitment physically deliverable — ramp rates, ride-through, telemetry to the system operator — is the subject of Chapter 4.10; the financing-downside framing of betting a project on curtailment terms that may tighten is in Chapter 1.8 and Chapter 3.2.

This chapter is the canonical home for flexibility, but it is one node in a web. The interconnection queue and speed-to-power mechanics it accelerates are in Chapter 3.2; the reordered siting hierarchy that should now weight curtailment-enabled headroom is in Chapter 3.1. The physical engineering that makes a flexibility commitment deliverable — ride-through, frequency response, reactive/voltage support toward the POI — is in Chapter 4.10, with the UPS/BESS that backs it in Chapter 4.5 and the LV distribution in Chapter 4.6. The goodput-vs-availability lens that prices a curtailment-hour is in Chapter 12.2; power-aware orchestration that executes the shift lever is in Chapter 16.3. The carbon and 24/7-CFE consequences of the powering strategy are in Chapter 15.3, the disclosure obligations in Chapter 15.7, and the macro power-bound narrative this all serves in Chapter 16.1. The financing downside of underwriting a project on flexibility terms is in Chapter 1.8.