The Definitive Guide to AI Data Centers

Chapter 4.2

Utility Interconnect, On-Site Substation & MV Distribution

The on-site substation and MV distribution architecture is where the grid's megawatts become your campus's megawatts — and the switchgear topology, switchgear gas, and protection scheme you freeze here decide whether a single fault drops a pod, a hall, or the whole gigawatt factory.

POWER-BOUND, DENSITY-RAMP

What you'll decide here

  1. The MV distribution architecture — radial vs primary/secondary-selective vs ring — which sets your single-fault blast radius, your concurrent-maintainability story, and your copper-and-switchgear bill against a load that does not tolerate the same outage twice.
  2. GIS vs AIS switchgear, and the now-forced SF6-vs-SF6-free gas decision — a footprint, lead-time, and regulatory-compliance fork that the 2026 EU F-Gas ban has made irreversible for new ≤24 kV equipment in Europe.
  3. The standards lineage — IEC vs ANSI switchgear and protection, IEC 61850 station bus vs hardwired — plus the regional grid code (ENTSO-E RfG, GB Grid Code, NERC) you must ride through, because the wrong lineage re-engineers the relay panel and the FAT.
  4. Pod/block transformer granularity and MVA sizing — the unit of capacity you energize, fault, and maintain — and whether you provision the MV substrate for a density ramp you have not yet committed to.
  5. The protection & coordination study as a hard deliverable — short-circuit duty, IEEE 242 TCC coordination, IEEE 1584/NFPA 70E arc-flash, and the inverter-dominated fault-current problem that breaks conventional overcurrent grading when you island on BTM generation.

Chapter 4.1 ended at the point of interconnection: you have a queue slot, a voltage class, and a contracted megawatt number. This chapter is what happens to those electrons inside the fence — from the high-voltage yard, down through the on-site step-down transformers, into the medium-voltage switchgear lineup, and out across the campus to the pod and block transformers that feed each hall. It is the most physical, most code-bound, and most quietly consequential layer of the power chain. The white space gets the attention; the substation gets the schedule risk, the single-points-of-failure, and the protection study that no lender closes without.

The consequences in this layer are unusually unforgiving because they are geometric and regulatory at once. A radial bus saves you a switchgear section and a year of lead time, and costs you the whole downstream hall on a single breaker failure. An SF6 gas-insulated lineup fits a gigawatt yard into a third of the footprint and, in the EU from 1 January 2026, can no longer be installed new at ratings up to 24 kV. A relay scheme tuned for stiff utility fault current trips correctly on the grid and refuses to trip when you island on inverter-interfaced generation that cannot deliver more than ~1.2x rated current. Each of these is a fork you walk through once, at design, and live with for the 20-to-30-year life of the yard.

From the fence to the rack: where this layer sits

The on-site electrical estate of a large AI campus is a two-stage step-down with a distribution fabric in the middle. HV transmission (138/230/345/500 kV in ANSI markets; 132/220/400 kV in much of the IEC world) lands in the customer substation. There, HV-to-MV power transformers step down to a medium-voltage bus; these are typically ~80–100 MVA units, sized with a ~10% MVA cushion over MW at 0.95–0.99 power factor, and deployed as three units in an N+1 arrangement to serve ~150 MW of peak. From that MV bus (commonly 34.5 kV, 33 kV, 22 kV, or 13.8/11 kV depending on region and campus size), MV distribution carries power across the site to pod/block transformers that step to the LV utilization voltage the racks actually consume (415/240 V IEC or 480/277 V ANSI, or increasingly straight to an 800 VDC rail via solid-state transformers; see Chapter 4.4 and Chapter 4.7).

The voltage-class choice is set by campus size and is itself a fork covered in Chapter 4.1: distribution-class service tops out near 20–40 MW; 69–138 kV sub-transmission carries a 100–300 MW campus; 230/345/500 kV transmission is the entry ticket for the gigawatt factories. What this chapter owns is everything between that incoming voltage and the pod transformer: the switchgear that makes and breaks the MV bus, the topology that connects the sources to the loads, the relays that protect it, and the coordination study that proves the whole thing is safe and selective. The HV-yard ownership, NERC registration, and utility-boundary operations live next door in Chapter 4.3; the grid-interactive ride-through and reactive support live in Chapter 4.10.

HV-to-MV step-down and the GIS vs AIS switchgear fork

The first real equipment fork is the switchgear technology that builds the MV bus: air-insulated (AIS) or gas-insulated (GIS). AIS uses air as the dielectric between live parts, so it is cheaper per bay, easier to inspect visually, and serviceable by any competent MV crew — at the cost of footprint, because air needs clearance. GIS encapsulates the busbars and switches in grounded metal enclosures filled with an insulating gas, collapsing the same electrical function into roughly a third of the footprint, sealing it against dust, salt, and humidity, and effectively eliminating weather exposure — at a higher capital cost, longer lead time, and a maintenance model that demands gas-handling discipline and specialist crews.

For a land-rich greenfield in a temperate climate, AIS is often the rational default: cheaper, faster to source, easier to operate. GIS earns its premium where footprint is the binding constraint (urban or constrained sites, where every square meter of yard is square meters not spent on white space), where the environment is hostile (coastal salt fog, desert dust, the Gulf and Singapore buildouts), or where a compact, weather-sealed yard shortens the civil works on the critical path. The decision is not purely electrical — it is a footprint-vs-capital-vs-schedule trade that interacts with your site constraints from Chapter 3.1.

What changed the calculus permanently is the insulating gas. Conventional GIS used sulphur hexafluoride (SF6) — a superb dielectric and arc-quenching medium, and the most potent greenhouse gas regulated under the Kyoto basket, with a global-warming potential roughly 24,000–25,000 times that of CO2. The EU F-Gas Regulation (Regulation (EU) 2024/573) now bans placing on the market new MV switchgear using SF6 (and other F-gases) for equipment rated up to 24 kV from 1 January 2026, extending to 24–52 kV by 2030. That moves SF6-free from a sustainability preference to a hard compliance gate for new European MV substations — and, because the major OEMs build to one global platform, increasingly the default offering worldwide.

Switchgear technology & gas decision
Option | Footprint | Capex / lead time | Environmental fit | Regulatory status (EU, 2026) | Best fit
AIS, air-insulated | Largest; needs clearance | Lowest capex; shortest lead | Temperate, clean-air sites | Compliant (air dielectric) | Land-rich greenfield, benign climate
GIS, SF6 | Smallest (~1/3 of AIS) | High capex; long lead | Excellent (sealed) | Banned new <=24 kV from 1 Jan 2026 | Legacy only; no longer specifiable new in EU MV
GIS, SF6-free (clean air / dry-air-N2) | Compact (slightly larger than SF6) | High capex; lead times normalizing | Excellent; zero/low-GWP | Compliant; preferred path | Constrained/coastal sites needing GIS, EU compliance
GIS, fluoronitrile mix (C4-FN, e.g. Novec 4710) | Near-SF6 compactness | High capex; supplier-specific | Low GWP (pure C4-FN roughly 2,000 vs ~24,000 for SF6; lower in mixtures) | Compliant under current rules; PFAS scrutiny rising | Higher-voltage GIS where dry-air dielectric falls short
GIS/AIS and gas-medium tradeoffs for the on-site MV substation. SF6-free regulatory dates are EU F-Gas Regulation (EU) 2024/573; alternatives are vendor-named technologies. GWP figures are order-of-magnitude.

MV distribution architecture: radial, selective, or ring

This is the topology fork — the one that sets the blast radius. Three families dominate, and they trade capital and complexity directly against fault tolerance and concurrent maintainability.

Radial is the simplest and cheapest: one source feeds a bus feeds a load, with no alternate path. A fault or a maintenance outage anywhere upstream drops everything downstream. It is the right answer only where the load tolerates the outage — batch-inference halls, non-critical mechanical, or any block where a trip is an inconvenience rather than a revenue event. Build a revenue inference hall on a pure radial and every MV maintenance window is a planned outage you cannot avoid.

Primary-selective and secondary-selective give each load two sources — a normal feed and an alternate — with an automatic or manual transfer between them. Secondary-selective is the workhorse of the critical data center: dual MV feeds into a double-ended substation with a normally-open tie breaker, so the loss of one transformer or one feeder transfers the load to the healthy side rather than dropping it. This is the topology that delivers concurrent maintainability — you can de-energize one side for service while the other carries the hall — and it is the natural MV partner to a 2N or block-redundant power chain (see Chapter 12.1). Its cost is the second feeder, the second transformer, and the tie switchgear and transfer scheme.

Ring (and ring-main / breaker-and-a-half at the higher end) closes the distribution into a loop so that any single segment can be isolated with power still reaching every load from the other direction. A ring buys you the ability to lose any one cable or switch without dropping a load, with less total feeder copper than a fully duplicated radial-pair — at the cost of a more complex protection scheme, because a ring with sources at multiple points needs directional and differential protection to figure out which way the fault current is flowing. For gigawatt campuses, the HV yard itself often goes to a breaker-and-a-half or double-bus arrangement so that any single breaker can be maintained or can fail without islanding the campus.
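The blast-radius difference between a radial and a ring can be sketched as a reachability check on a graph of cable segments. This is a minimal illustration under stated assumptions: the node names (SRC, P1 to P3) and the three-pod layout are hypothetical, only cable-segment loss is modeled, and protection operation, bus faults, and ampacity limits are ignored.

```python
from collections import deque

def live_loads(edges, source, loads, failed_edge):
    """Return the loads still reachable from the source with one segment removed."""
    adjacency = {}
    for a, b in edges:
        if (a, b) == failed_edge:
            continue  # the faulted segment is isolated
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return {load for load in loads if load in seen}

def worst_case_loss(edges, source, loads):
    """Largest number of loads dropped by any single segment failure."""
    return max(len(loads) - len(live_loads(edges, source, loads, e)) for e in edges)

# Hypothetical three-pod layouts: a radial chain, and the same chain closed into a loop.
loads = {"P1", "P2", "P3"}
radial = [("SRC", "P1"), ("P1", "P2"), ("P2", "P3")]
ring = radial + [("P3", "SRC")]

radial_loss = worst_case_loss(radial, "SRC", loads)
ring_loss = worst_case_loss(ring, "SRC", loads)
print(f"radial worst-case pods lost: {radial_loss}, ring worst-case pods lost: {ring_loss}")
```

In this toy model the ring never loses a pod to a single segment failure, while the radial loses everything downstream of the failed segment. What the sketch cannot show is the protection cost: the ring only behaves this way if the relays can identify and isolate the faulted segment.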

MV distribution architecture decision
Architecture | Redundancy | Single-fault load loss | Concurrent maintainable? | Relative MV capex | Best-fit workload
Radial | None (N) | Full downstream block | No | Lowest | Batch inference; non-critical; checkpointable training (cost-led)
Primary-selective | Dual source upstream | Partial; transfer on feeder loss | Partially | Moderate | Mixed halls; bridge capacity
Secondary-selective (double-ended) | 2N-class at the bus | None after transfer (one side) | Yes | High | Online inference; revenue halls; 2N power chains
Ring / ring-main | Loop; any one segment isolable | None (re-fed from other direction) | Yes | High; less feeder copper than dual-radial | Campus MV backbone; multi-source/BTM sites
Breaker-and-a-half (HV/MV bus) | Bus + breaker redundancy | None for single breaker fault | Yes | Highest | Gigawatt factory main bus; islanding-capable campuses
How the topology fork trades capital against blast radius and maintainability. 'Single-fault load loss' is the load dropped by one worst-case component fault before transfer. Maps to the workload archetypes of Chapter 1.1.

Read this table against the workload archetypes from Chapter 1.1, not against a generic Tier number. A synchronous training cluster that restarts from a checkpoint on any node failure does not value the marginal nine that a ring buys over a radial at the MV layer; its reliability budget is better spent on checkpoint cadence and hot spares (the goodput-vs-availability argument in Chapter 12.2). An always-on inference hall, where every MV trip is a breached SLO and lost revenue, justifies the secondary-selective or ring premium outright. The two most common MV mis-scopes are the mirror images of these cases: a pure radial under a revenue hall (planned outages you can't avoid), or a fully duplicated, ring-fed MV fabric under a checkpointable training pod (redundancy the job ignores).

IEC vs ANSI, IEC 61850, and the regional grid code

Switchgear and protection come in two standards lineages, and which one you build to is set by region — but it is a genuine design fork because it reshapes the relay panel, the ratings, the test regime, and the spares. The IEC lineage (IEC 62271 for switchgear, IEC 60255 for relays, IEC 61850 for substation communications) governs most of the world outside North America; the ANSI/IEEE lineage (IEEE C37 for switchgear and breakers, ANSI device numbers, NEC/NFPA 70 for installation) governs the US and ANSI-aligned markets. They differ in frequency (50 vs 60 Hz), in standard voltage steps, in fault-duty definitions and rating philosophy, and in the device-numbering and documentation conventions your relay engineers and operators will live in. A global operator building one campus to IEC and the next to ANSI is, in protection terms, running two parallel engineering practices — different relays, different test sets, different spares, different qualified crews.

Cutting across both is IEC 61850, the substation-automation standard that has reshaped how protection is wired. Instead of a relay panel hardwired with copper from every CT, VT, breaker, and trip coil, 61850 puts a station bus (MMS over Ethernet for SCADA and engineering) and a process bus (Sampled Values for analog measurements, GOOSE for fast inter-relay tripping and interlocking) on a redundant Ethernet network. The payoff is dramatic: less copper, faster and testable interlocking, vendor-interoperable relays described by a single SCL configuration file, and a substation you can model and FAT digitally. The cost is that protection now depends on a deterministic network — GOOSE messages must arrive in a few milliseconds, the network needs PRP/HSR redundancy, and a 61850 substation is a cyber-physical asset that falls squarely inside the NERC CIP and OT-security scope of Chapter 4.3 and Chapter 11.10. The 2026 default for a new large substation is a 61850 station-bus design; full process-bus adoption is the frontier, traded against the maturity of your protection vendor and crew.

Finally, the campus does not just consume power: it is a grid actor bound by the regional grid code, and that obligation reaches back into this layer through the protection settings. In Europe the connection must comply with the ENTSO-E Requirements for Generators (RfG) and, for demand, the Demand Connection Code; the GB market adds the GB Grid Code, administered by NESO (formerly National Grid ESO); North America adds NERC reliability standards and, after the May 2026 Level 3 alert on data-center load losses, a hard expectation of fault ride-through. The grid code dictates the voltage and frequency envelope within which your protection must not trip, which is the inverse of protection's normal job and the source of real tension covered in Chapter 4.10.

Deep dive: IEC 61850 station bus vs process bus, and why it's a protection-availability decision

The instinct is to treat 61850 as a wiring-reduction convenience. It is more than that — it changes the failure modes of your protection. In a hardwired panel, a CT signal reaches a relay over a dedicated copper pair; the failure is open or shorted wiring, diagnosable and local. With a process bus, that same CT signal is digitized at a merging unit and published as Sampled Values onto an Ethernet segment that many relays subscribe to. The wiring shrinks to a fiber, the engineering simplifies, and a single merging unit or network fault can now affect every relay that subscribed to it. That is why process-bus designs mandate PRP (Parallel Redundancy Protocol) or HSR (High-availability Seamless Redundancy) — two independent networks carrying duplicate frames so that a single switch or link failure is invisible to protection — and why the GOOSE and SV performance classes (trip messages in ~3 ms) are not negotiable.
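The duplicate-frame idea behind PRP can be sketched in a few lines. This is a deliberately simplified illustration, not the protocol as specified: the class name, the merging-unit ID "MU1", and the payload strings are hypothetical, and a real implementation uses a bounded sequence window with ageing rather than an unbounded set.

```python
class PrpReceiver:
    """Accept the first copy of each frame and silently drop the duplicate."""

    def __init__(self):
        self._seen = set()  # simplification: real devices use a bounded, ageing window

    def receive(self, source, sequence, payload):
        key = (source, sequence)
        if key in self._seen:
            return None  # duplicate from the other LAN, discard
        self._seen.add(key)
        return payload

# The publisher sends every frame on both LAN A and LAN B.
frames = [("MU1", 1, "sv-1"), ("MU1", 2, "sv-2"), ("MU1", 3, "sv-3")]
lan_a_drops = {2}  # assume a switch fault on LAN A loses frame 2

receiver = PrpReceiver()
delivered = []
for source, sequence, payload in frames:
    copies = []
    if sequence not in lan_a_drops:
        copies.append((source, sequence, payload))  # copy arriving via LAN A
    copies.append((source, sequence, payload))      # copy arriving via LAN B
    for copy in copies:
        result = receiver.receive(*copy)
        if result is not None:
            delivered.append(result)

print(delivered)
```

Every frame reaches the subscriber exactly once even though LAN A lost one, with no failover delay, because there is no switchover to perform. That zero-recovery-time property is what makes the scheme suitable for protection traffic.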

The decision, then: a station-bus-only design keeps the analog measurement hardwired to the relay (familiar, robust, proven) while moving SCADA, interlocking, and engineering onto Ethernet — the conservative 2026 default. A full process-bus design digitizes the measurements too, maximizing the copper savings and enabling a fully digital, software-defined substation — but it demands a mature merging-unit supply, redundant deterministic networking, and a protection crew fluent in SCL, GOOSE, and SV testing. For a multi-gigawatt operator standardizing dozens of substations, process bus is the strategic endgame; for a first AI campus on an aggressive schedule, station bus is the lower-execution-risk choice. Either way, the substation is now an OT network asset — segment it, monitor it, and bring it into the security model of Chapter 11.10.

Pod/block transformer sizing and the granularity decision

The pod (or block) transformer is the unit of capacity you energize, fault, protect, and maintain. Its sizing is a granularity decision with direct consequences for blast radius and for how the density ramp lands. A few large transformers minimize part count, footprint, and unit cost, and simplify the MV bus — but each one is a larger single point of failure, and each takes a bigger bite of stranded capacity when a hall is half-populated during ramp. Many smaller transformers shrink the blast radius (a transformer fault drops a pod, not a hall), let you energize capacity in finer increments that track the GPU delivery curve, and align naturally to a pod-as-fault-domain design (see Chapter 12.1) — at the cost of more bays, more relays, more footprint, and a higher aggregate capex.
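The granularity trade can be put in rough numbers. The sketch below is illustrative only: the 48 MW hall, the 14 MW ramp-stage load, and the two transformer unit sizes are hypothetical values, and capacity is expressed in MW of load served for simplicity.

```python
import math

def energized_mw(load_mw, unit_mw):
    """Capacity that must be energized, in whole transformer increments."""
    return math.ceil(load_mw / unit_mw) * unit_mw

def stranded_mw(load_mw, unit_mw):
    """Energized capacity not yet used by installed load."""
    return energized_mw(load_mw, unit_mw) - load_mw

def blast_radius_fraction(unit_mw, hall_mw):
    """Share of the hall dropped by a single transformer fault."""
    return unit_mw / hall_mw

HALL_MW = 48.0        # hypothetical hall capacity
RAMP_LOAD_MW = 14.0   # hypothetical load part-way through the density ramp

for label, unit_mw in (("2 x 24 MW", 24.0), ("8 x 6 MW", 6.0)):
    print(
        f"{label}: stranded {stranded_mw(RAMP_LOAD_MW, unit_mw):.0f} MW, "
        f"blast radius {blast_radius_fraction(unit_mw, HALL_MW):.1%} of hall"
    )
```

Under these assumed numbers the finer granularity strands less capacity during ramp and cuts the single-fault blast radius by a factor of four. It does so at the price of four times as many transformer bays and protection zones, which the sketch does not cost.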

The MVA math is unglamorous and decisive. Provision roughly a 10% MVA cushion over the MW load to cover power factor (0.95–0.99) and headroom, derate for ambient and altitude, and size the redundancy to the topology: an N+1 transformer scheme to serve ~150 MW peak commonly lands on three ~80 MVA units, any two of which carry the load while the third is out of service. But the figure that should haunt the sizing exercise is the density ramp. A pod transformer sized for today's 40–60 kW inference racks does not have the headroom for a 132 kW NVL72 generation, let alone a ~600 kW Kyber-class rack, and a transformer is a 100+ week lead-time item you cannot swap on a quarter's notice (see Chapter 2.3 and the lead-time figures in the key numbers below). The doctrine is the same one that governs floor loading and water in Chapter 1.1: provision the irreversible MV substrate (feeder ampacity, transformer pad and MVA headroom, switchgear bus rating) for the ramp you can foresee, while keeping the reversible LV fit-out matched to the generation actually shipping.
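The sizing arithmetic above can be checked with a short sketch. It uses only the chapter's round figures (150 MW peak, 10% cushion, 0.95 power factor, three units in N+1); the function names are illustrative, and ambient and altitude derating are deliberately left out.

```python
def target_mva(peak_mw, cushion=0.10):
    """Provisioning target: MW load plus a fractional MVA cushion."""
    return peak_mw * (1.0 + cushion)

def demand_mva(peak_mw, power_factor):
    """Apparent power drawn at a given power factor, with no headroom."""
    return peak_mw / power_factor

def firm_mva(unit_mva, units):
    """N+1 firm capacity: what the bank carries with one unit out of service."""
    return unit_mva * (units - 1)

PEAK_MW = 150.0
target = target_mva(PEAK_MW)        # about 165 MVA
worst_pf = demand_mva(PEAK_MW, 0.95)  # about 158 MVA

for unit in (80.0, 90.0, 100.0):
    firm = firm_mva(unit, units=3)
    print(
        f"3 x {unit:.0f} MVA: firm {firm:.0f} MVA, "
        f"covers 0.95 pf demand: {firm >= worst_pf}, "
        f"covers 10% cushion target: {firm >= target}"
    )
```

With these round numbers, three 80 MVA units give 160 MVA firm: enough for the load at 0.95 power factor, but slightly short of the full 10% cushion. That is one reason the unit size is quoted as a range of roughly 80 to 100 MVA rather than a single figure.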

Key numbers

  • 1 Jan 2026: EU F-Gas ban on new SF6 MV switchgear <=24 kV (24-52 kV by 2030). Source (2026): EU F-Gas Regulation (EU) 2024/573; European Commission.
  • ~24,000x: SF6 global-warming potential vs CO2, which is why the F-gas phase-out is forced. Source (2026): IPCC / EU F-Gas Regulation.
  • ~1.1-1.2x: fault current from inverter-based sources (vs 5-10x for synchronous machines); breaks conventional overcurrent grading. Source (2025): NERC / IEEE PES-PSRC; NREL IBR protection studies.
  • ~80 MVA x3: typical HV/MV transformer unit, N+1, to serve ~150 MW peak; ~10% MVA cushion over MW. Source (2026): domain-research synthesis; SemiAnalysis Datacenter Anatomy Pt 1.
  • 75 MW: ERCOT 'large load' threshold forcing full interconnection + protection study (25 MW = modeling). Source (2026): ERCOT Large Load Integration; SB6.
  • 100+ wk: MV switchgear lead time; ~128 wk standard HV transformer, ~144 wk GSU-class. Source (2025): Wood Mackenzie; pv magazine; supply-chain synthesis.
  • ~1,500 MW: instantaneous data-center load lost on a single 230 kV fault; trigger for NERC's rare Level 3 alert (ride-through now mandatory). Source (2026): NERC Level 3 Alert; Utility Dive.
  • ~1/3: GIS footprint vs equivalent AIS lineup; the compactness premium for constrained/hostile sites. Source (2026): Siemens Energy / ABB MV switchgear synthesis.

Protective relaying and the IEC 61850 protection scheme

Protection is the system that decides, in milliseconds, that something has faulted and isolates the smallest possible piece of the network to clear it. On an MV campus it is built from numerical relays implementing a layered set of functions — referenced by ANSI device numbers in the US lineage, by IEC 60255 function names in the IEC one — coordinated so that the relay closest to the fault trips first and the ones upstream hold, preserving as much of the network as possible. The core elements on a data-center MV system are overcurrent and earth-fault (ANSI 50/51, 50N/51N) on every feeder, differential protection (87T on transformers, 87B on busbars) to clear in-zone faults instantaneously without waiting for time grading, directional elements (67) wherever current can flow two ways (rings, paralleled sources, on-site generation), and transfer and bus protection tied into the automatic source-transfer scheme of a secondary-selective substation.

In a 61850 station, much of the interlocking and fast tripping that used to be hardwired now rides on GOOSE messages — a relay seeing a busbar fault publishes a trip that other relays act on in a few milliseconds, enabling fast bus-blocking and breaker-failure schemes without copper between panels. This is faster and testable, but it makes the protection a function of the substation network, which is exactly why the redundancy (PRP/HSR) and the security posture matter. The protection design is not a downstream detail of the topology — it is the thing that makes the topology's redundancy real. A ring that isn't backed by directional and differential protection isn't a ring; it's an expensive radial that can't tell which way the fault is.

The protection & coordination study as a deliverable

The single most important paper artifact this layer produces is the protection and coordination study: a stamped engineering deliverable without which no lender, AHJ, or insurer will accept the energization of a large campus. It has four canonical parts, each a study in its own right.

  • Short-circuit / fault-duty analysis. Computes the available fault current at every bus, from the utility's source impedance through the on-site transformers and generation, and proves every breaker and bus is rated to interrupt and withstand it. Under-rate a breaker against the available fault current and it can fail catastrophically trying to clear a fault — the most dangerous error in the building.
  • Time-current-curve (TCC) coordination. Per IEEE 242 (the 'Buff Book'), plots every protective device's time-current characteristic so that for any fault, the nearest upstream device trips first and the ones above it hold — selectivity. Mis-coordinate and a feeder fault trips the main, turning a pod outage into a hall outage.
  • Arc-flash incident-energy analysis. Per IEEE 1584 and NFPA 70E (and CSA Z462 / regional equivalents), computes the incident energy at each working point so that PPE categories, approach boundaries, and labels are correct — and so that the protection is fast enough to keep arc-flash energy survivable. This is a life-safety deliverable, not a formality; it ties directly into the qualified-worker EHS interface of Chapter 6.9.
  • Relay setting sheet. The pickup, time-dial, and curve-type settings derived from the above, that the commissioning team loads into every relay and the operations team maintains for life.
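The time-current coordination check in the list above can be sketched numerically. The example below assumes the IEC 60255 standard-inverse curve, t = TMS x 0.14 / ((I/Is)^0.02 - 1), rather than an IEEE curve. Every setting in it (pickups, time multipliers, fault currents, and the 0.3 s grading margin) is a hypothetical value chosen for illustration, not a recommended setting.

```python
def iec_standard_inverse(current_a, pickup_a, tms):
    """Operating time in seconds for the IEC standard-inverse characteristic."""
    multiple = current_a / pickup_a
    if multiple <= 1.0:
        return float("inf")  # below pickup the relay never operates
    return tms * 0.14 / (multiple ** 0.02 - 1.0)

def is_selective(fault_currents_a, downstream, upstream, margin_s=0.3):
    """True if the upstream relay lags the downstream one by the margin at every fault level."""
    return all(
        iec_standard_inverse(i, *upstream) - iec_standard_inverse(i, *downstream) >= margin_s
        for i in fault_currents_a
    )

# Hypothetical settings as (pickup in amps, time multiplier setting).
feeder_relay = (400.0, 0.10)
main_relay = (1200.0, 0.20)
mis_set_main = (1200.0, 0.05)  # time multiplier set too low

fault_levels = [2000.0, 4000.0, 8000.0]

feeder_time = iec_standard_inverse(8000.0, *feeder_relay)
main_time = iec_standard_inverse(8000.0, *main_relay)
print(f"8 kA fault: feeder {feeder_time:.3f} s, main {main_time:.3f} s")
print("graded pair selective:", is_selective(fault_levels, feeder_relay, main_relay))
print("mis-set main selective:", is_selective(fault_levels, feeder_relay, mis_set_main))
```

The mis-set case is the failure the TCC bullet describes: the main operates before the feeder at high fault current, so a feeder fault takes out the whole bus. A real study checks every device pair across the full fault range and includes instantaneous elements, which this sketch omits.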

The study is not one-and-done. It is re-run whenever the source impedance changes — a new utility line, a different transformer, the addition of on-site generation — because every one of those moves the fault current and can break the coordination that was selective yesterday. And in the BTM era, the change that breaks it most often is the addition of inverter-based on-site power.
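Why a source change forces a re-run can be seen with the classic fault-MVA approximation. This is a sketch under stated assumptions: the utility fault level, transformer rating and impedance, bus voltage, and breaker rating are all hypothetical, and the method ignores X/R ratio, motor contribution, and cable impedance.

```python
import math

def bus_fault_mva(utility_fault_mva, tx_mva, tx_z_percent, tx_in_parallel=1):
    """Fault level at the MV bus: utility source in series with paralleled transformers."""
    tx_fault_mva = tx_in_parallel * tx_mva / (tx_z_percent / 100.0)
    return 1.0 / (1.0 / utility_fault_mva + 1.0 / tx_fault_mva)

def fault_ka(fault_mva, bus_kv):
    """Three-phase symmetrical fault current in kA."""
    return fault_mva / (math.sqrt(3) * bus_kv)

UTILITY_FAULT_MVA = 5000.0  # hypothetical utility fault level at the HV bus
TX_MVA, TX_Z_PERCENT = 80.0, 10.0
BUS_KV = 34.5
BREAKER_RATING_KA = 16.0    # hypothetical interrupting rating

single = fault_ka(bus_fault_mva(UTILITY_FAULT_MVA, TX_MVA, TX_Z_PERCENT, 1), BUS_KV)
paralleled = fault_ka(bus_fault_mva(UTILITY_FAULT_MVA, TX_MVA, TX_Z_PERCENT, 2), BUS_KV)

for label, ka in (("one transformer", single), ("two in parallel", paralleled)):
    status = "within rating" if ka <= BREAKER_RATING_KA else "EXCEEDS rating"
    print(f"{label}: {ka:.1f} kA ({status})")
```

In this example, closing a tie so that two transformers feed one bus nearly doubles the available fault current and pushes it past the assumed breaker rating. The same logic applies to a new utility line or added generation: each lowers the source impedance and can invalidate ratings and settings that were correct before.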

Deep dive: the inverter-dominated fault-current problem — why islanding breaks your overcurrent scheme

Conventional overcurrent protection rests on a physical assumption that has held for a century: a fault draws a large current — 5 to 10 times rated, fed by the inertia and magnetic flux of synchronous machines and the stiff utility grid — and that large current is what relays detect and grade against. The whole TCC-coordination edifice of IEEE 242 is built on a fault current that is unmistakably bigger than load current.

Behind-the-meter generation breaks that assumption. When the campus islands on inverter-based sources (BESS, solar, and increasingly the grid-following and grid-forming inverters that interface gas and fuel-cell plants; see Chapter 4.8 and Chapter 3.5), those inverters are current-limited by their power electronics to roughly 1.1–1.5x rated current, with at least one phase typically capped near 120%. A fault now looks, to a conventional overcurrent relay, almost indistinguishable from an overload. The relay may fail to pick up at all, may pick up too slowly to be selective, or may misjudge direction. These are the classic failure modes that NERC and IEEE flagged as IBR penetration rose, and that drove IBR-specific NERC standards such as PRC-028 (disturbance monitoring and reporting for inverter-based resources).

The consequence for this layer is concrete: a protection scheme designed only for the stiff grid is not valid in island mode. The study must be run for both source configurations, and the islanded case typically forces a shift away from pure overcurrent toward schemes that don't depend on fault-current magnitude — differential protection (87, which compares current in vs out of a zone and doesn't care how big the fault is), directional and voltage-restrained elements, and increasingly communications-assisted schemes over 61850 GOOSE that coordinate relays by message rather than by current grading. Specify the campus to island on BTM generation without re-running protection for the inverter-dominated case, and you have a yard that is protected on the grid and unprotected the moment it needs to stand alone — exactly when reliability matters most.
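The contrast between magnitude-based and zone-based detection can be shown with a toy comparison. The sketch works on current magnitudes only (a real relay compares phasors), and the rated current, pickup, slope, and fault multiples are hypothetical values chosen to match the ranges quoted in this chapter.

```python
RATED_A = 1000.0

def overcurrent_picks_up(current_a, pickup_a=1.5 * RATED_A):
    """Simple overcurrent element: operates only above a fixed pickup."""
    return current_a > pickup_a

def differential_trips(current_in_a, current_out_a, min_pickup_a=0.2 * RATED_A, slope=0.3):
    """Percentage differential: operates on the in/out mismatch, not on magnitude."""
    operate = abs(current_in_a - current_out_a)
    restraint = (abs(current_in_a) + abs(current_out_a)) / 2.0
    return operate > max(min_pickup_a, slope * restraint)

grid_fault_a = 8.0 * RATED_A      # stiff grid: 5 to 10x rated
inverter_fault_a = 1.2 * RATED_A  # islanded on inverters: about 1.2x rated

print("grid fault, overcurrent picks up:", overcurrent_picks_up(grid_fault_a))
print("islanded fault, overcurrent picks up:", overcurrent_picks_up(inverter_fault_a))
# In-zone fault while islanded: 1200 A enters the zone, only 300 A leaves it.
print("islanded fault, differential trips:", differential_trips(inverter_fault_a, 300.0))
# Healthy through-load: what goes in comes out, so the differential stays quiet.
print("healthy load, differential trips:", differential_trips(900.0, 900.0))
```

The overcurrent element cannot see the islanded fault because 1.2x rated sits below any pickup that also rides through normal load. The differential element trips on the 900 A mismatch regardless of how small the fault current is. Lowering the overcurrent pickup instead would trade missed faults for nuisance trips on overload.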

Campus-scale single-line patterns: 100 MW to 1 GW

At campus scale the topology and granularity decisions compose into a handful of recurring single-line patterns, and the pattern scales with the megawatt count. A 100–300 MW campus typically takes a 69–138 kV interconnection into a customer substation with a double-ended (secondary-selective) MV bus, feeding a small number of MV distribution rings or selective feeders out to block transformers — enough redundancy for concurrent maintainability without the cost of a full transmission yard. A 500 MW–1 GW factory takes a 230/345/500 kV interconnection into an HV yard built on a breaker-and-a-half or double-bus arrangement (so any single breaker is maintainable or can fail without islanding the campus), multiple HV/MV transformer banks, and an MV backbone — often a ring or a set of selective feeders — that subdivides the campus into independent pods or blocks, each its own fault domain with its own transformer, switchgear, and protection zone.

The unifying design principle across the scale range is fault-domain alignment: the MV architecture should partition the campus so that a single electrical fault is contained to one workload-meaningful unit — a pod, a hall, a block — and never propagates further. That partition should match the fault domains the resilience design wants (see Chapter 12.1), match the granularity at which you energize capacity during the density ramp, and match the workload's tolerance for losing that unit. When the MV blast radius, the energization increment, and the workload fault domain are the same boundary, the substation is doing its job. When they diverge — a fault drops more than the workload can absorb, or you can only energize in increments bigger than your GPU delivery — the single-line is fighting the building.

This chapter sits in the middle of the power chain. Upstream, the voltage taxonomy and the full HV-to-chip topology are set in Chapter 4.1; the queue, interconnection, and speed-to-power that deliver the megawatts to the fence are in Chapter 3.2. The HV-yard ownership, NERC registration, and utility-boundary operations are next door in Chapter 4.3, and the grid-interactive ride-through and reactive support the grid code demands are in Chapter 4.10. Downstream, the transformers and the AI non-linear-load / harmonics problem are in Chapter 4.4; the SST path straight to 800 VDC that can collapse this whole MV-to-LV chain is in Chapter 4.4 and Chapter 4.7; LV busway and rack power in Chapter 4.6; grounding, bonding, and the system-earthing regime in Chapter 4.11. The on-site generation that forces the inverter-fault-current rethink is in Chapter 3.5 and Chapter 4.8; the procurement lead times that gate the whole yard are in Chapter 2.3; the fault-domain and redundancy framing in Chapter 12.1 and the goodput-vs-availability rethink in Chapter 12.2; and the OT-security model for the 61850 substation in Chapter 11.10.