Chapter 3.1
Site Selection Strategy & the Reordered Criteria Hierarchy
Site selection used to ask 'where is the cheapest land near fiber and users?' — in the AI era it asks 'where can I energize 100 MW to multiple gigawatts in 24–48 months?', and getting that ordering wrong strands a billion-dollar campus on a slab that cannot be moved.
What you'll decide here
- Whether your workload is training-shaped (power-first, latency-indifferent, can chase stranded megawatts anywhere) or inference-shaped (latency-bound, must sit near users) — this decides whether speed-to-power or proximity is your top screen.
- Whether to concentrate in one gigawatt campus (cheapest fabric, single interconnection fight) or spread a distributed multi-region fabric (more sites, faster aggregate energization, async-tolerant training only).
- The build modality — greenfield self-develop, build-to-suit, powered shell, or wholesale/retail colocation — which trades 24–36 months of control against days-to-months of speed.
- Which screen is your binding constraint, so the funnel pass/fails on it FIRST: in 2026 that is almost always firm-power-by-a-date, not land, fiber, or incentives.
- The dollar value of speed for your specific workload — because at ~$10–12B of revenue per GW-year, six months of earlier energization is the largest single line item the site decision controls.
For roughly two decades, data center site selection was a settled discipline with a stable ranking of criteria. You screened for latency to population centers and peering hubs, then for land (cheap, flat, contiguous, zoned), then for fiber routes and carrier diversity, then for fiscal incentives (sales-tax exemptions, property-tax abatements), with power treated as a utility commodity you simply ordered. That hierarchy produced Northern Virginia — the largest data center cluster on earth, carrying an estimated 35% of global internet traffic — precisely because it optimized the old criteria: dense fiber, proximity to the federal market, generous Virginia incentives, and, once upon a time, abundant cheap power.
The AI era inverted the ranking. The binding input is no longer chips, land, or fiber — it is megawatts energized by a date. US generator interconnection queues now hold on the order of 2,000+ GW of requested capacity (LBNL Queued Up, end-2024 data, ~2.06–2.29 TW), and on the load side ERCOT's large-load queue alone quadrupled in a single year to roughly 410 GW by early 2026, ~87% of it data centers. PJM's application-to-commercial-operation timeline stretched from under two years in 2008 to more than eight years by 2025. When firm power is the scarcest, slowest input in the entire project, every other criterion becomes a tiebreaker among sites that have already cleared the power gate — and most candidate sites never clear it.
This chapter is the framework for the reordered hierarchy. We name the new screen order, show how the workload archetype (carried in from Chapter 1.1) drives the siting search, weigh the single-campus vs distributed-fabric fork, lay out the four build modalities and what each one buys, and walk the site-selection funnel as a sequence of pass/fail gates ordered by the time-value of speed. Site selection is the most irreversible decision in the project, because you cannot move a slab once it is poured. Get the ordering wrong and the cost compounds for the life of the asset.
The 2024–2026 reordering: power-and-speed first
The reordering is not a re-weighting of the same list — it is a change in what gates the deal. In the old model, every criterion was a continuous score and you summed a weighted matrix; a site weak on power could compensate with strong fiber and incentives. In the new model, speed-to-power is a pass/fail gate that runs first, and a site that cannot show a credible path to firm capacity by your in-service date is eliminated before its fiber, land, or tax score is ever computed. The matrix still exists — but only for the survivors of the power gate.
Why the discontinuity rather than a slope? Because the AI campus load is enormous, dense, and growing faster than transmission can be built. A single hyperscale AI building now draws 100+ MW; a gigawatt campus draws as much as a mid-sized city. The grid cannot absorb that on the old timeline, so power moved from a commodity you order to a multi-year, attrition-ridden process you must win — and the process, not the construction, is the long pole. AI data center construction itself takes only 12–18 months; the interconnection behind it takes 3–8+ years. That gap is the entire story of 2024–2026 siting, and it is why 'bring-your-own-power' (behind-the-meter gas, batteries, fuel cells, eventually SMRs) went from afterthought to default first-power strategy — roughly 101 GW of BTM gas had been announced cumulatively by 2026.
| Criterion | Pre-2023 rank | 2026 rank | Gate type in 2026 | Binds hardest for |
|---|---|---|---|---|
| Speed-to-power (firm MW by a date) | Assumed / commodity | 1 | Hard pass/fail, runs first | All AI workloads; training most |
| Power cost & structure (LMP, capacity, congestion) | Mid | 2 | Scored after power gate | Training, batch (cost-led) |
| Latency / proximity to users | 1 | 3 | Hard gate for inference only | Online & edge inference |
| Water & climate (cooling strategy) | Low–mid | 4 | Gate where water-stressed; else scored | Dense liquid-cooled halls |
| Land (size, contiguity, zoning) | 2 | 5 | Scored; rarely the binding constraint | Gigawatt campuses |
| Fiber & network connectivity | 3 | 6 | Scored; secondary screen | Inference; multi-site training |
| Fiscal incentives & tax | 4 | 7 | Scored tiebreaker, durability-discounted | Marginal-economics deals |
Speed-to-power as the gating screen; workload-driven siting
Speed-to-power is the gate because speed is worth more than almost anything else the site decision controls. The arithmetic is direct: a gigawatt of AI capacity generates on the order of $10–12B of revenue per year (SemiAnalysis — a contested, single-source figure), so accelerating 200 MW into service six months early is worth roughly $1–1.2B of incremental revenue you would otherwise never earn — against a GPU fleet that is depreciating on a 2–3 year economic clock the moment it is racked. A site that is cheaper on power but two years slower to energize is not cheaper; it is a different, worse business. This is the time-value of speed, and it dominates the funnel.
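The arithmetic above can be sketched in a few lines. This is a minimal illustration, not a valuation model: the function name and parameters are hypothetical, and the default $10–12B per GW-year range is the contested single-source figure quoted in the text.

```python
def early_energization_value(
    capacity_mw: float,
    months_early: float,
    revenue_per_gw_year_usd: tuple[float, float] = (10e9, 12e9),  # assumption: figure quoted in the text
) -> tuple[float, float]:
    """Return the (low, high) incremental revenue in USD from energizing early.

    Simple linear model: revenue scales with GW-years pulled forward.
    It ignores ramp-up, utilization, pricing changes, and discounting.
    """
    gw_years = (capacity_mw / 1000.0) * (months_early / 12.0)
    low, high = revenue_per_gw_year_usd
    return gw_years * low, gw_years * high


# The chapter's example: 200 MW brought into service six months early.
low, high = early_energization_value(capacity_mw=200, months_early=6)
print(f"${low / 1e9:.1f}B to ${high / 1e9:.1f}B")
```

The same function can be rerun with a workload-specific revenue range to test how sensitive the siting decision is to the per-GW-year assumption.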
Workload then shapes how you satisfy the gate. Training-shaped load is flexible and curtailable in ways that unlock faster energization: a synchronous training job already checkpoints and resumes, so it tolerates non-firm or curtailable interconnections, bridge power, and phased energization in ways an always-on inference SLA cannot. That flexibility is the accelerant — utilities and ISOs are far more willing to fast-track a load that will curtail during system stress (the Duke Nicholas Institute found ~100 GW of US headroom available at just 0.5% annual curtailment). Inference-shaped load is the opposite: it needs firm, high-availability power because every megawatt-hour not served is a breached SLA and lost revenue, which is exactly why inference pays the proximity-and-firmness premium. The siting consequence is direct — training can use the speed-to-power tricks that inference cannot, so the same megawatt is easier to land for a training campus than for an inference fleet. → mechanics of the queue and flexible interconnection in Chapter 3.2; the energy-supply strategy that backs the gate in Chapter 3.4 and Chapter 3.5.
Single gigawatt campus vs distributed multi-region fabric
Once the workload sets the funnel, the next structural fork is concentration vs distribution: do you fight for one enormous interconnection at a single gigawatt campus, or spread the same megawatts across multiple smaller, faster-to-energize sites stitched together by long-haul fiber? This is not a cosmetic choice — it changes the network architecture, the redundancy posture, and which workloads the facility can even run.
A single campus is the natural home of frontier pre-training. One tightly-coupled supercomputer wants its GPUs in one place because synchronous all-reduce collectives run on every training step, and inter-site latency (even at light-speed, ~5 ms per 1,000 km one-way over fiber) is fatal to a synchronous job that must wait for its slowest straggler. Concentration also wins on cost: one substation, one fiber build, one security perimeter, one shared cooling and power yard. The price is concentration risk: you are betting the whole campus on a single interconnection fight, a single grid node's reliability, and a single permitting jurisdiction's goodwill. If that one node slips two years, the entire investment slips with it.
A distributed fabric spreads the bet: several 100–300 MW sites, each with its own faster interconnection, aggregated to gigawatt scale. The speed argument is real: multiple medium interconnections in less-constrained markets can energize faster in aggregate than one mega-interconnection in a saturated hub, and you diversify jurisdiction, grid, and weather risk. But distribution only works for workloads that tolerate the inter-site latency: inference geo-distributes naturally (independent requests), and asynchronous or hierarchical training schemes (multi-datacenter training) can tolerate bounded staleness across sites. Run a strictly-synchronous pre-training job across a distributed fabric and you pay a goodput tax on every step that can erase the speed advantage you bought. The fork therefore loops back to the workload: distribution is a power-and-speed strategy that only async-tolerant training or inference workloads can spend.
| Dimension | Single gigawatt campus | Distributed multi-region fabric |
|---|---|---|
| Speed-to-power | One mega-interconnection — slowest, highest-stakes fight | Several smaller interconnections — faster in aggregate |
| Fabric / latency | All GPUs co-located; non-blocking back-end; sub-µs reach | Inter-site fiber (~5 ms/1,000 km); only async/inference-tolerant |
| Cost structure | Shared substation, fiber, cooling, security — cheapest/MW | Duplicated overhead per site; higher unit cost |
| Risk concentration | All eggs in one grid node / jurisdiction / weather basin | Diversified across grids, regulators, climates |
| Best-fit workload | Frontier synchronous pre-training; co-located RL | Online/edge inference; async & hierarchical training |
| Permitting / social license | One large, high-visibility opposition target | Many smaller, lower-profile approvals |
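To make the inter-site latency and goodput tax concrete, the sketch below uses the ~5 ms per 1,000 km one-way figure from the text. The goodput model is a deliberately crude assumption: each training step blocks on a fixed number of inter-site round trips, with no overlap of communication and compute and no bandwidth term. The function names, step time, and round-trip counts are hypothetical.

```python
FIBER_MS_PER_1000_KM = 5.0  # one-way figure quoted in the text


def one_way_latency_ms(distance_km: float) -> float:
    """Propagation delay over fiber, one way, in milliseconds."""
    return distance_km / 1000.0 * FIBER_MS_PER_1000_KM


def synchronous_goodput(
    step_compute_ms: float,
    distance_km: float,
    round_trips_per_step: int = 1,  # assumption: blocking round trips per step
) -> float:
    """Fraction of wall-clock time spent computing.

    Assumes every step stalls for the full round-trip time, i.e. no
    overlap of communication with compute. Real collectives can hide
    part of this, so treat the result as a pessimistic bound.
    """
    stall_ms = 2.0 * one_way_latency_ms(distance_km) * round_trips_per_step
    return step_compute_ms / (step_compute_ms + stall_ms)


# Illustrative only: a 100 ms step across sites 1,000 km apart.
for trips in (1, 4, 16):
    share = synchronous_goodput(100.0, 1000.0, round_trips_per_step=trips)
    print(f"{trips:>2} round trips per step: goodput {share:.0%}")
```

Under these assumptions the tax grows with the number of blocking exchanges per step, which is why the fork depends on whether the training scheme tolerates staleness across sites.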
Build-to-suit vs powered shell vs colocation vs greenfield self-develop
The third fork is how you take the site to operating capacity — and it is the lever that most directly trades speed against control and capital. The procurement archetypes from Chapter 1.6 map onto siting as four delivery modalities, each landing at a different point on the speed-control-capital surface.
- Greenfield self-develop is buying raw land and originating the entire stack yourself: interconnection, substation, shell, power, cooling, fit-out. It gives total control over density, voltage architecture, and cooling modality, and the lowest long-run unit cost at scale, at the price of 24–36 months to a live cluster, the deepest capital commitment, and ownership of every permitting and interconnection risk. It is the right call only for a durable, well-forecast workload where you intend to operate at scale for years.
- Build-to-suit (BTS) hands the development and construction risk to a developer who builds to your spec on a long (often 15-year) lease. You get a purpose-built liquid-ready hall without fronting the land-and-shell capital, but you inherit the developer's site, including its interconnection position, and you pay a credit-tenant lease premium.
- Powered shell is the speed-and-optionality middle: a developer delivers an energized, weather-tight building shell with power and water brought to the boundary, and you complete the IT fit-out. It preserves your fit-out and density decisions while letting someone else absorb the slowest, most uncertain part (the interconnection and the shell), and it is the modality that most cleanly converts an irreversible decision into a deferrable one.
- Colocation (wholesale or retail) rents someone else's already-powered hall, buying a live cluster in 6–12 months at capex-light terms, at the cost of a shared shell and the least control over the underlying power and cooling design.
The decision is rarely all-or-nothing. The dominant 2026 pattern is a hybrid: anchor durable base load in a self-develop or BTS campus, take a powered shell to preserve a density ramp you have not committed to, and rent colocation or neocloud for burst and bridge capacity while the owned interconnection matures. → the build-vs-buy-vs-rent NPV is scored in Chapter 1.8; the long-lead equipment that gates every modality is in Chapter 2.3.
| Modality | Time-to-live-cluster | Capital posture | Control over design | Interconnection risk owner | Best-fit |
|---|---|---|---|---|---|
| Greenfield self-develop | 24–36 months | Highest (full capex) | Maximal — land to fit-out | You | Durable, large, well-forecast workload |
| Build-to-suit (BTS) | 18–30 months | Lease + IT (capex-light) | High — built to your spec | Developer (priced into lease) | Purpose-built capacity, balance-sheet-light |
| Powered shell | 12–24 months | Shell lease + IT fit-out | Fit-out & density yours; shell fixed | Developer | Density-ramp optionality; fast firm power |
| Colocation (wholesale/retail) | 6–12 months | Opex-led (lease + IT) | Least — shared shell/power | Operator (already energized) | Speed, uncertain demand, bridge capacity |
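The time-to-live-cluster column can be turned into a quick deadline screen. The sketch below is hypothetical (the class and function names are not from the source) and uses only the month ranges from the table. It says nothing about capital, control, or interconnection risk, which still have to be weighed separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    name: str
    min_months: int  # fast end of the time-to-live-cluster range
    max_months: int  # slow end of the range


# Ranges copied from the table above.
MODALITIES = [
    Modality("Greenfield self-develop", 24, 36),
    Modality("Build-to-suit (BTS)", 18, 30),
    Modality("Powered shell", 12, 24),
    Modality("Colocation (wholesale/retail)", 6, 12),
]


def feasible_modalities(months_to_deadline: int, conservative: bool = True) -> list[str]:
    """Names of modalities whose delivery time fits the deadline.

    conservative=True plans against the slow end of each range;
    conservative=False plans against the fast end.
    """
    feasible = []
    for modality in MODALITIES:
        needed = modality.max_months if conservative else modality.min_months
        if needed <= months_to_deadline:
            feasible.append(modality.name)
    return feasible


print(feasible_modalities(24))                      # planning on the slow end
print(feasible_modalities(24, conservative=False))  # planning on the fast end
```

Comparing the conservative and optimistic lists shows which modalities are only feasible if everything goes right, which is a useful input to the hybrid pattern described above.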
The site selection funnel and the time-value of speed
A defensible site-selection program is a funnel of ordered gates, not a single weighted spreadsheet. The ordering matters as much as the criteria, because the point of a funnel is to fail candidates cheaply on the binding constraint before spending diligence dollars on the rest. In 2026 the binding constraint is firm-power-by-a-date, so it gates first. Run the funnel in the old order — score everything, then check power last — and you will spend months of geotech, fiber, and incentive diligence on sites that were never going to energize in time.
- Gate 1 — Speed-to-power (hard pass/fail). Is there a credible path to your required firm (or curtailable, for training) megawatts by your in-service date? This means a real read on the local interconnection queue, substation headroom, transmission proximity and voltage class, and any BYOP bridge option. Sites without a path die here, regardless of every other virtue.
- Gate 2 — Power cost & structure. Among sites that can energize, what is the all-in cost: nodal/LMP pricing, congestion exposure, capacity and demand charges, curtailment terms, and who pays for network upgrades? Power is 25–60% of TCO, so this is the dominant economic screen among survivors. → Chapter 3.3.
- Gate 3 — Latency (hard gate for inference; skip for training). Does the site sit inside the workload's latency budget to its users (~1,000–1,500 km practical reach for sub-50 ms)? For training this gate is a no-op; for inference it can eliminate every power-rich basin in the country. → Chapter 3.6.
- Gate 4 — Water & climate. Can you cool here without a water fight? In water-stressed regions this is a gate; the 2026 move is to design water out (closed-loop, near-zero-evaporation) so arid, power-rich sites become viable. → Chapter 3.7.
- Gate 5 — Land, geotech, flood, zoning. Enough contiguous, developable, appropriately-zoned acreage (GW campuses want 500–1,000+ acres, of which only ~30–40% is built on) on soil with adequate bearing capacity, outside the floodplain. Rarely the binding constraint, but a hard stop when it bites. → Chapter 3.8.
- Gate 6 — Fiber, fiscal, social license (scored tiebreakers). Carrier-diverse fiber, durability-discounted incentives, and a credible community-and-permitting path break ties among the survivors. → Chapter 3.9, Chapter 3.10, Chapter 3.11.
The funnel narrows fast precisely because Gate 1 is so restrictive in 2026 — which is the point. The full weighted scoring matrix, hard pass/fail templates, and the stage-gated desktop→field→binding diligence sequence are built out in Chapter 3.13; this chapter establishes the order, which is the part most programs get wrong.
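The ordering of the hard gates can be sketched as a short-circuiting pipeline. This is a minimal illustration under stated assumptions: the `Site` and `Requirement` fields, the thresholds, and the example sites are hypothetical, and only the pass/fail gates are modeled. Gates 2 and 6 are scored rather than pass/fail, so they are left to the scoring matrix.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Site:
    name: str
    firm_power_year: int       # earliest credible firm (or curtailable) MW date
    km_to_users: float
    water_stressed: bool
    closed_loop_cooling: bool  # water designed out of the cooling plant
    developable_acres: float


@dataclass(frozen=True)
class Requirement:
    in_service_year: int
    is_inference: bool
    max_km_to_users: float = 1500.0  # assumption: upper end of the reach in the text
    min_acres: float = 500.0         # assumption: low end of the GW-campus range


Gate = tuple[str, Callable[[Site, Requirement], bool]]

# Order matters: the binding constraint is checked first.
HARD_GATES: list[Gate] = [
    ("Gate 1: speed-to-power", lambda s, r: s.firm_power_year <= r.in_service_year),
    ("Gate 3: latency", lambda s, r: (not r.is_inference) or s.km_to_users <= r.max_km_to_users),
    ("Gate 4: water", lambda s, r: (not s.water_stressed) or s.closed_loop_cooling),
    ("Gate 5: land", lambda s, r: s.developable_acres >= r.min_acres),
]


def first_failed_gate(site: Site, req: Requirement) -> Optional[str]:
    """Return the first gate the site fails, or None if it survives them all."""
    for name, passes in HARD_GATES:
        if not passes(site, req):
            return name  # stop here: no diligence is spent on later gates
    return None


training = Requirement(in_service_year=2028, is_inference=False)
legacy_hub = Site("legacy hub (hypothetical)", 2033, 50.0, False, False, 600.0)
power_basin = Site("power-rich basin (hypothetical)", 2027, 2000.0, True, True, 800.0)
for site in (legacy_hub, power_basin):
    print(site.name, "->", first_failed_gate(site, training) or "survives hard gates")
```

Because the loop returns at the first failure, a site that cannot energize in time is eliminated before any latency, water, or land check runs, which mirrors the cost argument for ordering the funnel.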
Deep dive: why the funnel must fail on power first — a worked elimination
Consider a 500 MW campus decision evaluated two ways. Run the old funnel and you start with a weighted matrix: latency 25%, land 20%, fiber 20%, incentives 20%, power 15%. A Northern Virginia site scores beautifully (top-decile fiber, mature land market, rich incentives, low latency) and lands at the top of the matrix. You spend nine months on geotech, fiber engineering, entitlement, and incentive negotiation. Then the interconnection study comes back: firm capacity available in 2033. The entire diligence spend is sunk on a site that fails the only constraint that mattered, and your GPUs, already on order against a 2–3 year economic life, have nowhere to land.
Run the 2026 funnel and Gate 1 kills that site in week one: no firm path to 500 MW before 2030, eliminated. The funnel instead surfaces a West Texas or Midwest site in a less-constrained part of the grid, possibly with a behind-the-meter gas bridge to first-power and a curtailable interconnection for the durable load, energized in 2027. It scores worse on fiber and latency, but the workload is training, so latency is a no-op and the fiber is adequate. You have traded a perfect score on criteria that do not bind for a passing score on the one that does. The time-value math closes it: two extra years of a 500 MW campus operating is 1 GW-year, or roughly $10–12B of revenue at $10–12B/GW-yr, a number so large it swamps every other line item in the site decision. The funnel's job is to never let a non-binding criterion outrank the binding one.
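The worked elimination can be reproduced with a toy comparison. The weights are the old-funnel weights given above and the firm-power years (2033 and 2027) come from the example. The 0–10 criterion scores are invented for illustration only, and the function names are hypothetical. For simplicity the survivors are ranked with the same matrix; in practice the post-gate weights would differ.

```python
# Old-funnel weights from the worked example.
OLD_WEIGHTS = {"latency": 0.25, "land": 0.20, "fiber": 0.20, "incentives": 0.20, "power": 0.15}

# Assumption: illustrative 0-10 scores, not data from the source.
SITES = {
    "Northern Virginia": {
        "latency": 9, "land": 8, "fiber": 10, "incentives": 9, "power": 2,
        "firm_year": 2033,
    },
    "West Texas": {
        "latency": 4, "land": 9, "fiber": 5, "incentives": 6, "power": 9,
        "firm_year": 2027,
    },
}


def weighted_score(scores: dict, weights: dict = OLD_WEIGHTS) -> float:
    """Classic weighted-sum matrix: a weak criterion can be offset by strong ones."""
    return sum(weight * scores[criterion] for criterion, weight in weights.items())


def old_funnel_winner(sites: dict) -> str:
    """Score everything and pick the top site; power is just another column."""
    return max(sites, key=lambda name: weighted_score(sites[name]))


def new_funnel_winner(sites: dict, latest_firm_year: int):
    """Apply the power gate first, then rank only the survivors."""
    survivors = {n: s for n, s in sites.items() if s["firm_year"] <= latest_firm_year}
    if not survivors:
        return None
    return max(survivors, key=lambda name: weighted_score(survivors[name]))


print("old funnel picks:", old_funnel_winner(SITES))
print("new funnel picks:", new_funnel_winner(SITES, latest_firm_year=2029))
```

With these invented scores the weighted matrix favors the legacy hub despite its 2033 power date, while the gate-first version never scores it at all.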
Deep dive: the market-map redraw and what it means for the funnel
The reordered hierarchy is visibly redrawing the global market map, and the funnel is the mechanism. Because Gate 1 (speed-to-power) now dominates, capital is flowing away from the constrained legacy primaries — Northern Virginia (Dominion/PJM shortfall, zoning reform ending by-right development) and California — and toward markets that can actually energize: Texas/ERCOT, on a path to >40 GW by 2028 (~1/3 of US demand) on the back of a single-ISO regulator, a 765 kV backbone build, and SB6's large-load framework; US secondary markets (Columbus, Salt Lake City, San Antonio, Reno, Indianapolis, the Permian) that offer headroom the primaries have exhausted; the Nordics (firm renewables, <10 °C climate enabling ~8,000 free-cooling hours/year and PUE near 1.09, heat reuse into district heating); and the Gulf (sovereign capital and abundant gas, on a 1 GW → 3.3 GW 2025–2030 trajectory).
For the practitioner the lesson is that the funnel's Gate-1 read is geographically specific and time-sensitive: a market that was 'full' for power last year may have opened via a 765 kV line or a BYOP regime, and a market that looked open may have hit a moratorium (Ireland's EirGrid constraints, Singapore's pause). The reordered hierarchy does not just change which criterion ranks first — it changes which dots on the map survive the first gate, and that survivor set moves faster than a five-year siting study can track. The comparative cluster deep-dives — US (NoVA/PJM, ERCOT, secondary), EU (Ireland, Nordics, Iberia), Middle East, and APAC — and the non-US interconnection regimes that gate them live in Chapter 3.13 and Chapter 3.2; the sovereignty and export-control overlays that gate where compute can legally sit are in Chapter 3.12.
Reversible vs irreversible: where to spend the option premium
Site selection contains the most irreversible decisions in the entire lifecycle, so the discipline from Chapter 1.1 applies with full force: over-build or hedge the irreversible decisions now; defer the reversible ones and keep them cheap to change.
Irreversible (decide once, at siting): the site itself — you cannot move a slab; the interconnection capacity and voltage class — the queue slot is the single scarcest asset in the project and you do not get it back; the structural and water provisioning of the substrate — floor loading for ~3,000–5,000 lb wet racks, facility water headroom, and pipe-rack space for a density ramp you have not committed to; and the macro climate-and-water basin you have planted in. These are where you spend the option premium — provisioning headroom you may not use, because retrofitting it mid-life is prohibitively costly or impossible.
Reversible (defer, re-decide cheaply): the build modality (a powered shell or colo lease preserves an exit and a fit-out re-decision); the accelerator generation within a fixed power/cooling envelope; the workload mix ratio within an archetype; and the energy-supply tactic layered on top of a secured interconnection (grid-only vs grid-plus-bridge vs hybrid — re-decidable as the BYOP and PPA markets move). The strategic move is to convert irreversible decisions into reversible ones wherever the premium is cheap: a powered shell instead of a full self-build preserves IT-fit-out optionality, and reserving floor loading and water turns an irreversible density ceiling into a deferrable choice.