
Chapter 14.9

Hardware Refresh, Depreciation Strategy, Decommissioning & ITAD

Refresh is the moment the depreciation assumption you underwrote in Chapter 1.8 stops being an accounting choice and becomes a physical operation — and how you execute it (when you pull the part, where it cascades, how you sanitize it, who you sell or shred it through) decides whether the residual value the whole financing case rests on is real money or a write-down with extra steps.

GOODPUT · POWER-BOUND · DENSITY-RAMP

What you'll decide here

What triggers refresh for a given fleet slice: calendar (the book-life schedule), efficiency (perf-per-watt against the next generation), failure-rate (the wear-out knee), or density-ramp (a denser part that earns more per scarce megawatt). The trigger you pick decides whether you refresh too early and strand residual value or too late and strand the interconnection slot.
Whether a retired accelerator cascades (frontier-training → fine-tuning → online inference → batch/internal) or exits to ITAD. This single decision determines whether the 5–6-year book life is honest or fiction; the canonical economic argument lives in Chapter 1.8.
Your data-sanitization standard for GPU/HBM and NVMe — NIST SP 800-88 clear-vs-purge-vs-destroy — and therefore whether a part can be resold at all, because at frontier sites the weights and the residual value point in opposite directions.
Resale-vs-redeploy-vs-shred for each retired part, scored on the secondary-market price you can actually realize inside a roughly 45-day resale window (recovery decays after it), the chain-of-custody risk, and any contractual or security obligation to destroy.
How you keep capacity continuous through the swap — rolling, hall-by-hall migration that protects goodput and the power envelope, versus a forklift cut-over that strands revenue while the new generation lands.

Every earlier chapter in Part 14 keeps the asset alive; this one decides when to kill it, and what to do with the body. Refresh is where the depreciation debate from Chapter 1.8 stops being a line on a pro-forma and becomes a loading dock full of two-generations-old accelerators that are simultaneously a security liability, an environmental obligation, and — if you handle them right — a meaningful slice of recovered capital. The fork that runs through the whole chapter is the same one that runs through AI-infrastructure finance: is a retired GPU an asset that cascades to a lower-value workload and earns its way to the book-life, or is it a depreciating liability you need off your floor before it costs more to keep than it is worth? The honest answer is workload- and site-specific, and getting it wrong in either direction is expensive.

This is refresh as execution, deliberately distinct from refresh as economics. The canonical home for the 2–3-year-economic-vs-5–6-year-book-life argument, the residual-value evidence, and the training-to-inference cascade as a depreciation defense is Chapter 1.8 — we do not re-litigate it here. What this chapter owns is the operational consequence: the refresh-trigger decision, the decommissioning-and-sanitization workflow, the ITAD/resale channel and the circular economy it feeds, and the migration discipline that keeps the factory earning while you swap its engines. The decommissioning of the facility itself — generators, BESS, coolant, the slab — is a separate lifecycle stage in Chapter 14.10; here we retire IT, not concrete.

What actually triggers a refresh

There is no single refresh clock. There are four, they fire at different times, and the one you let dominate quietly sets your residual-value outcome. Most operators inherit a default ("refresh on the book schedule") without realizing it is a choice with downstream cost.

Calendar trigger. Refresh when the depreciation schedule says the asset is fully written down. Simple, auditable, and the implicit default for anyone who let accounting set the cadence. Its failure mode is that the silicon does not obey the ledger: a part can be economically dead two years before its book life ends, or perfectly productive two years after.

Efficiency trigger. Refresh when the next generation's perf-per-watt makes the incumbent uneconomic to run against the power it consumes. This is the trigger that matters most in a power-bound world: when megawatts are the scarce input, the question is not "does this GPU still work" but "is this the highest-value use of this megawatt." A fault-free Hopper-class rack drawing ~40 kW is still a refresh candidate if a Blackwell rack at ~130 kW, or a Rubin-class rack beyond that, can out-earn it per kW, because the opportunity cost is measured in stranded interconnection capacity you cannot get back. → density trajectory in Chapter 14.7.

Failure-rate trigger. Refresh when the fleet slice crosses the wear-out knee of the bathtub curve and the annualized failure rate plus RMA logistics cost more in lost goodput than redeployment is worth. The spares and RMA machinery in Chapter 14.6 is what tells you where that knee is; refresh is the exit when sparing a generation stops paying.

Density-ramp trigger. Refresh to capture the revenue-per-MW step of a new generation — the same density-ramp logic that Chapter 1.1 treats as a scoping decision, now executed mid-life. This only works if the irreversible substrate (floor loading, water, electrical headroom) was reserved for it; a hall scoped for 40 kW air cannot absorb a 130 kW liquid generation, and the refresh becomes a rebuild.

The refresh-timing fork: pull early and strand residual, or hold late and strand the megawatt

The central refresh decision is a two-sided error. Pull too early — chasing every generation on an 18–24-month architecture cadence — and you crystallize a residual loss on a part that still had cascade life in it, while paying the full capex of the successor before the market has deflated it. Hold too late and you burn a scarce megawatt running an inefficient part, you miss the resale window while the secondary price decays, and you accumulate a wear-out tail of failures that eats goodput. The trigger you choose decides which error you make. The discipline is to refresh on the efficiency/density trigger for the power-bound frontier slice and on the failure/calendar trigger for the cascaded tail — not to apply one clock to the whole fleet. This is the operational expression of the depreciation-life choice in Chapter 1.8: if you underwrote the short economic life, you are a frequent refresher and must have an industrialized ITAD channel; if you underwrote the long book life, you are betting the cascade holds, and refresh execution is that bet being placed.
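The "different clock per slice" discipline can be sketched as a small rule set. This is a minimal illustration, not a method taken from this guide: the `FleetSlice` fields, the `min_uplift` threshold, and every number in the examples are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FleetSlice:
    name: str
    power_bound: bool                 # is the megawatt the binding constraint here?
    revenue_per_mw: float             # incumbent earnings per MW (hypothetical units)
    successor_revenue_per_mw: float   # what a denser successor would earn on that MW
    substrate_ready: bool             # floor loading, water, electrical headroom reserved
    afr: float                        # observed annualized failure rate, 0..1
    afr_knee: float                   # wear-out knee taken from spares/RMA data
    age_years: float
    book_life_years: float


def fired_triggers(s: FleetSlice, min_uplift: float = 1.5) -> list[str]:
    """Return the refresh triggers that have fired for one fleet slice.

    min_uplift is an assumed policy threshold, not a figure from the guide.
    """
    fired = []
    if s.power_bound:
        # Frontier slice: the efficiency/density clock dominates.
        uplift = s.successor_revenue_per_mw / s.revenue_per_mw
        if uplift >= min_uplift:
            # Without reserved substrate the "refresh" is really a rebuild.
            fired.append("efficiency/density" if s.substrate_ready else "rebuild-required")
    else:
        # Cascaded tail: the failure-rate and calendar clocks dominate.
        if s.afr >= s.afr_knee:
            fired.append("failure-rate")
        if s.age_years >= s.book_life_years:
            fired.append("calendar")
    return fired


# Hypothetical slices for illustration only.
frontier = FleetSlice("frontier-train", True, 1.0, 2.2, True, 0.03, 0.08, 2.0, 5.5)
tail = FleetSlice("batch-tail", False, 0.4, 2.2, False, 0.09, 0.08, 4.0, 5.5)
print(fired_triggers(frontier))  # ['efficiency/density']
print(fired_triggers(tail))      # ['failure-rate']
```

The design choice worth noting is that the power-bound branch never consults age or failure rate: a fault-free, young part can still fire the trigger, which is the point of the efficiency clock.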

The cascade as refresh execution

The training-to-inference cascade is the mechanism that, if it holds, stretches economic life toward book life — and refresh is where the cascade is either executed or revealed to be a story. A part retired from frontier pre-training does not have to leave the building. It can step down: from synchronous pre-training, to post-training/RL rollouts and fine-tuning, to latency-relaxed online inference, to batch and internal workloads, earning revenue at each rung. Each step down relaxes the requirement the part can no longer meet (top-end perf-per-watt, the largest scale-up domain) and exploits what it still does well (raw memory bandwidth, adequate FLOPS for a smaller model).

The catch is that the cascade is finite and power-gated. There is only so much lower-tier demand, and in a power-bound site every cascaded part is occupying a megawatt that a newer part could use more profitably. So the real refresh decision per part is a three-way: redeploy it down the cascade (if you have the lower-tier demand and the megawatt is not the binding constraint), resell it into the secondary market (if the residual price beats the cascade value net of the power it would consume), or retire/shred it (if security obligations forbid resale or the part is genuinely uneconomic to run). The cascade defense of the long book life is only honest if a real, liquid resale market exists to absorb the parts that do not cascade internally — which is exactly the secondary-market-depth question flagged as load-bearing in Chapter 1.8.
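The three-way decision above can be written as a comparison of net cascade value against net resale value, with the destroy obligation checked first. This is a hedged sketch: the `RetiredPart` fields and all figures are hypothetical, and a real scoring model would also carry time and risk terms.

```python
from dataclasses import dataclass


@dataclass
class RetiredPart:
    serial: str
    must_destroy: bool        # contractual, security, or sovereign obligation
    attestable: bool          # can sanitization be proven to a buyer (Clear/Purge + verify)?
    lower_tier_demand: bool   # is there a cascade rung that wants this part?
    cascade_value: float      # value of running it on that rung over its remaining life
    megawatt_cost: float      # opportunity cost of the power it would occupy
    resale_price: float       # price realizable today, net of ITAD fees


def disposition(p: RetiredPart) -> str:
    """Pick redeploy, resell, or destroy for one retired part."""
    if p.must_destroy:
        return "destroy"
    # Cascade value only counts if a lower tier actually wants the part.
    net_cascade = p.cascade_value - p.megawatt_cost if p.lower_tier_demand else 0.0
    # Resale value only counts if sanitization can be attested.
    net_resale = p.resale_price if p.attestable else 0.0
    if net_cascade <= 0 and net_resale <= 0:
        return "destroy"  # uneconomic to run and unsellable
    return "redeploy" if net_cascade >= net_resale else "resell"


# Hypothetical parts: same hardware, different power opportunity cost.
cheap_power = RetiredPart("SN-A", False, True, True, 12000.0, 3000.0, 7000.0)
scarce_power = RetiredPart("SN-B", False, True, True, 12000.0, 9000.0, 7000.0)
print(disposition(cheap_power))   # redeploy
print(disposition(scarce_power))  # resell
```

The two examples differ only in `megawatt_cost`, which shows how a power-bound site pushes an otherwise cascadable part toward resale.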

The decommissioning workflow

Decommissioning an IT asset at scale is a chain-of-custody discipline, not a teardown. The workflow is roughly invariant across operators, and skipping a step is how a retired GPU becomes a data-breach headline or a stranded-capital line. The sequence: de-provision (drain workloads, remove from the scheduler and fabric, update the CMDB/DCIM asset record); power down and physically extract (de-cable, drain liquid loops on DLC racks, palletize); sanitize all data-bearing media to the chosen NIST 800-88 level with a certificate of sanitization per asset; account and tag every serial through a chain-of-custody manifest; route each part to redeploy, resale, or destruction; and document the disposition for audit, environmental compliance, and — increasingly — embodied-carbon and Scope 3 reporting. The asset-record discipline is the spine of all of it: an untracked serial is an unsanitized serial and an unrecoverable dollar.
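One way to make "skipping a step" impossible is to model the sequence above as an ordered state machine in the asset record, where each transition requires evidence. The sketch below is illustrative only: the `Step` names, the `AssetRecord` class, and the evidence strings are hypothetical and do not correspond to any particular CMDB or DCIM product.

```python
from enum import IntEnum


class Step(IntEnum):
    IN_SERVICE = 0
    DEPROVISIONED = 1   # drained, removed from scheduler and fabric, record updated
    EXTRACTED = 2       # powered down, de-cabled, loops drained, palletized
    SANITIZED = 3       # sanitized to the chosen NIST 800-88 level, certificate issued
    MANIFESTED = 4      # serial accounted for on the chain-of-custody manifest
    ROUTED = 5          # sent to redeploy, resale, or destruction
    DOCUMENTED = 6      # disposition recorded for audit and reporting


class AssetRecord:
    def __init__(self, serial: str):
        self.serial = serial
        self.step = Step.IN_SERVICE
        self.custody_log = []  # (step name, evidence reference) in order

    def advance(self, to: Step, evidence: str) -> None:
        """Move to the next step; refuse skips and refuse missing evidence."""
        if to != self.step + 1:
            raise ValueError(f"{self.serial}: cannot go {self.step.name} -> {to.name}")
        if not evidence:
            raise ValueError(f"{self.serial}: {to.name} needs an evidence reference")
        self.step = to
        self.custody_log.append((to.name, evidence))


record = AssetRecord("SN-0001")
record.advance(Step.DEPROVISIONED, "change-ticket-123")
record.advance(Step.EXTRACTED, "pallet-17")
try:
    record.advance(Step.ROUTED, "resale-lot-4")  # skips sanitize and manifest
except ValueError as err:
    print(err)
```

Because the record refuses to reach `ROUTED` without passing `SANITIZED` and `MANIFESTED`, an untracked or unsanitized serial cannot be routed to a buyer by accident.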

The weights-vs-residual collision: sanitization is where security and capital recovery fight

At a frontier site the two most valuable things in a retired accelerator point in opposite directions. The residual value wants the part intact, attested, and sellable. The model weights and tenant data that may have transited HBM and local NVMe want the part unrecoverable. NIST SP 800-88 frames the choice as Clear (logical overwrite — fast, preserves resale, defeats casual recovery), Purge (cryptographic erase or firmware sanitize — defeats laboratory recovery, usually preserves resale), or Destroy (shred/incinerate — defeats everything and forecloses resale). GPU HBM is the hard case: it is volatile but a GPU reassigned across tenants must be scrubbed (memory-scrub on reassignment) to prevent cross-tenant leakage, and there is no universally accepted, attestable Purge for on-package HBM the way there is a cryptographic-erase command for an SED NVMe drive. The consequence is a real fork: frontier operators frequently choose Destroy for the highest-sensitivity parts and write off the residual entirely — which is precisely why the secondary market sees so few hyperscaler-retired frontier GPUs, and why the resale-residual evidence underwriting the long depreciation life is thinner than the headline retention numbers suggest. → weight protection in Chapter 11.8; confidential computing and HBM remanence in Chapter 11.5.

NIST SP 800-88 sanitization levels → what survives, and what it costs you

Level	Method (examples)	Defeats	Resale impact	Best fit
Clear	Logical overwrite / block erase; NVMe format	Casual / standard recovery tools	Preserved — part stays sellable	Low-sensitivity media, internal redeploy
Purge	Cryptographic erase (SED), firmware sanitize, degauss (magnetic only)	Laboratory / forensic recovery	Usually preserved if attestable	SED NVMe; the default for resellable data drives
Destroy	Shred, disintegrate, incinerate, melt	All recovery, all techniques	Forecloses resale — residual written off	Frontier weights, classified/sovereign, un-attestable HBM
Verify (overlay)	Sampling / full read-back + certificate per asset	Process failure and silent non-erasure	Enables sale by proving the sanitization	Mandatory overlay on Clear/Purge for any resale

Mapping of sanitization choice to recovery resistance, resale impact, and cost. GPU/HBM is the constrained case: no universally attestable Purge for on-package HBM, so high-sensitivity parts often go to Destroy. Synthesis of NIST SP 800-88 Rev.1 and ITAD practice, 2025-2026.
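The table can be read as a lookup from media type, sensitivity, and intended disposition to a sanitization level plus a verify flag. The function below is one possible encoding of that reading, under the assumption that a site adopts exactly the rules in the table; the category strings are hypothetical labels, and a real policy would be set by the security team.

```python
def sanitization_plan(media: str, sensitivity: str, disposition: str) -> tuple[str, bool]:
    """Return (level, verify_required) for one data-bearing component.

    media: "sed_nvme", "nvme", or "gpu_hbm"
    sensitivity: "low" or "high"
    disposition: "redeploy", "resell", or "destroy"
    """
    if media not in ("sed_nvme", "nvme", "gpu_hbm"):
        raise ValueError(f"unknown media: {media}")
    if sensitivity not in ("low", "high"):
        raise ValueError(f"unknown sensitivity: {sensitivity}")
    if disposition not in ("redeploy", "resell", "destroy"):
        raise ValueError(f"unknown disposition: {disposition}")

    if disposition == "destroy":
        return ("destroy", False)
    if media == "gpu_hbm":
        if sensitivity == "high" and disposition == "resell":
            # No universally attestable Purge for on-package HBM.
            return ("destroy", False)
        level = "clear"  # memory scrub; the part stays usable
    elif sensitivity == "low" and disposition == "redeploy":
        level = "clear"
    else:
        # Resellable or high-sensitivity drives: cryptographic erase or firmware sanitize.
        level = "purge"
    # Verify is the mandatory overlay on Clear/Purge for any resale.
    return (level, disposition == "resell")


print(sanitization_plan("sed_nvme", "high", "resell"))  # ('purge', True)
print(sanitization_plan("gpu_hbm", "high", "resell"))   # ('destroy', False)
```

Note that a request to resell high-sensitivity HBM comes back as `destroy`: the function overrides the requested disposition, which is the weights-vs-residual collision expressed as code.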

ITAD, resale and the circular economy

IT Asset Disposition is the industrialized channel that turns a retired part into either recovered capital or certified e-waste. The 2026 reality is that the compressed refresh cadence — architecture generations landing every 18–36 months against the legacy 5–7-year enterprise cycle — is flooding the secondary market with a volume of accelerators it has never had to absorb, which is both an opportunity (cheap inference capacity for smaller buyers) and a risk (residual prices fall as supply rises). The lifecycle-economics lever that practitioners actually pull is the resale window: roughly 35–50% capital recovery if a part is moved within ~45 days of retirement, decaying steadily after that as the next generation deflates the price. A retired part is a melting ice cube; ITAD velocity is a first-class refresh KPI, not a back-office afterthought.

Vendor selection is a compliance decision with teeth. The certifications that matter — R2v3 and e-Stewards for responsible recycling and chain-of-custody, ISO 14001 for environmental management, and increasingly NAID AAA for data destruction — are what let you prove, to an auditor or a regulator, that a sanitized part was actually sanitized and a destroyed part was actually destroyed and not quietly resold into a market it should never have entered. The downside of skipping certified ITAD is not abstract: it is a tracked serial surfacing on the secondary market with recoverable tenant data on it, or an e-waste-dumping liability under tightening jurisdictional rules. The circular-economy framing — refurbish, reuse, recover materials, certified-destroy only as last resort — is also where refresh meets the embodied-carbon argument: a part kept in service one more cascade rung is embodied carbon you do not have to re-manufacture. → embodied carbon and circularity in Chapter 15.6.

Disposition fork: redeploy vs resell vs destroy — the per-part decision

Disposition	Capital outcome	Time sensitivity	Security posture	When it wins
Redeploy (cascade)	Avoided capex on a lower tier	Low — value is internal	Stays inside your trust boundary; scrub on reassignment	You have lower-tier demand and the megawatt is not the binding constraint
Resell (certified ITAD)	~35–50% recovery if sold within ~45 days	High — price decays per generation	Requires attestable Clear/Purge + verify	Residual beats cascade value net of the power the part would consume
Destroy (shred)	Residual written off; e-waste recovery only	Low — but do it promptly to close liability	Defeats all recovery; the safe default for weights	Security/sovereign obligation, or un-attestable HBM, forbids resale
Hold (do nothing)	Negative — opex + decaying residual	Worst — the melting-ice-cube case	Growing liability the longer it sits unsanitized	Almost never; it is the default that costs the most

Per-part disposition scorecard at refresh. 'Recovery' is fraction of residual realized as cash; redeploy realizes value as avoided capex, not cash. Resale recovery and window are 2026 ITAD-practice ranges; figures CONTESTED and supply-sensitive. Synthesis of ITAD-vendor and secondary-market data, 2025-2026.

Figure	What it measures	Source (year)
5–6 yr vs 2–3 yr	GPU accounting book life vs frontier-economic life; the trigger this chapter executes (CONTESTED)	CNBC / Stanley Laman / SemiAnalysis (2026)
18–36 mo	Compressed refresh / architecture cadence vs legacy 5–7 yr enterprise cycle	Uptime / domain synthesis (2025)
~35–50%	Capital recovery on resale within a ~45-day window; decays per generation thereafter	ITAD-vendor / secondary-market synthesis (2026)
~$7–22k	Used H100 secondary price by age (<1 yr ~$18–25k; 2+ yr ~$7–12k); ~85% off 2023 peak	Compute Exchange / Hashrate Index (2026)
~60–83%	H100 value retained at 18 months, but rental rates fell ~64–75% from peak (CONTESTED)	Hashrate Index / CNBC synthesis (2025)
Meta 4→5.5 / AWS 6→5 yr	Hyperscalers moved depreciation life in OPPOSITE directions; the number is unsettled	HBS case / company filings (2025)
~$176B	Estimated understated AI D&A 2026–2028 if economic life is right (CONTESTED)	Burry / secondary analyses; filings (2026)
NIST SP 800-88	The governing media-sanitization standard: Clear / Purge / Destroy + Verify	NIST SP 800-88 Rev.1 (2014, current)

Deep dive: why HBM has no clean Purge, and what that does to residual value

The sanitization story is simple for the NVMe drives in an AI node and genuinely hard for the accelerator itself, and the hard part is exactly where the data of interest lives. A self-encrypting NVMe SSD supports cryptographic erase: discard the media-encryption key and every block is instantly, verifiably unrecoverable — a NIST 800-88 Purge in milliseconds, attestable, resale-preserving. GPU HBM has no equivalent standardized, attestable command. It is volatile, so it does not retain data across a power cycle in the way a disk does; but in a live multi-tenant fleet the threat is not cold-boot remanence years later, it is a GPU being reassigned from one tenant to the next with the previous tenant's weights or activations still resident in HBM. The mitigation is a memory scrub on reassignment (and, in confidential-computing modes, key derivation that ties memory contents to a session) — an operational control during life, not a decommissioning Purge you can hand an auditor a certificate for.

The consequence at end-of-life is a forced choice. To resell a frontier accelerator you must convince a buyer — and your own security team — that nothing recoverable remains, and for on-package HBM there is no clean, standardized way to prove that to the Purge bar. So the high-sensitivity path collapses to Destroy: shred the part, write off the residual, and accept that the secondary market will never see it. This is why so few hyperscaler-retired frontier GPUs appear in the used market, and why the residual-retention figures that underwrite the long depreciation life are skewed by the parts that do sell (lower-sensitivity, off-frontier inventory) rather than the frontier parts that get shredded. The cascade-and-resale defense of the 5–6-year book life implicitly assumes a deep resale market; the HBM-sanitization problem is one of the structural reasons that market is thinner than the headline numbers imply. → Chapter 11.5 (GPU confidential computing, HBM remanence); Chapter 11.8 (weight protection); residual-value economics in Chapter 1.8.
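The skew described above is a survivorship effect, and a one-line calculation makes it concrete. The sketch assumes destroyed parts realize zero residual and that all parts had the same original cost; both simplifications, and the example inputs, are hypothetical.

```python
def fleet_realized_residual(retention_of_sold: float, fraction_destroyed: float) -> float:
    """Residual realized across the whole retired fleet, as a fraction of original cost.

    retention_of_sold: value retained by the parts that do reach the market (0..1)
    fraction_destroyed: share of retired parts shredded for sanitization reasons (0..1)
    Assumes destroyed parts recover nothing and all parts cost the same when new.
    """
    if not (0.0 <= retention_of_sold <= 1.0 and 0.0 <= fraction_destroyed <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return retention_of_sold * (1.0 - fraction_destroyed)


# Hypothetical inputs: sold parts retain 70%, but 40% of retired parts are shredded.
print(round(fleet_realized_residual(0.70, 0.40), 2))  # 0.42
```

The headline retention number describes only the parts that sold; the fleet-level number is lower by exactly the share that never reached the market.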

Migration and capacity continuity through the swap

On a live campus, refresh is a migration that must protect goodput and the power envelope while it happens, not a single maintenance window. The cost the swap strands is revenue: every megawatt taken offline to receive the new generation is a megawatt not earning against a depreciation clock that is still running. The dominant pattern is therefore rolling, hall-by-hall (or row-by-row) migration: drain and de-provision one fault domain, swap it, re-commission it, and only then move to the next — so the cluster never loses more than one domain's worth of capacity, and the new generation is validated against the live workload before the old generation is fully retired. The alternative — a forklift cut-over of a whole hall at once — is faster on paper but strands the entire hall's revenue during the transition and concentrates commissioning risk into a single window, which is exactly the high-risk density-step-up that re-commissioning discipline exists to de-risk.
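The trade-off between the two patterns can be compared on three numbers: peak capacity offline, total megawatt-days offline, and calendar duration. The sketch below assumes sequential, equal-length domain swaps and uses hypothetical hall sizes and durations; it ignores commissioning risk, which the prose treats as the larger concern.

```python
def rolling_migration(domains_mw: list[float], days_per_domain: float) -> dict:
    """Swap one fault domain at a time; only one domain is ever offline."""
    return {
        "peak_offline_mw": max(domains_mw),
        "mw_days_offline": sum(mw * days_per_domain for mw in domains_mw),
        "calendar_days": len(domains_mw) * days_per_domain,
    }


def forklift_cutover(domains_mw: list[float], days_total: float) -> dict:
    """Take the whole hall offline at once for the duration of the swap."""
    total_mw = sum(domains_mw)
    return {
        "peak_offline_mw": total_mw,
        "mw_days_offline": total_mw * days_total,
        "calendar_days": days_total,
    }


# Hypothetical hall: four 5 MW fault domains.
hall = [5.0, 5.0, 5.0, 5.0]
print(rolling_migration(hall, days_per_domain=10))
print(forklift_cutover(hall, days_total=25))
```

With these assumed inputs the forklift finishes sooner on the calendar (25 days against 40) but strands more megawatt-days (500 against 200) and takes four times the capacity offline at its peak.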

The power envelope is the constraint that makes this delicate. A denser successor generation does not just need more total megawatts; during the overlap window you may be running both generations, and the facility power and cooling plant must absorb the transient. This is why the migration plan and the capacity/power plan are the same plan: the order in which you swap halls is dictated by where you have stranded power headroom to land the denser part, and a refresh executed without that headroom reserved becomes a rebuild of the power chain, not a swap of the silicon. The re-commissioning required to certify the new density step is the highest-risk event on a live campus, and it belongs to the continuous/re-commissioning discipline rather than to this chapter. → capacity and power management in Chapter 14.7; the density-step-up as the top re-commissioning risk is treated in the continuous/re-commissioning chapter of Part 14.
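A simple way to see why the swap order follows the headroom is to plan it greedily: land the denser generation first where per-hall feed headroom is largest, and stop where either the hall feed or the facility envelope would be exceeded. This is a sketch under stated assumptions: the hall dictionaries, field names, and megawatt figures are hypothetical, and cooling limits are left out.

```python
def plan_swaps(halls: list[dict], facility_cap_mw: float) -> tuple[list[str], list[str]]:
    """Return (swap_order, blocked) for a denser successor generation.

    Each hall is a dict with keys: name, old_mw, new_mw, feed_cap_mw.
    A hall is swappable if its own feed can carry the new draw and the
    facility total stays within the envelope once the new part lands.
    """
    draw = sum(h["old_mw"] for h in halls)  # facility draw before any swap
    order, blocked = [], []
    # Largest stranded headroom first.
    for h in sorted(halls, key=lambda h: h["feed_cap_mw"] - h["new_mw"], reverse=True):
        draw_after = draw - h["old_mw"] + h["new_mw"]
        if h["feed_cap_mw"] >= h["new_mw"] and draw_after <= facility_cap_mw:
            order.append(h["name"])
            draw = draw_after
        else:
            blocked.append(h["name"])  # needs power-chain work, not just a swap
    return order, blocked


# Hypothetical campus: three halls, 26 MW facility envelope.
halls = [
    {"name": "A", "old_mw": 4.0, "new_mw": 10.0, "feed_cap_mw": 12.0},
    {"name": "B", "old_mw": 4.0, "new_mw": 10.0, "feed_cap_mw": 10.0},
    {"name": "C", "old_mw": 4.0, "new_mw": 10.0, "feed_cap_mw": 6.0},
]
print(plan_swaps(halls, facility_cap_mw=26.0))  # (['A', 'B'], ['C'])
```

Hall C is the case the prose warns about: without reserved headroom on its feed, refreshing it is a rebuild of the power chain rather than a swap of the silicon.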

Refresh is the bet being placed, not the bet being made

The depreciation life you chose in Chapter 1.8 is a forecast; refresh execution is where that forecast is settled in cash. If you underwrote the long book life, every refresh is you staking that the cascade absorbs the retired part and that a liquid resale market clears the remainder — and the first time you shred a hall of frontier GPUs for sanitization reasons and write off the residual, you have learned that the bet had a hidden cost the pro-forma did not price. If you underwrote the short economic life, you are a frequent refresher who must have an industrialized, certified ITAD channel and a 45-day resale discipline, because your return depends on actually realizing the residual you assumed. Either way, the refresh organization is not a back-office function — it is the team that converts a depreciation assumption into a number, and the gap between the assumed residual and the realized residual is where stranded-asset risk lives. → the firm-level objective function in Chapter 1.8; the sector-macro view in Chapter 16.4.

Deep dive: the 45-day resale window as an operational deadline, not a market observation

The ~35–50%-recovery-within-45-days figure is usually quoted as a market fact; it is more useful read as an operational deadline that the decommissioning organization either hits or misses. The residual decays for two compounding reasons. First, the secondary GPU price falls as each new generation lands and as the refresh wave floods supply — the part is worth less every week simply because the market is. Second, a part sitting on a loading dock un-sanitized and un-manifested is accruing opex (floor space, security, insurance) and accruing liability (every day it is not sanitized is a day its data is exposed), so its net realizable value falls faster than the raw market price.
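The two compounding effects can be put into one expression: a market price that decays over time, minus holding costs that accrue per day on the dock. The sketch assumes a constant weekly percentage decline and a flat daily holding cost; the functional form and every parameter value are hypothetical, chosen only to show the shape of the curve.

```python
def market_price(days: float, price_at_retirement: float, weekly_decay: float) -> float:
    """Secondary-market price after `days`, assuming a constant weekly decline."""
    return price_at_retirement * (1.0 - weekly_decay) ** (days / 7.0)


def net_realizable_value(days: float, price_at_retirement: float,
                         weekly_decay: float, daily_holding_cost: float) -> float:
    """Market price minus accrued holding cost (floor space, security, insurance)."""
    return market_price(days, price_at_retirement, weekly_decay) - daily_holding_cost * days


# Hypothetical part: 10,000 at retirement, 2% weekly price decline, 15 per day to hold.
for day in (0, 45, 90, 180):
    price = market_price(day, 10_000.0, 0.02)
    nrv = net_realizable_value(day, 10_000.0, 0.02, 15.0)
    print(f"day {day:>3}: market {price:8.0f}  net realizable {nrv:8.0f}")
```

The gap between the two columns widens linearly with time, which is the sense in which net realizable value falls faster than the raw market price.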

The practical consequence is that resale velocity is a process-design problem, and the bottleneck is almost always sanitization and chain-of-custody, not finding a buyer. The operators who realize the residual they underwrote are the ones who have pre-arranged certified-ITAD capacity, pre-negotiated resale channels, and a sanitization pipeline that can clear a hall's worth of parts inside the window — so that the part is manifested, sanitized-and-verified, and on a buyer's dock before the price has moved much. The operators who miss it are the ones who treat decommissioning as something that happens after the exciting work of installing the new generation, by which point most of the residual has decayed away. The lesson generalizes: in a fast-deflating asset class, disposition velocity is a unit-economics lever on par with utilization, and it is one of the few that an operations team controls outright. → spares/RMA logistics that feed the same reverse pipeline in Chapter 14.6; the residual-shock downside case in Chapter 1.8.

The economic argument this chapter executes — the 2–3-year-economic-vs-5–6-year-book-life debate, the residual-value evidence, the training-to-inference cascade as a depreciation defense, and the residual-shock and secondary-market-depth downside cases — is the canonical content of Chapter 1.8, and the density-ramp that refresh tries to capture mid-life is scoped in Chapter 1.1. The spares and RMA logistics that feed the same reverse-supply pipeline, and that tell you where the failure-rate refresh trigger fires, are in Chapter 14.6; the capacity and power management that the migration plan must respect is Chapter 14.7. The sanitization collision with security is treated from the security side in Chapter 11.5 (HBM remanence, confidential computing) and Chapter 11.8 (weight protection). The circular-economy and embodied-carbon framing of keeping a part in service one more rung lives in Chapter 15.6. Decommissioning the facility itself — generators, BESS, coolant, slab, restoration obligations — is the separate lifecycle stage in Chapter 14.10. The sector-macro altitude on whether the whole fleet is over- or under-depreciated is Chapter 16.4.