Chapter 14.9
Hardware Refresh, Depreciation Strategy, Decommissioning & ITAD
Refresh is the moment the depreciation assumption you underwrote in Chapter 1.8 stops being an accounting choice and becomes a physical operation — and how you execute it (when you pull the part, where it cascades, how you sanitize it, who you sell or shred it through) decides whether the residual value the whole financing case rests on is real money or a write-down with extra steps.
What you'll decide here
- What triggers refresh for a given fleet slice: calendar (the book-life schedule), efficiency (perf-per-watt against the next generation), failure-rate (the wear-out knee), or density-ramp (a denser part that earns more per scarce megawatt). The trigger you pick decides whether you refresh too early and strand residual value or too late and strand the interconnection slot.
- Whether a retired accelerator cascades (frontier-training → fine-tuning → online inference → batch/internal) or exits to ITAD. This single decision determines whether the 5–6-year book life is honest or fiction; the canonical economic argument lives in Chapter 1.8.
- Your data-sanitization standard for GPU/HBM and NVMe — NIST SP 800-88 clear-vs-purge-vs-destroy — and therefore whether a part can be resold at all, because at frontier sites the weights and the residual value point in opposite directions.
- Resale-vs-redeploy-vs-shred for each retired part, scored against the secondary-market price you can actually realize inside a resale window that decays after roughly 45 days, plus the chain-of-custody risk and any contractual or security obligation to destroy.
- How you keep capacity continuous through the swap — rolling, hall-by-hall migration that protects goodput and the power envelope, versus a forklift cut-over that strands revenue while the new generation lands.
Every earlier chapter in Part 14 keeps the asset alive; this one decides when to kill it, and what to do with the body. Refresh is where the depreciation debate from Chapter 1.8 stops being a line on a pro-forma and becomes a loading dock full of two-generations-old accelerators that are simultaneously a security liability, an environmental obligation, and (if you handle them right) a meaningful slice of recovered capital. The fork that runs through the whole chapter is the same one that runs through AI-infrastructure finance: is a retired GPU an asset that cascades to a lower-value workload and earns its way to the end of its book life, or is it a depreciating liability you need off your floor before it costs more to keep than it is worth? The honest answer is workload- and site-specific, and getting it wrong in either direction is expensive.
This is refresh as execution, deliberately distinct from refresh as economics. The canonical home for the 2–3-year-economic-vs-5–6-year-book-life argument, the residual-value evidence, and the training-to-inference cascade as a depreciation defense is Chapter 1.8 — we do not re-litigate it here. What this chapter owns is the operational consequence: the refresh-trigger decision, the decommissioning-and-sanitization workflow, the ITAD/resale channel and the circular economy it feeds, and the migration discipline that keeps the factory earning while you swap its engines. The decommissioning of the facility itself — generators, BESS, coolant, the slab — is a separate lifecycle stage in Chapter 14.10; here we retire IT, not concrete.
What actually triggers a refresh
There is no single refresh clock. There are four, they fire at different times, and the one you let dominate quietly sets your residual-value outcome. Most operators inherit a default ("refresh on the book schedule") without realizing it is a choice with downstream cost.
Calendar trigger. Refresh when the depreciation schedule says the asset is fully written down. Simple, auditable, and the implicit default for anyone who let accounting set the cadence. Its failure mode is that the silicon does not obey the ledger: a part can be economically dead two years before its book life ends, or perfectly productive two years after.
Efficiency trigger. Refresh when the next generation's perf-per-watt makes the incumbent uneconomic to run against the power it consumes. This is the trigger that matters most in a power-bound world: when megawatts are the scarce input, the question is not "does this GPU still work" but "is this the highest-value use of this megawatt." A fault-free Hopper-class rack drawing ~40 kW is still a refresh candidate if a Blackwell rack at ~130 kW, or a Rubin-class rack beyond that, earns more per kW, because the opportunity cost is measured in stranded interconnection capacity you cannot get back. → density trajectory in Chapter 14.7.
Failure-rate trigger. Refresh when the fleet slice crosses the wear-out knee of the bathtub curve, where the annualized failure rate plus RMA logistics cost more in lost goodput than keeping the slice deployed is worth. The spares and RMA machinery in Chapter 14.6 tells you where that knee is; refresh is the exit when sparing a generation stops paying.
Density-ramp trigger. Refresh to capture the revenue-per-MW step of a new generation — the same density-ramp logic that Chapter 1.1 treats as a scoping decision, now executed mid-life. This only works if the irreversible substrate (floor loading, water, electrical headroom) was reserved for it; a hall scoped for 40 kW air cannot absorb a 130 kW liquid generation, and the refresh becomes a rebuild.
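The four triggers can be evaluated side by side for a single fleet slice. The sketch below is a minimal illustration, not a sizing tool: the `FleetSlice` fields, the `min_efficiency_gain` threshold, and every number in the usage example are hypothetical placeholders you would replace with your own fleet data.

```python
from dataclasses import dataclass


@dataclass
class FleetSlice:
    """Inputs for one fleet slice. All fields are hypothetical, not chapter data."""
    age_years: float
    book_life_years: float
    perf_per_watt_gain: float    # successor perf-per-watt / incumbent perf-per-watt
    annual_failure_cost: float   # lost goodput plus RMA logistics, per year
    annual_service_value: float  # what keeping the slice deployed earns, per year
    revenue_per_mw_gain: float   # successor revenue per MW / incumbent revenue per MW
    hall_max_rack_kw: float      # rack density the hall substrate was scoped for
    successor_rack_kw: float     # rack density the next generation needs


def fired_triggers(s: FleetSlice, power_bound: bool,
                   min_efficiency_gain: float = 1.5) -> list:
    """Return the names of the refresh triggers that fire for this slice."""
    fired = []
    if s.age_years >= s.book_life_years:
        fired.append("calendar")
    # Efficiency only dominates when megawatts are the scarce input.
    if power_bound and s.perf_per_watt_gain >= min_efficiency_gain:
        fired.append("efficiency")
    if s.annual_failure_cost > s.annual_service_value:
        fired.append("failure-rate")
    # Density-ramp needs both a revenue step and a substrate that can host it.
    if s.revenue_per_mw_gain > 1.0 and s.successor_rack_kw <= s.hall_max_rack_kw:
        fired.append("density-ramp")
    return fired


# Illustrative slice: a mid-life generation in a hall scoped for 40 kW racks.
example_slice = FleetSlice(
    age_years=3.0, book_life_years=5.0, perf_per_watt_gain=2.5,
    annual_failure_cost=1.0, annual_service_value=4.0,
    revenue_per_mw_gain=2.0, hall_max_rack_kw=40.0, successor_rack_kw=130.0,
)
print(fired_triggers(example_slice, power_bound=True))
```

With these made-up inputs only the efficiency trigger fires. The density-ramp trigger stays silent because the hall was scoped for 40 kW racks and the successor needs 130 kW, which is the "refresh becomes a rebuild" case described above.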
The cascade as refresh execution
The training-to-inference cascade is the mechanism that, if it holds, stretches economic life toward book life — and refresh is where the cascade is either executed or revealed to be a story. A part retired from frontier pre-training does not have to leave the building. It can step down: from synchronous pre-training, to post-training/RL rollouts and fine-tuning, to latency-relaxed online inference, to batch and internal workloads, earning revenue at each rung. Each step down relaxes the requirement the part can no longer meet (top-end perf-per-watt, the largest scale-up domain) and exploits what it still does well (raw memory bandwidth, adequate FLOPS for a smaller model).
The catch is that the cascade is finite and power-gated. There is only so much lower-tier demand, and in a power-bound site every cascaded part is occupying a megawatt that a newer part could use more profitably. So the real refresh decision per part is a three-way choice: redeploy it down the cascade (if you have the lower-tier demand and the megawatt is not the binding constraint), resell it into the secondary market (if the residual price beats the cascade value net of the power it would consume), or retire/shred it (if security obligations forbid resale or the part is genuinely uneconomic to run). The cascade defense of the long book life is only honest if a real, liquid resale market exists to absorb the parts that do not cascade internally, which is exactly the secondary-market-depth question flagged as load-bearing in Chapter 1.8.
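The per-part choice between redeploying, reselling, and retiring can be written down as a small decision function. This is a sketch under stated assumptions: the function name, its parameters, and the dollar figures in the example are hypothetical, and a real decision would also weigh chain-of-custody risk and timing.

```python
def choose_disposition(cascade_value: float, power_cost: float,
                       resale_price: float, resale_allowed: bool,
                       has_lower_tier_demand: bool,
                       megawatt_is_binding: bool) -> str:
    """Return 'redeploy', 'resell', or 'retire' for one retired part.

    cascade_value and power_cost cover the same period (for example the
    remaining book life); resale_price is what you can actually realize now.
    """
    net_cascade = cascade_value - power_cost
    can_redeploy = (has_lower_tier_demand and not megawatt_is_binding
                    and net_cascade > 0)
    can_resell = resale_allowed and resale_price > 0
    # Resell when it is the only option or when it beats the net cascade value.
    if can_resell and (not can_redeploy or resale_price > net_cascade):
        return "resell"
    if can_redeploy:
        return "redeploy"
    # Security forbids resale, or the part is uneconomic to run: retire/shred.
    return "retire"


# Hypothetical figures: cascade nets 70, resale offers 50, so the part stays.
print(choose_disposition(100.0, 30.0, 50.0, resale_allowed=True,
                         has_lower_tier_demand=True, megawatt_is_binding=False))
```

The ordering of the checks encodes the text's logic: resale has to beat cascade value net of power, and retirement is what remains when neither path is open.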
The decommissioning workflow
Decommissioning an IT asset at scale is a chain-of-custody discipline, not a teardown. The workflow is roughly invariant across operators, and skipping a step is how a retired GPU becomes a data-breach headline or a stranded-capital line. The sequence:
- De-provision: drain workloads, remove from the scheduler and fabric, update the CMDB/DCIM asset record.
- Power down and physically extract: de-cable, drain liquid loops on DLC racks, palletize.
- Sanitize all data-bearing media to the chosen NIST SP 800-88 level, with a certificate of sanitization per asset.
- Account for and tag every serial through a chain-of-custody manifest.
- Route each part to redeploy, resale, or destruction.
- Document the disposition for audit, environmental compliance, and, increasingly, embodied-carbon and Scope 3 reporting.

The asset-record discipline is the spine of all of it: an untracked serial is an unsanitized serial and an unrecoverable dollar.
| Level | Method (examples) | Defeats | Resale impact | Best fit |
|---|---|---|---|---|
| Clear | Logical overwrite / block erase; NVMe format | Casual / standard recovery tools | Preserved — part stays sellable | Low-sensitivity media, internal redeploy |
| Purge | Cryptographic erase (SED), firmware sanitize, degauss (magnetic only) | Laboratory / forensic recovery | Usually preserved if attestable | SED NVMe; the default for resellable data drives |
| Destroy | Shred, disintegrate, incinerate, melt | All recovery, all techniques | Forecloses resale — residual written off | Frontier weights, classified/sovereign, un-attestable HBM |
| Verify (overlay) | Sampling / full read-back + certificate per asset | Process failure and silent non-erasure | Enables sale by proving the sanitization | Mandatory overlay on Clear/Purge for any resale |
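One way to make "skipping a step" impossible is to track each serial through the workflow as an ordered state machine and to derive resale eligibility from the recorded sanitization level. The sketch below assumes hypothetical step labels and a hypothetical `Asset` record; it is not a CMDB/DCIM schema or any vendor's API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Level(Enum):
    CLEAR = "clear"
    PURGE = "purge"
    DESTROY = "destroy"


# Workflow order taken from the text; the step names are illustrative labels.
STEPS = ["deprovision", "extract", "sanitize", "manifest", "route", "document"]


@dataclass
class Asset:
    serial: str
    done: list = field(default_factory=list)
    level: Optional[Level] = None
    verified: bool = False

    def advance(self, step: str, level: Optional[Level] = None,
                verified: bool = False) -> None:
        """Record the next step; refuse skipped or out-of-order steps."""
        if len(self.done) == len(STEPS):
            raise ValueError(f"{self.serial}: workflow already complete")
        expected = STEPS[len(self.done)]
        if step != expected:
            raise ValueError(f"{self.serial}: expected {expected!r}, got {step!r}")
        if step == "sanitize":
            if level is None:
                raise ValueError(f"{self.serial}: sanitize needs a level")
            self.level, self.verified = level, verified
        self.done.append(step)

    def resale_eligible(self) -> bool:
        """Resale needs Clear or Purge plus the Verify overlay; Destroy forecloses it."""
        return self.level in (Level.CLEAR, Level.PURGE) and self.verified


drive = Asset("SN-0001")  # hypothetical serial
drive.advance("deprovision")
drive.advance("extract")
drive.advance("sanitize", level=Level.PURGE, verified=True)
print(drive.resale_eligible())
```

Keeping the sanitization level and the verification flag on the asset record is the design point: the routing step can then refuse to send an unverified or destroyed part to resale without consulting anything else.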
ITAD, resale and the circular economy
IT Asset Disposition is the industrialized channel that turns a retired part into either recovered capital or certified e-waste. The 2026 reality is that the compressed refresh cadence — architecture generations landing every 18–36 months against the legacy 5–7-year enterprise cycle — is flooding the secondary market with a volume of accelerators it has never had to absorb, which is both an opportunity (cheap inference capacity for smaller buyers) and a risk (residual prices fall as supply rises). The lifecycle-economics lever that practitioners actually pull is the resale window: roughly 35–50% capital recovery if a part is moved within ~45 days of retirement, decaying steadily after that as the next generation deflates the price. A retired part is a melting ice cube; ITAD velocity is a first-class refresh KPI, not a back-office afterthought.
Vendor selection is a compliance decision with teeth. The certifications that matter — R2v3 and e-Stewards for responsible recycling and chain-of-custody, ISO 14001 for environmental management, and increasingly NAID AAA for data destruction — are what let you prove, to an auditor or a regulator, that a sanitized part was actually sanitized and a destroyed part was actually destroyed and not quietly resold into a market it should never have entered. The downside of skipping certified ITAD is not abstract: it is a tracked serial surfacing on the secondary market with recoverable tenant data on it, or an e-waste-dumping liability under tightening jurisdictional rules. The circular-economy framing — refurbish, reuse, recover materials, certified-destroy only as last resort — is also where refresh meets the embodied-carbon argument: a part kept in service one more cascade rung is embodied carbon you do not have to re-manufacture. → embodied carbon and circularity in Chapter 15.6.
| Disposition | Capital outcome | Time sensitivity | Security posture | When it wins |
|---|---|---|---|---|
| Redeploy (cascade) | Avoided capex on a lower tier | Low — value is internal | Stays inside your trust boundary; scrub on reassignment | You have lower-tier demand and the megawatt is not the binding constraint |
| Resell (certified ITAD) | ~35–50% recovery if sold within ~45 days | High — price decays per generation | Requires attestable Clear/Purge + verify | Residual beats cascade value net of the power the part would consume |
| Destroy (shred) | Residual written off; e-waste recovery only | Low — but do it promptly to close liability | Defeats all recovery; the safe default for weights | Security/sovereign obligation, or un-attestable HBM, forbids resale |
| Hold (do nothing) | Negative — opex + decaying residual | Worst — the melting-ice-cube case | Growing liability the longer it sits unsanitized | Almost never; it is the default that costs the most |
Deep dive: why HBM has no clean Purge, and what that does to residual value
The sanitization story is simple for the NVMe drives in an AI node and genuinely hard for the accelerator itself, and the hard part is exactly where the data of interest lives. A self-encrypting NVMe SSD supports cryptographic erase: discard the media-encryption key and every block is instantly, verifiably unrecoverable — a NIST 800-88 Purge in milliseconds, attestable, resale-preserving. GPU HBM has no equivalent standardized, attestable command. It is volatile, so it does not retain data across a power cycle in the way a disk does; but in a live multi-tenant fleet the threat is not cold-boot remanence years later, it is a GPU being reassigned from one tenant to the next with the previous tenant's weights or activations still resident in HBM. The mitigation is a memory scrub on reassignment (and, in confidential-computing modes, key derivation that ties memory contents to a session) — an operational control during life, not a decommissioning Purge you can hand an auditor a certificate for.
The consequence at end-of-life is a forced choice. To resell a frontier accelerator you must convince a buyer — and your own security team — that nothing recoverable remains, and for on-package HBM there is no clean, standardized way to prove that to the Purge bar. So the high-sensitivity path collapses to Destroy: shred the part, write off the residual, and accept that the secondary market will never see it. This is why so few hyperscaler-retired frontier GPUs appear in the used market, and why the residual-retention figures that underwrite the long depreciation life are skewed by the parts that do sell (lower-sensitivity, off-frontier inventory) rather than the frontier parts that get shredded. The cascade-and-resale defense of the 5–6-year book life implicitly assumes a deep resale market; the HBM-sanitization problem is one of the structural reasons that market is thinner than the headline numbers imply. → Chapter 11.5 (GPU confidential computing, HBM remanence); Chapter 11.8 (weight protection); residual-value economics in Chapter 1.8.
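The skew in residual-retention figures is easy to see with a little arithmetic. The sketch below assumes, as a simplification, that destroyed parts recover nothing and that all parts carry the same original value; the 40% and 50% inputs are hypothetical and chosen only to illustrate the effect.

```python
def fleet_wide_recovery(observed_recovery: float,
                        destroyed_fraction: float) -> float:
    """Blend recovery on sold parts with zero recovery on destroyed parts.

    observed_recovery: fraction of original value recovered on parts that sell.
    destroyed_fraction: share of retired parts that are shredded, not sold.
    Assumes equal original value per part (a simplification).
    """
    if not 0.0 <= destroyed_fraction <= 1.0:
        raise ValueError("destroyed_fraction must be between 0 and 1")
    return observed_recovery * (1.0 - destroyed_fraction)


# Hypothetical: sold parts recover 40%, but half the retired fleet is shredded.
print(f"{fleet_wide_recovery(0.40, 0.50):.0%}")
```

A market figure measured only on parts that reach the secondary market overstates what the whole retired fleet recovers by exactly the destroyed share, which is the survivorship effect the paragraph above describes.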
Migration and capacity continuity through the swap
On a live campus, refresh is a migration that must protect goodput and the power envelope while it happens, not a single maintenance window. The cost the swap strands is revenue: every megawatt taken offline to receive the new generation is a megawatt not earning against a depreciation clock that is still running. The dominant pattern is therefore rolling, hall-by-hall (or row-by-row) migration: drain and de-provision one fault domain, swap it, re-commission it, and only then move to the next — so the cluster never loses more than one domain's worth of capacity, and the new generation is validated against the live workload before the old generation is fully retired. The alternative — a forklift cut-over of a whole hall at once — is faster on paper but strands the entire hall's revenue during the transition and concentrates commissioning risk into a single window, which is exactly the high-risk density-step-up that re-commissioning discipline exists to de-risk.
The power envelope is the constraint that makes this delicate. A denser successor generation does not just need more total megawatts; during the overlap window you may be running both generations, and the facility power and cooling plant must absorb the transient. This is why the migration plan and the capacity/power plan are the same plan: the order in which you swap halls is dictated by where you have stranded power headroom to land the denser part, and a refresh executed without that headroom reserved becomes a rebuild of the power chain, not a swap of the silicon. The re-commissioning required to certify the new density step is the highest-risk event on a live campus. → capacity and power management in Chapter 14.7; the density step-up as the top re-commissioning risk is treated in the continuous/re-commissioning chapter of Part 14.
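A rolling plan can be sanity-checked against the power envelope before any hall is drained. The sketch below is deliberately simplified: it assumes a single facility-wide envelope (real sites also have per-feed and cooling limits), checks only the steady state after each swap, and uses hypothetical hall names and kW figures.

```python
def rolling_swap_loads(halls, envelope_kw):
    """Walk a hall-by-hall swap and return the facility load after each step.

    halls: list of (name, old_kw, new_kw) in the proposed swap order.
    Raises ValueError at the first swap that would exceed the envelope,
    so the plan fails on paper rather than on the floor.
    """
    load = sum(old_kw for _, old_kw, _ in halls)  # everything on the old generation
    steps = []
    for name, old_kw, new_kw in halls:
        load = load - old_kw + new_kw  # this hall now runs the new generation
        if load > envelope_kw:
            raise ValueError(
                f"swapping {name} needs {load} kW, envelope is {envelope_kw} kW"
            )
        steps.append((name, load))
    return steps


# Hypothetical campus: two halls stepping from 1,000 kW to 1,500 kW each.
plan = [("hall-A", 1000, 1500), ("hall-B", 1000, 1500)]
print(rolling_swap_loads(plan, envelope_kw=3000))
```

If the same plan is run against a 2,800 kW envelope the second swap raises, which is the "refresh becomes a rebuild of the power chain" outcome surfaced at planning time.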
Deep dive: the 45-day resale window as an operational deadline, not a market observation
The ~35–50%-recovery-within-45-days figure is usually quoted as a market fact; it is more useful read as an operational deadline that the decommissioning organization either hits or misses. The residual decays for two compounding reasons. First, the secondary GPU price falls as each new generation lands and as the refresh wave floods supply — the part is worth less every week simply because the market is. Second, a part sitting on a loading dock un-sanitized and un-manifested is accruing opex (floor space, security, insurance) and accruing liability (every day it is not sanitized is a day its data is exposed), so its net realizable value falls faster than the raw market price.
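The two compounding effects can be combined into a rough net-realizable-value curve. Only the ~35–50% recovery and the ~45-day window come from the text; the 40% midpoint, the weekly decay rate, the holding cost, and the exponential shape of the decay are assumptions made for illustration.

```python
def net_realizable_value(purchase_price: float, days_since_retirement: float,
                         recovery_in_window: float = 0.40,
                         window_days: float = 45.0,
                         weekly_price_decay: float = 0.03,
                         daily_holding_cost: float = 0.0) -> float:
    """Estimate what a retired part nets if sold on a given day.

    Inside the window the part recovers recovery_in_window of its price.
    After the window the market price is assumed to fall by
    weekly_price_decay per week (hypothetical rate and shape).
    Holding cost (floor space, security, insurance) accrues from day zero.
    """
    weeks_late = max(0.0, days_since_retirement - window_days) / 7.0
    market_value = (purchase_price * recovery_in_window
                    * (1.0 - weekly_price_decay) ** weeks_late)
    return market_value - daily_holding_cost * days_since_retirement


# Hypothetical part: compare a sale inside the window with one at 120 days.
price = 30_000.0
on_time = net_realizable_value(price, 30, daily_holding_cost=10.0)
late = net_realizable_value(price, 120, daily_holding_cost=10.0)
print(on_time > late)
```

The holding-cost term is what makes net value fall faster than the raw market price: it keeps subtracting even on days when the market does not move.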
The practical consequence is that resale velocity is a process-design problem, and the bottleneck is almost always sanitization and chain-of-custody, not finding a buyer. The operators who realize the residual they underwrote are the ones who have pre-arranged certified-ITAD capacity, pre-negotiated resale channels, and a sanitization pipeline that can clear a hall's worth of parts inside the window — so that the part is manifested, sanitized-and-verified, and on a buyer's dock before the price has moved much. The operators who miss it are the ones who treat decommissioning as something that happens after the exciting work of installing the new generation, by which point most of the residual has decayed away. The lesson generalizes: in a fast-deflating asset class, disposition velocity is a unit-economics lever on par with utilization, and it is one of the few that an operations team controls outright. → spares/RMA logistics that feed the same reverse pipeline in Chapter 14.6; the residual-shock downside case in Chapter 1.8.
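Whether a sanitization pipeline can clear a hall inside the window is simple arithmetic worth doing before the hall is drained. The sketch below assumes a hypothetical part count, daily sanitize-and-verify throughput, and a fixed allowance for manifesting and shipping; only the ~45-day window comes from the text.

```python
import math


def days_to_clear(parts: int, sanitized_per_day: int, fixed_lead_days: int = 0) -> int:
    """Days from retirement until the last part is sanitized, verified, and shipped.

    fixed_lead_days covers manifesting and shipping (a hypothetical allowance).
    """
    if sanitized_per_day <= 0:
        raise ValueError("sanitized_per_day must be positive")
    return fixed_lead_days + math.ceil(parts / sanitized_per_day)


def fits_resale_window(parts: int, sanitized_per_day: int,
                       fixed_lead_days: int = 0, window_days: int = 45) -> bool:
    """True if the whole batch reaches a buyer inside the resale window."""
    return days_to_clear(parts, sanitized_per_day, fixed_lead_days) <= window_days


# Hypothetical hall of 4,000 parts with 5 days of manifest-and-ship overhead.
print(fits_resale_window(4000, sanitized_per_day=100, fixed_lead_days=5))
print(fits_resale_window(4000, sanitized_per_day=80, fixed_lead_days=5))
```

In this made-up case a throughput of 100 parts per day just fits the window and 80 per day misses it by ten days, which is why pre-arranged sanitization capacity, rather than buyer discovery, tends to decide whether the residual is realized.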