The Definitive Guide to AI Data Centers

Quantization

Using lower numerical precision (FP8, FP4, INT8) to cut memory and boost throughput, trading some accuracy for speed.
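
To make the trade-off concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is an illustration under stated assumptions, not how any particular framework or accelerator implements it: the function names `quantize_int8` and `dequantize`, the sample weights, and the choice of a single scale per tensor are all hypothetical. FP8 and FP4 use floating-point encodings and are not covered by this sketch.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: symmetric scheme, so the largest magnitude maps to 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.02, -1.27, 0.64, 0.9, -0.333]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Accuracy cost: rounding moves each value by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Memory saving: 4 bytes per FP32 value versus 1 byte per INT8 code.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

print(codes, scale, max_error, fp32_bytes, int8_bytes)
```

Storage drops fourfold relative to FP32, while each restored value differs from the original by at most half of `scale`. That rounding error is the accuracy given up in exchange for lower memory use and higher throughput.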
