Using lower numerical precision (FP8, FP4, INT8) to cut memory use and boost throughput, at the cost of some accuracy.
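
As a rough illustration of the trade-off, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. The function names (`quantize_int8`, `dequantize_int8`) and the sample values are hypothetical, chosen only for this example; real frameworks typically use more elaborate schemes (per-channel scales, calibration, FP8/FP4 formats).

```python
# Minimal sketch: symmetric per-tensor INT8 quantization, pure Python.
# All names and values below are illustrative assumptions, not a library API.

def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the INT8 codes."""
    return [c * scale for c in codes]


weights = [0.82, -1.27, 0.003, 0.5]  # hypothetical FP32 values
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Each code fits in 1 byte instead of 4 bytes for FP32: the memory saving.
# The round-trip error is the accuracy cost; here it is at most scale / 2.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Small values such as `0.003` collapse to the code `0`, which shows where the accuracy loss comes from: anything below half the scale step cannot be distinguished from zero.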