ZeRO · Zero Redundancy Optimizer
A technique that partitions optimizer state, gradients, and parameters across GPUs, rather than replicating them on every GPU, to remove memory redundancy in data-parallel training.
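To make the memory saving concrete, here is a rough sketch of the per-GPU arithmetic. It is not taken from any library: the function name `zero_memory_bytes`, the numbering of stages 0 to 3, and the byte counts are all assumptions for illustration. It assumes mixed-precision training with Adam, where each parameter costs 2 bytes for the fp16 weight, 2 bytes for the fp16 gradient, and about 12 bytes of optimizer state (fp32 weight copy, momentum, variance). Activations, buffers, and fragmentation are ignored.

```python
def zero_memory_bytes(num_params, num_gpus, stage, opt_bytes_per_param=12):
    """Approximate per-GPU memory for model states, in bytes.

    Hypothetical helper for illustration only. Stage numbering is an
    assumption of this sketch:
      0 = nothing partitioned (plain data parallelism)
      1 = optimizer state partitioned
      2 = optimizer state and gradients partitioned
      3 = optimizer state, gradients, and parameters partitioned
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    # Assumed mixed-precision costs: 2 bytes each for fp16 params and grads.
    params = 2 * num_params
    grads = 2 * num_params
    opt_state = opt_bytes_per_param * num_params

    # Each partitioned component is split evenly across the GPUs.
    if stage >= 1:
        opt_state = opt_state / num_gpus
    if stage >= 2:
        grads = grads / num_gpus
    if stage >= 3:
        params = params / num_gpus

    return params + grads + opt_state
```

Under these assumptions, a 7.5-billion-parameter model on 64 GPUs needs about 120 GB per GPU with nothing partitioned and about 1.9 GB per GPU with all three components partitioned (taking 1 GB as 10^9 bytes).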