
Microsoft’s ZeRO-2 Speeds up AI Training 10x

July 7, 2020

Via: InfoQ

Microsoft open-sourced Zero Redundancy Optimizer version 2 (ZeRO-2), a distributed deep-learning optimization algorithm that scales super-linearly with cluster size. Using ZeRO-2, Microsoft trained a 100-billion-parameter natural-language processing (NLP) model 10x faster than with previous distributed learning techniques.

Writing in a blog post, program manager Rangan Majumder and distinguished engineer Junhua Wang described the algorithm and their experiments. ZeRO-2 is part of Microsoft’s open-source DeepSpeed library for deep-learning training optimization. ZeRO-2 optimizes memory consumption during training, allowing for distributed training of models as large as 170 billion parameters.
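For readers unfamiliar with DeepSpeed, the sketch below shows roughly how ZeRO stage 2 is enabled through a DeepSpeed configuration dictionary. The hyperparameter values and the exact `deepspeed.initialize` keyword arguments are illustrative assumptions (they vary across DeepSpeed versions), not details taken from Microsoft's post.

```python
# Minimal sketch: enabling ZeRO stage 2 in DeepSpeed (illustrative values).
import torch
import deepspeed

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,             # partition optimizer states and gradients across GPUs
        "overlap_comm": True,   # overlap gradient communication with backward compute
        "reduce_scatter": True, # average gradients with reduce-scatter instead of all-reduce
    },
}

# Stand-in model; a real run would use a large NLP model instead.
model = torch.nn.Linear(1024, 1024)

# deepspeed.initialize wraps the model in an engine that manages the
# partitioned optimizer state, gradient reduction, and fp16 loss scaling.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

Training then proceeds through the returned engine (`model_engine.backward(loss)` and `model_engine.step()`), which is what lets ZeRO-2 keep per-GPU memory low while the model itself stays unchanged.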

Read More on InfoQ