Published: Nov 05, 2025
NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was featured in one of the last few editions, it is likely still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Nov 2025
Assess

DeepSpeed is a Python library that optimizes distributed deep learning for both training and inference. For training, it integrates technologies such as the Zero Redundancy Optimizer (ZeRO) and 3D parallelism to efficiently scale models across thousands of GPUs. For inference, it combines tensor, pipeline, expert and ZeRO parallelism with custom kernels and communication optimizations to minimize latency. DeepSpeed has powered some of the world's largest language models, including Megatron-Turing NLG (530B) and BLOOM (176B). It supports both dense and sparse models, delivers high system throughput and enables training or inference across multiple resource-constrained GPUs. The library integrates seamlessly with popular frameworks such as Hugging Face Transformers, PyTorch Lightning and Accelerate, making it a highly effective option for large-scale or resource-limited deep learning workloads.
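To make the ZeRO and mixed-precision features above concrete, a minimal DeepSpeed JSON configuration might look like the following sketch. The specific values (batch size, learning rate, CPU offload) are illustrative assumptions, not recommendations:

```json
{
  "train_batch_size": 32,
  "fp16": {
    "enabled": true
  },
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": 3e-5
    }
  },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu"
    }
  }
}
```

A file like this is typically passed to `deepspeed.initialize(...)` or supplied to the `deepspeed` launcher; Hugging Face Transformers accepts the same file via the `deepspeed` argument of `TrainingArguments`. ZeRO stage 2 partitions optimizer states and gradients across GPUs, while the optional optimizer offload trades some throughput for a smaller GPU memory footprint on resource-constrained hardware.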
