Published: Nov 05, 2025
Assess

DeepSpeed is a Python library that optimizes distributed deep learning for both training and inference. For training, it integrates technologies such as the Zero Redundancy Optimizer (ZeRO) and 3D parallelism to efficiently scale models across thousands of GPUs. For inference, it combines tensor, pipeline, expert and ZeRO parallelism with custom kernels and communication optimizations to minimize latency. DeepSpeed has powered some of the world's largest language models, including Megatron-Turing NLG (530B) and BLOOM (176B). It supports both dense and sparse models, delivers high system throughput and allows training or inference even across resource-constrained GPUs. The library integrates seamlessly with popular frameworks such as Hugging Face Transformers, PyTorch Lightning and Accelerate, making it a highly effective option for large-scale or resource-limited deep learning workloads.
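To make the ZeRO and offloading capabilities mentioned above more concrete, here is a minimal sketch of the kind of JSON configuration DeepSpeed accepts. The keys shown (`train_batch_size`, `fp16`, `zero_optimization`, `optimizer`) follow DeepSpeed's documented config schema, but the specific values are illustrative assumptions, not a tuned setup:

```python
import json

# Illustrative DeepSpeed configuration (values are assumptions, not recommendations).
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},  # mixed-precision training
    "zero_optimization": {
        "stage": 2,  # ZeRO stage 2: partition optimizer states and gradients across GPUs
        "offload_optimizer": {"device": "cpu"},  # offload optimizer state on memory-constrained GPUs
    },
    "optimizer": {"type": "AdamW", "params": {"lr": 3e-5}},
}

# In a real run, this dict (or an equivalent ds_config.json file) is passed to
# deepspeed.initialize(...) along with the model; here we only serialize it.
print(json.dumps(ds_config, indent=2))
```

In practice, such a file is supplied via the launcher, e.g. `deepspeed train.py --deepspeed_config ds_config.json`, with higher ZeRO stages (or NVMe offload) chosen as model size grows relative to GPU memory.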
