Published: Nov 05, 2025
Assess

Intel's AutoRound is an advanced quantization algorithm for compressing large AI models, such as LLMs and vision language models (VLMs), with minimal loss of accuracy. It reduces model size to ultra-low bit widths (2–4 bits) using sign-gradient descent optimization and applies mixed bit widths across layers for optimal efficiency. This quantization process is also remarkably fast: You can quantize a 7-billion-parameter model in just minutes on a single GPU. Since AutoRound integrates with popular inference engines such as vLLM and Transformers, it's an attractive option for quantizing models.
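AutoRound is available as the open-source auto-round Python package. The sketch below illustrates the typical weight-only quantization flow it documents: load a Hugging Face model, run the rounding search and export a quantized checkpoint that compatible runtimes can serve. The model name, output path and exact keyword arguments are illustrative assumptions and may differ between package versions.

```python
# Minimal sketch of 4-bit weight-only quantization with Intel's auto-round
# package. Exact constructor arguments and method names may vary by version;
# treat this as illustrative rather than canonical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"  # any Hugging Face causal LM; chosen here for illustration
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Configure the rounding search: 4-bit weights, group size 128, symmetric quantization.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()

# Export the quantized checkpoint; the output directory is a placeholder.
autoround.save_quantized("./opt-125m-autoround", format="auto_round")
```

The exported checkpoint can then be loaded through Transformers or served with vLLM, as noted above.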
