Technology Radar

Model distillation

Published : Apr 02, 2025

NOT ON THE CURRENT EDITION

This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.

Apr 2025

Trial

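Scaling laws have been one of the key principles behind the rapid progress of AI: larger models, larger datasets and more compute lead to more powerful AI systems. However, consumer hardware and edge devices often lack the capacity to run large models, which creates the need for model distillation.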

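Model distillation transfers knowledge from a larger, more capable model (the teacher) to a smaller, more efficient model (the student). The process typically involves generating a sample dataset from the teacher model and fine-tuning the student on it so that the student captures the teacher's statistical properties. Unlike pruning, which compresses a model by removing parameters, or quantization, which lowers the numeric precision of its weights, distillation aims to retain domain-specific knowledge while keeping accuracy loss to a minimum. Distillation can also be combined with quantization to optimize the model further.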
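As a rough illustration of the two steps described above (querying the teacher for a sample dataset, then fitting the student to it), here is a minimal, self-contained Python sketch. Everything in it is an assumption made for illustration: `teacher_logits`, `student_logits` and `distill` are hypothetical names, the "teacher" is a fixed nonlinear scoring function rather than a real LLM, the "student" is a small linear softmax model, and the temperature-softened KL-divergence objective is one common choice, not necessarily what any particular vendor or model uses.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_logits(x):
    """Hypothetical 'large' teacher: a fixed nonlinear scorer over 3 classes."""
    return [2.0 * x[0] - x[1], 3.0 * math.tanh(x[0] * x[1]), x[1] - 0.5 * x[0]]


def student_logits(weights, x):
    """Hypothetical 'small' student: a linear model with one (w0, w1, bias) row per class."""
    return [w[0] * x[0] + w[1] * x[1] + w[2] for w in weights]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distill(num_samples=200, epochs=200, lr=0.5, temperature=2.0, seed=0):
    rng = random.Random(seed)

    # Step 1: build a sample dataset by querying the teacher for soft labels.
    inputs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(num_samples)]
    soft_targets = [softmax(teacher_logits(x), temperature) for x in inputs]

    # Step 2: fit the student to the teacher's soft labels with gradient descent.
    weights = [[0.0, 0.0, 0.0] for _ in range(3)]

    def mean_kl():
        total = 0.0
        for x, target in zip(inputs, soft_targets):
            probs = softmax(student_logits(weights, x), temperature)
            total += kl_divergence(target, probs)
        return total / num_samples

    initial_kl = mean_kl()
    for _ in range(epochs):
        grads = [[0.0, 0.0, 0.0] for _ in range(3)]
        for x, target in zip(inputs, soft_targets):
            probs = softmax(student_logits(weights, x), temperature)
            features = (x[0], x[1], 1.0)
            for k in range(3):
                # Gradient of KL(target || student) w.r.t. student logit k.
                delta = (probs[k] - target[k]) / temperature
                for j in range(3):
                    grads[k][j] += delta * features[j] / num_samples
        for k in range(3):
            for j in range(3):
                weights[k][j] -= lr * grads[k][j]

    return initial_kl, mean_kl(), weights


if __name__ == "__main__":
    before, after, _ = distill()
    print(f"mean KL before: {before:.4f}, after: {after:.4f}")
```

Running the script prints the mean KL divergence between teacher and student before and after training; the second number should be lower, which is all this toy setup is meant to show. In a realistic pipeline the teacher's outputs would be generated text or token distributions from a large model, and the student would be fine-tuned with a deep learning framework rather than hand-written gradient descent.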

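The technique was first proposed by Geoffrey Hinton et al. and has since been widely adopted. A notable example is the set of Qwen and Llama distilled versions of DeepSeek R1, which preserve strong reasoning capabilities in small models. As distillation has matured, it is no longer confined to research labs and is now applied to everything from industrial to personal projects. Providers such as OpenAI and Amazon Bedrock also offer detailed guides to help developers distill their own small language models (SLMs). We believe that adopting model distillation can help organizations better manage LLM deployment costs while unlocking the potential of on-device LLM inference.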
