Technology Radar

LLMLingua

Published: Oct 23, 2024

NOT ON THE CURRENT EDITION

This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.

Oct 2024

Assess

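LLMLingua improves the efficiency of large language models (LLMs) by using a small language model to compress prompts, removing non-essential tokens with minimal loss of performance. This approach lets LLMs handle longer prompts effectively while retaining their reasoning and in-context learning capabilities, addressing challenges such as cost efficiency, inference latency and context handling. LLMLingua is compatible with a wide range of LLMs, requires no additional training and supports frameworks such as LlamaIndex, making it well suited to optimizing LLM inference performance.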
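To make the idea of budget-driven prompt compression concrete, here is a minimal toy sketch. It is not LLMLingua's algorithm or API: LLMLingua scores tokens with a small language model, whereas this stand-in scores each whitespace-separated word by its surprisal under a unigram model estimated from the prompt itself. The function name `compress_prompt_toy` and the `keep_ratio` parameter are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def compress_prompt_toy(prompt: str, keep_ratio: float = 0.5) -> str:
    """Keep the most 'informative' words of a prompt within a budget.

    Assumption: informativeness is approximated by unigram surprisal,
    -log(count / total), computed from the prompt alone. A real
    compressor would use a small language model's token probabilities.
    """
    tokens = prompt.split()
    if not tokens:
        return ""

    counts = Counter(t.lower() for t in tokens)
    total = len(tokens)

    # Rare words get a high score, frequent words a low one.
    scores = [-math.log(counts[t.lower()] / total) for t in tokens]

    # Number of words to keep (at least one).
    budget = max(1, round(total * keep_ratio))

    # Rank by descending score; break ties by original position.
    ranked = sorted(range(total), key=lambda i: (-scores[i], i))

    # Restore original word order for the kept positions.
    keep = sorted(ranked[:budget])
    return " ".join(tokens[i] for i in keep)


if __name__ == "__main__":
    print(compress_prompt_toy("the cat and the dog and the bird", keep_ratio=0.5))
```

The sketch shows only the shape of the trade-off: a smaller `keep_ratio` yields a shorter prompt (lower cost and latency) at the risk of dropping words the downstream model needs.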
