Technology Radar
Published: Oct 23, 2024
NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Oct 2024
Assess
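LLMLingua improves the efficiency of large language models (LLMs) by using a small language model to compress prompts, removing nonessential tokens with minimal loss in performance. This lets an LLM process longer prompts efficiently while retaining its reasoning and in-context learning abilities, which addresses challenges such as cost efficiency, inference latency and context handling. LLMLingua is compatible with a variety of LLMs without additional training and supports frameworks such as LlamaIndex, making it well suited to optimizing LLM inference performance.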
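To make the idea of dropping nonessential tokens concrete, here is a toy sketch in Python. It is not LLMLingua's algorithm or API: the function name `compress_prompt`, the `keep_ratio` parameter and the scoring rule are all assumptions made for illustration. Where LLMLingua scores tokens with a small language model, this sketch substitutes a much cruder stand-in, the self-information of each word under a unigram model built from the prompt itself, and keeps only the highest-scoring words in their original order.

```python
import math
from collections import Counter


def compress_prompt(prompt: str, keep_ratio: float = 0.5) -> str:
    """Toy prompt compression (hypothetical helper, not the LLMLingua API).

    Keeps the fraction `keep_ratio` of whitespace-separated tokens that
    carry the most information, preserving their original order.
    """
    tokens = prompt.split()
    if not tokens:
        return ""

    # Stand-in for a small language model: score each token by its
    # self-information, -log p(token), under a unigram model estimated
    # from the prompt itself. Frequent words score low, rare words high.
    counts = Counter(t.lower() for t in tokens)
    total = len(tokens)
    scores = [-math.log(counts[t.lower()] / total) for t in tokens]

    # Number of tokens to keep (at least one).
    budget = max(1, int(len(tokens) * keep_ratio))

    # Rank token positions by descending score; ties go to earlier positions.
    ranked = sorted(range(len(tokens)), key=lambda i: (-scores[i], i))

    # Restore original order for the surviving tokens.
    kept = sorted(ranked[:budget])
    return " ".join(tokens[i] for i in kept)


if __name__ == "__main__":
    original = "the cat and the dog and the bird saw the fox"
    print(compress_prompt(original, keep_ratio=0.5))
```

With a real small language model in place of the unigram scorer, the scores would reflect context rather than raw word frequency, which is what allows aggressive compression without discarding the tokens the downstream LLM actually needs.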