Technology Radar

GPTCache

Published : Sep 27, 2023

NOT ON THE CURRENT EDITION

This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar. Understand more

Sep 2023

Assess

GPTCache is a semantic cache library for large language models (LLMs). We see the need for a cache layer in front of LLMs for two main reasons — to improve the overall performance by reducing external API calls and to reduce the cost of operation by caching similar responses. Unlike traditional caching approaches that look for exact matches, LLM-based caching solutions require similar or related matches for the input queries. GPTCache approaches this with the help of embedding algorithms to convert the input queries into embeddings and then use a vector datastore for similarity search on these embeddings. One drawback of such a design is that you may encounter false positives during cache hits or false negatives during cache misses, which is why we recommend you carefully assess GPTCache for your LLM-based applications.