Published : Apr 13, 2021
NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Trial: Worth pursuing. It is important to understand how to build up this capability. Enterprises should try this technology on a project that can handle the risk.
Contextual bandits is a type of reinforcement learning that is well suited to problems with an exploration/exploitation trade-off. Named after "bandits," or slot machines, in casinos, the algorithm explores different options to learn more about their expected outcomes and balances this against exploiting the options that already perform well. We've successfully used this technique in scenarios where we've had little data to train and deploy other machine-learning models. Because the explore/exploit decision can take context into account, the technique suits a wide variety of use cases, including A/B testing, recommendations and layout optimizations.
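To make the explore/exploit idea concrete, here is a minimal sketch of one simple variant: an epsilon-greedy contextual bandit that keeps a running mean reward for each (context, arm) pair. It is an illustration under stated assumptions, not the implementation used on any of our projects. The class name, the "mobile"/"desktop" contexts, the layout arms and the click-through rates are all hypothetical, and real systems often replace the lookup table with a model over context features.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular contextual bandit: one running mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.values = defaultdict(float)  # (context, arm) -> mean observed reward

    def best_arm(self, context):
        # Arm with the highest estimated reward for this context.
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def select(self, context):
        # Explore: with probability epsilon, try a random arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        # Exploit: otherwise use the best-known arm for this context.
        return self.best_arm(context)

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental update of the running mean.
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def simulate(rounds=5000, seed=1):
    # Hypothetical click-through rates: the better layout depends on the device.
    true_rates = {
        ("mobile", "layout_a"): 0.6, ("mobile", "layout_b"): 0.2,
        ("desktop", "layout_a"): 0.2, ("desktop", "layout_b"): 0.6,
    }
    env = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(["layout_a", "layout_b"], epsilon=0.1)
    for _ in range(rounds):
        context = env.choice(["mobile", "desktop"])
        arm = bandit.select(context)
        reward = 1.0 if env.random() < true_rates[(context, arm)] else 0.0
        bandit.update(context, arm, reward)
    return bandit


if __name__ == "__main__":
    learned = simulate()
    for ctx in ("mobile", "desktop"):
        print(ctx, "->", learned.best_arm(ctx))
```

A plain A/B test would settle on a single winning layout for everyone; in this simulated setup the bandit can learn a different preferred layout per context while still spending a small share of traffic on exploration. The fixed epsilon of 0.1 is an arbitrary choice for the sketch.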