Privacy-preserving record linkage (PPRL) using Bloom filter

This blip is not on the current edition of the Radar. If it was on one of the last few editions it is likely that it is still relevant. If the blip is older it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the RadarUnderstand more
Published: Nov 20, 2019
Nov 2019

Linking records from different data providers in the presence of a shared key is trivial. However, you may not always have a shared key; even if you do, it may not be a good idea to expose it due to privacy concerns. Privacy-preserving record linkage (PPRL) using Bloom filter (a space-efficient probabilistic data structure) is an established technique that allows probabilistic linkage of records from different data providers without exposing privately identifiable personal data. For example, when linking data from two data providers, each provider encrypts its personally identifiable data using Bloom filter to get cryptographic linkage keys and then sends them to you via a secure channel. Once data is received, the records can be linked by computing similarity scores between sets of cryptographic linkage keys from each provider. Among other techniques, we found PPRL using Bloom filters to be scalable for large data sets.