Translucent Databases

Interesting article on oreillynet.com in response to the recent hacking of Yale student admission information by Princeton. The gist is that sensitive data that you don’t need to physically see, but only compare/search/parse should be put into your DB hashed. Excerpt:

“For example, what if a police department needs to build a database of sexual-assault victims that lets them identify trends but hides personal information? You could use a translucent database where the first column is the hash of the victim’s name, and the second column is a hash of their full address, and the third column is a hash of their block and street. You can now group incidents together by grouping entries with identical block hashes; you can see if the incidents refer to the same person by checking to see if those hashes are different.”

More information on translucent databases can be found here.

Leave a Reply

Your email address will not be published. Required fields are marked *