The Digital Woes of Public Records

Researcher Chris Vickery discovered that 191 million voter records were available to the public. The information was in a database on the Internet, which seemed to be a collection of voter records compiled from public sources in various US states. While no Social Security numbers were present, names, dates of birth, addresses, and voting scores were in the database.

That’s scary, though potentially not a problem. A number of states publish voter data as public records. A few might have restrictions on the use of that data, but the fact that the data is available means it could be used maliciously, with overburdened authorities unlikely to prosecute anyone even if they’re caught.

This is one of those areas where our understanding and control of data haven't caught up to the digital age. It is one thing when public data is available to those who must physically search for it, or even query for single records. However, data can reveal much more information, or even be used in new ways, when large volumes of it are available. Now the ability to access every voter's name, address, and date of birth in bulk could potentially be a problem.

I see so much data that we might have taken for granted in the past, thinking nothing of its visibility, becoming a problem in the future. When someone can gather large amounts of data and store it cheaply, keeping it accessible in something like a data lake, we may find that public data is problematic. When anyone can gather and combine lots of data from different sources, we might find that capability quite scary, as potentially lots of information about individuals can be determined. We've seen anonymous data sets de-anonymized by merging data from different sources.
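
As a rough illustration of how such a merge can work, here is a minimal Python sketch. Everything in it is an assumption made for the example: the records are invented, the field names (dob, zip, detail) are hypothetical, and the link_records function is my own naming, not something taken from the voter database or from any real de-anonymization study. It joins a "public" list that contains names with an "anonymous" list that doesn't, using only the fields the two have in common.

```python
from collections import defaultdict

# Invented records for illustration only; none of this is real data.
voter_roll = [
    {"name": "Alice Example", "dob": "1970-01-15", "zip": "80001"},
    {"name": "Bob Sample", "dob": "1985-06-30", "zip": "80002"},
    {"name": "Carol Placeholder", "dob": "1985-06-30", "zip": "80002"},
]

# A hypothetical "anonymized" data set: no names, but the same shared fields.
anonymized = [
    {"dob": "1970-01-15", "zip": "80001", "detail": "record A"},
    {"dob": "1985-06-30", "zip": "80002", "detail": "record B"},
    {"dob": "1990-12-01", "zip": "80003", "detail": "record C"},
]


def link_records(public_rows, anonymous_rows, keys=("dob", "zip")):
    """Return (name, detail) pairs where the shared fields match exactly one name."""
    # Index the public rows by the combination of shared fields.
    index = defaultdict(list)
    for row in public_rows:
        index[tuple(row[k] for k in keys)].append(row["name"])

    linked = []
    for row in anonymous_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        # Only an unambiguous match ties a nameless record to one person.
        if len(names) == 1:
            linked.append((names[0], row["detail"]))
    return linked


matches = link_records(voter_roll, anonymized)
for name, detail in matches:
    print(name, "->", detail)
```

In this toy case only the first nameless record can be tied to a single person, because two voters share the second date of birth and ZIP code and the third has no match at all. The more fields two data sets share, the fewer of those ambiguous cases tend to remain.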

I truly hope that we find ways to better protect and ensure privacy in the future, as all the capabilities and power that computing brings to data analysis have a dark side.

