Big Data and the Fall of Personally Identifiable Information

There has been no shortage of “Big Data” start-ups in the last decade, and the trend shows no sign of slowing down. As computing power and sophistication continue to increase, the ability to process large sets of information has led to increasingly pointed insights about the sources of this data.

Take Target, for example. When you pay for something at Target with a credit card, you not only exchange your credit for physical goods, you also open a file. Target records your credit card number, attaches it to a virtual file, and begins to fill that file with all sorts of information. Your purchase history is recorded: what you bought, when you bought it, how much you bought. Every time you respond to a survey, call the customer help line, or send an email, Target is aware. Any time you interact with Target, the data and metadata that characterize that interaction are parsed carefully and stored as Target’s institutional knowledge.

But it doesn’t end there. As diligent as Target may be in monitoring your interactions, there will inevitably be holes. But fear not! Instead of settling for an incomplete picture of who you are, Target can simply buy the rest of it from the other people you do business with. “Target can buy data about your ethnicity, job history, the magazines you read, if you’ve ever declared bankruptcy or got divorced, the year you bought (or lost) your house, where you went to college, what kinds of topics you talk about online, whether you prefer certain brands of coffee, paper towels, cereal or applesauce, your political leanings, reading habits, charitable giving and the number of cars you own.”

And the results speak for themselves. When Target scrutinizes the mountains of data it collects from countless individuals, patterns emerge. In one particularly creepy example, Target figured out that a teenage girl was pregnant before her father did.

But taking a step back, the growing specificity and pervasiveness of the insights that data analytics can draw in the age of Big ~~Brother~~ Data poses a broader legal problem, beyond the immediate discomfort felt at the individual level (the creepy factor).

Much of US data privacy law centers on the idea of Personally Identifiable Information (PII) and on restricting its use in certain contexts. However, such a definition, which places added weight on information that may distinguish an individual’s identity, works only if there is a practical distinction between data that is labeled PII and data that is not.

As Big Data continues to grow in both reach and sophistication, our information economy will start to approach a state in which no information falls outside of the definition of PII. The Target example makes clear that even seemingly benign information, when processed in conjunction with other “harmless” data, can reveal deeply personal facts about an individual. In a world where correlative findings have valid predictive value, the definition of PII is no longer effective in pursuing its goal of protecting individual rights to privacy.

———————————————————————————————————————————-

Charles Lu is an editor on the Michigan Telecommunications and Technology Law Review, and a member of the University of Michigan Law School class of 2016.
