Privacy on the Internet has always been a hot topic, whether the discussion is about political dissidents using Tor to talk freely about their governments or Facebook users worried about invasive advertising. As more and more personal information is put online (or sent into the cloud), there is constant discussion about the erosion of personal privacy rights or the dilution of personal space. Those metaphors imply slow and progressive change, but with new techniques and research technologies it may turn out that traditional privacy rights have instead undergone sublimation, and are already slowly drifting on the wind.
Last week The New York Times published an article on the MIT Media Lab and a project it calls Reality Mining. Reality Mining is a more specific form of data mining that uses massive numbers of data points from real people to learn about behaviors and trends within groups of people. From the Media Lab's home page:
Reality Mining defines the collection of machine-sensed environmental
data pertaining to human social behavior. This new paradigm of data
mining makes possible the modeling of conversation context, proximity
sensing, and temporospatial location throughout large communities of
individuals. Mobile phones (and similarly innocuous devices) are used
for data collection, opening social network analysis to new methods of
empirical stochastic modeling.
Pulling Data From All Around Us
Reality Mining takes advantage of the now-ubiquitous mobile phone to study human behavior. In its current project the Media Lab gave each of 100 MIT students a brand-new smartphone in exchange for complete transparency in their lives: who the students talk to, what music they listen to, what applications they use on the phone and how they use them. In short, everything. The students are all assured that the data collected will be anonymized to protect their identities before being released.
Many people are willing to trade their privacy for certain benefits, for example giving up details about their personal lives on a site like Myspace or Facebook in order to stay in better contact with old friends. Many people are also willing to trade certain rights in the interests of science, that is, to participate in a study, and this study is certainly audacious. From the Reality Mining page:
The Reality Mining project represents the largest mobile phone
experiment ever attempted in academia. We are collecting an
unprecedented amount of data on human behavior and group interactions
that we plan on anonymizing and making available to the general
academic community. By the end of the experiment, this dataset will
contain over 500,000 hours (~60 years) of continuous data on daily
human behavior. Already we have been approached by over a dozen of
researchers in a wide range of fields (including epidemiology,
sociology, physics, artificial intelligence, and organizational
behavior) who are extremely eager to see how this unique data can
answer questions from their own discipline.
500,000 hours of continuous data. But let's go back to that promise. The promise is that all information is scrubbed before being passed on, so there's no way of determining an individual's actions based on their usage, and individual privacy is preserved. But that seems to be something of a catch-22. These experiments are meant to study a person's actions to see how predictable they are. If it turns out their actions are extremely predictable, then I'll be able to identify a person based solely on their habits, given reasonable data and reasonable resources.
How about a practical example? In August 2006 AOL, in an attempt to reach out to the academic community, released an anonymized set of searches made through its service. The goal was to give academic researchers the chance to crunch some numbers on some real search data. The data released covered 20 million searches from 658,000 users, or an average of roughly 30 searches per user. And, as it turns out, you can identify individuals by name from anonymized searches.
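To make the problem concrete, here is a minimal sketch using made-up data. It assumes each user in the released log has been replaced by a stable numeric ID, so the name is gone but every query from the same person still carries the same ID. The log rows, the ID values, and the helper names `profile_by_user` and `average_queries_per_user` are all hypothetical and invented for illustration; this is not the actual AOL data or format.

```python
from collections import defaultdict

# Hypothetical "anonymized" log: (pseudonymous user ID, query).
# Every row here is invented for illustration.
LOG = [
    (1001, "landscapers in springfield"),
    (2002, "cheap flights to denver"),
    (1001, "homes sold in maple ridge subdivision"),
    (1001, "jane doe springfield"),
    (2002, "denver weather"),
]


def profile_by_user(log):
    """Group queries by pseudonymous ID, keeping them in log order."""
    profiles = defaultdict(list)
    for user_id, query in log:
        profiles[user_id].append(query)
    return dict(profiles)


def average_queries_per_user(total_queries, total_users):
    """Plain average, used to check the per-user figure quoted above."""
    return total_queries / total_users


profiles = profile_by_user(LOG)
for user_id, queries in profiles.items():
    # No name appears anywhere, yet the combined queries for ID 1001
    # point at a town, a neighborhood, and a person's name.
    print(user_id, queries)

# 20 million searches over 658,000 users: roughly 30 per user.
print(round(average_queries_per_user(20_000_000, 658_000), 1))
```

The point of the sketch is that the numeric ID does the damage: any one query is harmless, but thirty queries tied to the same ID start to look like a biography.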
Privacy Tradeoffs
Of course, this is only relevant if the benefits to society are less than the damage that could be done by a loss of personal privacy. Dr. Thomas Malone, director of MIT's Center for Collective Intelligence, comments at the end of the article:
"For most of human history, people have lived in small tribes where
everything they did was known by everyone they knew," Dr. Malone said.
"In some sense we're becoming a global village. Privacy may turn out to
have become an anomaly."
I'm not an expert in sociology, so I hope this comment is in line, but this seems like a textbook example of the naturalistic fallacy. Paraphrasing from the Wikipedia page on the naturalistic fallacy, it is the claim that "what is natural is inherently good or right, and that what is unnatural is bad or wrong". The implication from the statement seems to be that privacy is a very recent phenomenon, something that hasn't existed long and therefore shouldn't be expected.
For most of human history, humanity as we know it hasn't existed, and it changes all the time. You could take that as a point for either side of the argument, but if we're truly becoming a global village, that is, a single village with 6.4 billion people in it, that makes me desire my own personal privacy all the more. The idea of privacy is to give people their own space, a small piece of calm refuge from the outside world, whether that world is a village or a metropolis (or a modern-day superposition of both).
And while privacy as a widely held expectation may be a recent creation, the idea has been around for some time. Stanford's Encyclopedia of Philosophy page on Privacy (and critiques of privacy) lists Aristotle's writings as the first major work on privacy, in which he separates the "public sphere of politics and political activity, the polis, and the private or domestic sphere of the family, the oikos, as two distinct spheres of life".
Societal Benefits
Dr. Alex Pentland, of MIT's Media Lab, leans more heavily on the benefits that can be gained from these large data mining projects, while still trying to protect a user's privacy:
DR. PENTLAND says there are ways to avoid surveillance-society
pitfalls that lurk in the technology. For the commercial use of such
information, he has proposed a set of principles derived from English
common law to guarantee that people have ownership rights to data about
their behavior. The idea revolves around three principles: that you
have a right to possess your own data, that you control the data that
is collected about you, and that you can destroy, remove or redeploy
your data as you wish.
...
Citing the epidemic involving severe acute respiratory syndrome, or
SARS, in recent years, he said technology would have helped health
officials watch the movement of infected people as it happened,
providing an opportunity to limit the spread of the disease.
“If I could have looked at the cellphone records, it could have been
stopped that morning rather than a couple of weeks later,” he said.
“I’m sorry, that trumps minute concerns about privacy.”
Infectious disease tracking seems to be one of the more immediate uses of tracking individual movements and actions within a population. Several weeks ago Google released a new tool called Google Flu Trends, which is built on search trends. It turns out that you can track the flu and, presumably, other diseases as well based on what people are searching for in a given area.
My only real concern is with the final quote. When we're talking about other people's cellphone records and other people's personal data, privacy concerns always seem minute. English common law principles don't carry over neatly to the massively digital world we've created today. The right to possess your own data and to control the data that is collected about you relies on the implicit assumption that you know data is being collected about you.
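As a rough illustration of the idea, here is a small sketch that counts what share of searches in each region and week mention a flu-related term. This is not Google's method; the keyword list `FLU_TERMS`, the sample rows, and the `flu_share` helper are all assumptions invented for the example, and a real system would need far more data and far more care.

```python
from collections import Counter

# Hypothetical keyword list; not Google's actual query set.
FLU_TERMS = ("flu", "fever", "influenza")

# Invented sample rows: (region, week number, query).
SEARCHES = [
    ("Boston", 1, "flu symptoms"),
    ("Boston", 1, "pizza delivery"),
    ("Boston", 2, "fever and chills"),
    ("Boston", 2, "influenza treatment"),
    ("Austin", 1, "concert tickets"),
    ("Austin", 2, "flu shot near me"),
]


def flu_share(searches):
    """Return {(region, week): fraction of queries mentioning a flu term}."""
    totals, hits = Counter(), Counter()
    for region, week, query in searches:
        key = (region, week)
        totals[key] += 1
        if any(term in query.lower() for term in FLU_TERMS):
            hits[key] += 1
    # A Counter returns 0 for missing keys, so quiet weeks come out as 0.0.
    return {key: hits[key] / totals[key] for key in totals}


shares = flu_share(SEARCHES)
for key in sorted(shares):
    print(key, shares[key])
```

A rising share in one region from one week to the next is the kind of signal such a tool would look for. Note that nothing here requires knowing who searched, only where and when.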
Over the past several years surveillance, whether in the form of video cameras or online click-tracking, has increased considerably. Most people have no idea what information they're sending to remote webservers, what databases that data gets entered into, or where that data goes. The amount of time it would take to track down every company that collects information online and request the data it holds about you is mind-boggling. Every website you visit has pieces of information about you: IP address, operating system, browser, what other sites you've visited recently, etc.
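For a sense of how little effort this takes on the server's side, here is a minimal sketch that pulls those details out of a single page request. The request text, the address in `CLIENT_IP` (taken from a range reserved for documentation), and the helper names `parse_headers` and `visitor_summary` are all made up for illustration; the header names themselves (`User-Agent`, `Referer`, `Accept-Language`) are standard HTTP.

```python
# Invented example of the text a browser sends with a page request.
# The IP address is not part of the headers; the server learns it from
# the network connection itself.
CLIENT_IP = "203.0.113.7"
RAW_REQUEST = (
    "GET /article HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.0) Firefox/3.0\r\n"
    "Referer: http://example.org/search?q=privacy\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)


def parse_headers(raw):
    """Turn the header lines of a raw HTTP request into a dict."""
    headers = {}
    for line in raw.split("\r\n")[1:]:  # skip the "GET ..." request line
        if not line:  # a blank line marks the end of the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def visitor_summary(client_ip, raw):
    """Collect what one request reveals about the visitor."""
    headers = parse_headers(raw)
    return {
        "ip": client_ip,
        "browser_and_os": headers.get("user-agent"),
        "came_from": headers.get("referer"),
        "language": headers.get("accept-language"),
    }


summary = visitor_summary(CLIENT_IP, RAW_REQUEST)
for field, value in summary.items():
    print(field, "=", value)
```

None of this requires cookies, logins, or any cooperation from the visitor. It arrives with every request, which is why it is so hard to know who holds what about you.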
In a world where massive data mining and statistical research work to peer into the seemingly inscrutable, a system of explicit opt-out doesn't seem like it would have any benefit for personal privacy whatsoever. Though, if the choice is between a person's privacy and a person's life, in the form of stopping the early spread of a disease, who's to say that privacy is worth the trade?