Data about data

“Metadata absolutely tells you everything about somebody’s life. If you have enough metadata you don’t really need content.”
- Stewart Baker

Many people feel unsafe when they know that someone else is collecting information about them. If an individual collects information about you and uses it to follow you, it is called stalking. But if an organization collects information about you and millions of others, it is often labelled a security measure. Organizations need a lot of storage space for this data. Earlier that space took the form of large rooms filled with paper; now it takes the form of hard drives. As the amount of data about each person grew, organizations that wanted to store it realized that they would either run out of space or have to spend more to add it. They had to come up with a way to store information in limited space.

Individuals who care about their privacy might be relieved to think that organizations may not collect data about them after all. But this is where metadata enters the picture. Metadata is data about data. For example, the data would be the recording of a phone call Bob made to Alice, while the metadata would include the time of the call, its duration, the location of Bob's phone, the location of Alice's phone and, of course, both their phone numbers and names. Storing the recording would require many megabytes of space, while the metadata requires only a few kilobytes. Many might ask what organizations can do with phone metadata without knowing the contents of the call.
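To put rough numbers on that size difference, here is a small sketch in Python. Everything in it is an assumption made for illustration: the field names, the made-up values in the record, and the choice of 64 kbit/s as the audio bit rate (the rate of the classic G.711 telephone codec). A real operator's record would look different and would likely carry more fields.

```python
import json

# Assumption: uncompressed telephone audio at 64 kbit/s (the G.711 rate).
AUDIO_BITS_PER_SECOND = 64_000


def audio_size_bytes(duration_seconds):
    """Bytes needed to store a recording of the given length."""
    return AUDIO_BITS_PER_SECOND // 8 * duration_seconds


# Hypothetical metadata record for one call from Bob to Alice.
# All field names and values are invented placeholders.
call_metadata = {
    "caller_name": "Bob",
    "caller_number": "+00 000 0001",
    "callee_name": "Alice",
    "callee_number": "+00 000 0002",
    "start_time": "2011-03-05T23:10:00",
    "duration_seconds": 3600,
    "caller_location": "cell tower A",
    "callee_location": "cell tower B",
}

audio_bytes = audio_size_bytes(call_metadata["duration_seconds"])
metadata_bytes = len(json.dumps(call_metadata).encode("utf-8"))

print("recording:", audio_bytes, "bytes")
print("metadata: ", metadata_bytes, "bytes")
```

Under these assumptions the hour-long recording comes to 28.8 million bytes, while the record describing it fits in well under a kilobyte.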

Phone metadata gives organizations enough information to map people's lives. Your location is recorded in order to provide continuous connectivity, so you do not even need to make a call to give it away. Now assume that an organization collects only the location metadata of hundreds of users. It can trace their movements and basically draw a map. If the locations of several users overlap at a particular time, it can be inferred with high probability that these users are associated with each other. If the location happens to be a football stadium owned by the local club, and the same group of users meets there every time a game is played, it can be inferred that they are followers of this particular club. Note that the users never volunteered the name of their favourite football club; it was learned just by tracking their location. This is a trivial example with only one kind of metadata. Much more can be learned if the organization also has other types of metadata, such as online search terms.
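The overlap reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(user, place, time_slot)` format, the function names, the sample data and the threshold of three meetings are all hypothetical, and real location data would be far noisier.

```python
from collections import defaultdict
from itertools import combinations


def co_location_counts(sightings):
    """Count how often each pair of users is seen at the same place in the same time slot."""
    present = defaultdict(set)
    for user, place, time_slot in sightings:
        present[(place, time_slot)].add(user)

    counts = defaultdict(int)
    for users in present.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(users), 2):
            counts[pair] += 1
    return dict(counts)


def likely_associated(sightings, min_meetings=3):
    """Pairs of users seen together at least min_meetings times (assumed threshold)."""
    counts = co_location_counts(sightings)
    return {pair for pair, n in counts.items() if n >= min_meetings}


# Made-up sightings: three users attend every match, a fourth attends only one.
match_days = ["2011-03-05 15:00", "2011-03-19 15:00", "2011-04-02 15:00"]
sightings = [(user, "stadium", day)
             for day in match_days
             for user in ("ann", "ben", "cat")]
sightings.append(("dan", "stadium", match_days[0]))

print(sorted(likely_associated(sightings)))
```

With this sample data, only the three regular visitors are linked to one another; the user who showed up once is not linked to anyone. Nobody in the data ever said which club they support.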

A lot of people would not mind letting others know their favourite football club. But let us say that two people are in an intimate relationship and want to keep it a secret. They often call each other late at night, and the calls often last more than an hour. After a few months, the calls become shorter, and after a few more months they also become less frequent and more irregular. From this kind of metadata, it can be inferred that there was a relationship between these two people and that it has ended. When there is metadata, who needs data?


© 2011 TU Delft