Is metadata collected by the government a threat to your privacy?

Seemingly unobtrusive digital bytes known as metadata have vaulted into the tech media limelight. What is metadata, and why is it suddenly so interesting to so many?

I must confess: I normally pay little attention to metadata. But when the term is plastered all over tech media, I begin to notice. So does President Obama; he referred to metadata during a recent press conference (Wall Street Journal):

[W]hat the intelligence community is doing is looking at phone numbers and durations of calls. They are not looking at people’s names, and they’re not looking at content. But by sifting through this so-called metadata, they may identify potential leads with respect to folks who might engage in terrorism.

President Obama mentioned only phone numbers and call duration. Tech media outlets report that metadata is also associated with email and social-networking services. I think it’s time to dissect metadata and see why it’s all over the news.

What is metadata?

Every dictionary I checked defines metadata as data about data. I’m sorry, but that’s not very helpful. Things got clearer when I read Understanding Metadata, a white paper by the National Information Standards Organization (NISO):

“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.”

The paper then divides metadata into three categories:

  • Descriptive metadata: A resource for discovery and identification, including elements such as title, abstract, author, and keywords.
  • Structural metadata: A way to define how objects are put together, for example, how pages are ordered to form chapters.
  • Administrative metadata: Information to help manage a resource, such as when and how it was created, file types, and who has access.
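
To make the three categories concrete, here is a minimal sketch of how they might be recorded for a single document. The class, field names, sample values, and the `find_by_keyword` helper are all my own assumptions for illustration; none of them come from the NISO paper.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceMetadata:
    """Hypothetical container with one dict per NISO category."""
    descriptive: dict = field(default_factory=dict)      # discovery and identification
    structural: dict = field(default_factory=dict)       # how the parts fit together
    administrative: dict = field(default_factory=dict)   # management of the resource


# Illustrative values only; this describes a made-up report.
report = ResourceMetadata(
    descriptive={"title": "Annual Report", "author": "J. Doe",
                 "keywords": ["privacy", "metadata"]},
    structural={"chapters": ["Introduction", "Findings"],
                "page_order": [1, 2, 3]},
    administrative={"created": "2013-06-10", "file_type": "PDF",
                    "access": ["staff"]},
)


def find_by_keyword(records, keyword):
    """Return the records whose descriptive keywords include `keyword`."""
    return [r for r in records if keyword in r.descriptive.get("keywords", [])]


matches = find_by_keyword([report], "privacy")
```

Notice that the search never opens the report itself. The descriptive metadata alone is enough to find it.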

Simply put, metadata summarizes information about data for the purpose of making that data easy to find and work with. Knowing that, the next step is to learn what kinds of metadata “organizations” find so interesting.

The Guardian

The Guardian has been at the epicenter of the NSA surveillance controversy right from the start, and is largely responsible for metadata’s meteoric rise into a household term. To its credit, The Guardian created a website to help those unsure about metadata determine whether their Internet travels leave a trail of metadata crumbs. Below, I’ve listed some of the more interesting crumbs mentioned by The Guardian.

Metadata associated with emails:

  • Sender's name, email, and IP address
  • Recipient's name and email address
  • Date, time, and time zone
  • Unique identifier of email and related emails
  • Mail client login records with IP address
  • Mail client header formats
  • Subject of email
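
As a rough illustration of how much of that list sits in plain view, the sketch below reads only the headers of a message using Python's standard `email` package. The sample message, addresses, and the `extract_email_metadata` function name are invented for this example; real messages carry many more headers, and I am not suggesting this is how any agency does it.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# A made-up message; every address and identifier here is hypothetical.
RAW_MESSAGE = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.org>
Subject: Lunch on Friday?
Date: Mon, 10 Jun 2013 09:30:00 -0500
Message-ID: <1234@mail.example.com>
In-Reply-To: <1200@mail.example.org>
Received: from [192.0.2.10] by mail.example.com; Mon, 10 Jun 2013 09:30:01 -0500

The body text is never read by the function below.
"""


def extract_email_metadata(raw):
    """Collect header fields only; the message body is not inspected."""
    msg = message_from_string(raw)
    sender_name, sender_address = parseaddr(msg["From"])
    recipient_name, recipient_address = parseaddr(msg["To"])
    sent_at = parsedate_to_datetime(msg["Date"])  # timezone-aware datetime
    return {
        "sender_name": sender_name,
        "sender_address": sender_address,
        "recipient_name": recipient_name,
        "recipient_address": recipient_address,
        "sent_at": sent_at.isoformat(),
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],
        "received": msg["Received"],
        "subject": msg["Subject"],
    }


metadata = extract_email_metadata(RAW_MESSAGE)
```

Names, addresses, the time with its zone offset, the thread identifiers, and the subject all come out without touching a word of the message text.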

Metadata associated with mobile phones:

  • Phone number of every caller
  • Serial numbers of phones involved
  • Time of call
  • Duration of call
  • Location of each participant
  • Telephone calling card numbers

Metadata associated with Facebook:

  • Username and profile bio information including birthday, hometown, work history, and interests
  • Username and unique identifier
  • User subscriptions
  • User location
  • User device
  • Activity date, time, and time zone

Metadata associated with web browsers:

  • Activity including pages the user visits and when visited
  • User data and possibly user login details with auto-fill features
  • User IP address, internet service provider, device hardware details, operating system, and browser version
  • Cookies and cached data from websites

That’s quite a list of data points organizations could be hoovering up.

Proper context

I asked some acquaintances if they were bothered that their metadata is likely being archived in a database somewhere – "not so much" was the general consensus. One of my friends, who happens to be a database manager, did not agree with the rest. She pointed out it would be easy to search and manipulate such a database, transforming seemingly disparate pieces of information into meaningful connections.

That data-mining capability is also upsetting privacy advocates, including Kurt Opsahl of the Electronic Frontier Foundation. He explained why in this Gizmodo article: “What they are trying to say is that disclosure of metadata -- the details about phone calls, without the actual voice -- aren’t a big deal, not something for Americans to get upset about if the government knows.”

Kurt then offered examples showing how assembled bits of data can lead to plausible conclusions:

  • They know you spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don't know what was discussed.
  • They know you called a gynecologist, spoke for a half hour, and then called the local Planned Parenthood's number later that day. But nobody knows what you spoke about.
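
The kind of inference in those examples takes very little code. Here is a minimal sketch, assuming a table of call records (caller, callee, start time, duration in seconds) and a directory that maps numbers to the kind of organization behind them. The phone numbers, the directory, and the `call_sequences` function are all hypothetical, and the one-hour window is simply measured from each caller's first call.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical lookup from phone number to the kind of organization behind it.
DIRECTORY = {
    "555-0101": "HIV testing service",
    "555-0102": "doctor",
    "555-0103": "health insurer",
    "555-0199": "pizza shop",
}

# Made-up call records: (caller, callee, start time, duration in seconds).
CALLS = [
    ("555-0001", "555-0101", datetime(2013, 6, 10, 9, 5), 240),
    ("555-0001", "555-0102", datetime(2013, 6, 10, 9, 20), 600),
    ("555-0001", "555-0103", datetime(2013, 6, 10, 9, 45), 420),
    ("555-0002", "555-0199", datetime(2013, 6, 10, 18, 0), 60),
]


def call_sequences(calls, directory, window=timedelta(hours=1)):
    """For each caller, list who they called within `window` of their first call.

    Callers with only one call in the window are left out, since a single
    call does not form a sequence.
    """
    by_caller = defaultdict(list)
    for caller, callee, started, _duration in calls:
        by_caller[caller].append((started, directory.get(callee, "unknown")))

    findings = {}
    for caller, events in by_caller.items():
        events.sort()  # chronological order
        first_call = events[0][0]
        labels = [label for started, label in events
                  if started - first_call <= window]
        if len(labels) > 1:
            findings[caller] = labels
    return findings


sequences = call_sequences(CALLS, DIRECTORY)
```

No call content appears anywhere in the input, yet the one sequence that comes back reads like a story. That is the concern in a nutshell, and it is also why a wrong guess is so easy to make.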

Kurt ends the Gizmodo article with the following insight:

[Y]our phone records -- oops, ‘so-called metadata’ -- can reveal a lot more about the content of your calls than the government is implying. Metadata provides enough context to know some of the most intimate details of your lives.

In the interest of fairness, here is an opposing opinion from the comment section of Kurt’s post:

There is the other side of the fence. They (government agencies) know:

  • A siding company called trying to sell me a new exterior.
  • Dan called, and we talked for three minutes.
  • They know I called up Pizza Planet, and spoke for 1 minute.

Personally, I don't care so much about the metadata they collect. It's what they plan on using it for that concerns me.

Final thoughts

Regardless of whether the capture of metadata is legal, metadata is being scarfed up wholesale. What concerns me is that conclusions based on the assimilated metadata appear to be mostly circumstantial. But the impact on individuals is real just the same.

Update: For those musically inclined, Bill Shipper, "The Singing Analyst," sent me a music video he created -- My Metadata





48 comments
homeronline

Most Americans cannot perceive a government doing evil to its citizens. History, however, proves the opposite. And for any government, we can be sure that absolute power does absolutely corrupt. This is why our Constitution explicitly limits the power of the federal government. It is an extremely dangerous trade-off to allow our personal privacy to be violated in the name of safety.

realvarezm

So it is ok for the government to have a naked picture of you, but it's secure since it is in black and white! Privacy is privacy; try to excuse it as you please, but that is the fact, and nobody should tolerate the government or any other organization gathering data about me or anybody else, whatever the purpose. That is a breach of your privacy, simple as that!

Adam_12345

This is quite an interesting article. Even if the government has my metadata such as name, surname, IP address, SSN, etc., then where is the big issue? If they have these details and so on, the question is still whether they're going to take some action against me or not. If I create 1000 fake Facebook accounts of other people, does it mean that I'm a metadata thief?

daboochmeister

Two comments:


- You're ignoring our constitutionally protected right to freedom from unwarranted search and seizure. Whether you care if the gov't knows every person you've corresponded with is irrelevant; constitutional law is clear: I have the right to have the gov't NOT know that unless they follow due process and demonstrate that there is a basis for suspecting me of a crime. The Constitution was specifically designed to act as a check on gov't's powers in such areas - and even the FISC is convinced the "metadata" gathering programs step well over a Constitutional line (and it all happened under a president who used to teach Constitutional law).


- Everyone stops at the documents that Snowden has released to date; he has more, and no doubt there are further revelations that extend the picture. E.g., does anyone believe that there are not other databases being spelunked, that also contain data that the gov't should not have access to without the burden of proof of obtaining a warrant?

jay swartz

The NSA is using misdirection and obfuscation to mollify us by couching the discussion in terms of metadata, knowing full well that the general public and lawmakers have no appreciation for the subtleties of data science. Metadata is indeed data that describes other data, but as has been noted, it's still data (aka personal information). Data that they feed into their algorithms to find obscure correlations.

P.S. I'm not related to tswartz if anyone was wondering.

shumphrey54

The crux of why this is serious and something we should all be concerned about is found in the two "examples." Take the first.

"They know you spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don't know what was discussed."

On the surface, we all know what was discussed, right? WRONG!!! I blogged about this precise point back before the NSA database came to light. My biggest fear is when someone looks at my browsing or phone metadata and creates the wrong picture. The example above might have been a reporter working on an article. But either way, accurate picture or inaccurate picture, this is very alarming.

My two cents.

sh

 

tswartz

More smoke/mirrors. NSA admitting they actually have read "some" actual recordings/emails, methinks they may have been referring to the number of things they looked at, not what they actually captured. I suspect that story will come out on an extremely slow news day, or buried w/ the next tragedy that captures mindshare for two weeks straight.

What is the fuss, you may ask? After all, you willingly give this info to Google and Facebook, for example? Well, one facet ignored by the brainwashed who say stupid things like this is that you willingly gave up that info (even though Google buries/obfuscates that you consented to sharing everything with them). NSA? I don't recall any clickware consenting to government spying on everything I do online. I do not consent, just to be clear.

Reality Bites

Since there hasn't been a single word of truth out of the NSA since it was formed, only the truly subservient believe anything they say.

NSA = Needless Slimy Agency run by deviants, perverts and sickos.

glwright1262

Wow. I work for a database company and all the stuff you listed would be classified as *transaction data*--not metadata.

midlantic

According to testimony in Congress, our friends at the NSA inadvertently did admit that along with the metadata they also have the actual voice recordings, emails, etc.

marks

What is all the fuss about? There is no material difference between the information that you put in "the cloud" and what is in the NSA database. The Google/Yahoo/etc. profiler databases about you and your habits are just as potentially damaging to you as what NSA is doing. So, get over it. When you go into a place that is uncontrolled, you are exposed to risk. Don't like that.....don't go there!

tvmuzik

Big Brother is watching us all, plain and simple. (yaaaaaaaawn. back to sleep)

oterrya

I agree with your definition of metadata (data about data) but disagree with some of the extensions made in your article. 

To show a simple example, let us consider a UserID/password scenario. 
The metadata might consist of (as a possible example):
1. There are two elements
2. UserID: 1-40 alphabetic characters in length
3. Password: 6-36 characters in length, must be at least 1 Cap, 1 lower case, 1 number and 1 special char.
End Metadata

Note -- the Metadata describes the form of the data, it does not contain any of the data.

A file that contains phone numbers and call durations is at the very least not only metadata.  If the file also describes the data contained therein, it has metadata included.  The phone numbers and call durations are way beyond Metadata -- they are, can I say it, DATA.

Again, Metadata is data about data.  It describes field lengths, content type (char, num, etc) relationships, restrictions and the like.  As soon as you put in a phone number, address, time or any other similar piece, you now have data.

If these folks with large presences are now defining the term Metadata to include actual Data, I have no doubt that it will become the definition.  It was not always so.

clickyspinny

This is old news guys.  Governments, companies and individuals have been scraping meta data for a very long time.  It's just done via scripts scouring the internet now, used to be done via the phonebook.  The gov isn't as bad as the private industry, scraping and scraping to market market market.

mcarr

While discussions about metadata are interesting in their own right, you do realise that there's no reason to stop there, right? They likely have all the content of the calls, as well as pretty much anything else they might want. They have been pressuring companies for passwords - that ain't related to metadata.

Also, spare a thought for the rest of us around the world. At least Americans can blame themselves for their loss of privacy - who might I blame?

pipervt89

The government document describing metadata is also almost a decade old (@Copyright 2004). 

1.) It's not just "your guys" or "my guys" that are doing this, they are ALL looking.

2.) They have been doing it for a long time

3.) What will we be referencing 10 years from now that isn't a big deal today given the direction we are heading? 

bluntsage

You might not care that the government knows you're on the phone with the pizza guy right now, but wait until some goofball in health and human services decides that metadata shows you're eating too much unhealthy food and decides to do something about it to "protect you" from yourself as well as save society money since it subsidizes your healthcare costs.

And that's not tinfoil hat paranoia talking. We now know that the NSA shares this metadata with other federal agencies such as the FBI, DEA, BATF, and IRS even when it's not a terrorism-related issue. We also know that this country has a long history of distorting legal behaviors in order to suppress those who engage in them -- starting with the Alien and Sedition Acts back in the 1790s and rolling forward to the "Red Scares" and McCarthyism in the 20th century. All it takes is a political swing, some words about "protecting you" and society, and bang, all that innocuous data you didn't care about becomes the evidence against you.

The real issue here is the government has no business cataloging the interactions between private citizens not suspected of doing anything illegal -- even if the parties concerned don't mind. If you wouldn't be fine with government agents coming in and making copies of all your mail, personal documents, photographing your possessions and cataloging the contents of your house -- not "reading" it per se, just collecting it as "metadata"  -- you shouldn't be fine with them collecting this information either.

The long term effect of this blanket surveillance will not be to thwart terror or make any of us safer -- those out to do harm have shown remarkable creativity and resourcefulness. What it will do is make people afraid to speak out and make us all less free.

"They who can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety." -- Benjamin Franklin


Datadad

The government doesn't "think" they should have unfettered access, they DO as a result of a number of factors:

1. When phones were first being installed, Ma Bell (later) partnered with the govt. because of the huge costs of a nation-wide phone system.
2. Previous snooping programs, e.g., McCarthy's "Red Scare", Patriot Act, etc., that had public support (votes) at their creation.
3. The plethora of active space-based surveillance platforms.
4. A continuing "scare campaign" by one of our major political parties, resulting in continuing support for their backers'/constituents' obtrusive/unnecessary/wasteful programs.

Rant and shake your fists at the "police state surveillance society" all you want, but we have what we have as a result of the overwhelming majority of the population wanting "protection" as long as they don't think/aren't aware that they're scooped up in that net also. While I'm frequently uncomfortable/amazed by the general population's myopic existence, I unfortunately realize that genies only get put back into their bottles in stories. Metadata is essentially just a more efficient tool to achieve what used to take much more labor-intensive methods.

“The sense of security more frequently springs from habit than from conviction, and for this reason it often subsists after such a change in the conditions as might have been expected to suggest alarm. The lapse of time during which a given event has not happened, is, in this logic of habit, constantly alleged as a reason why the event should never happen, even when the lapse of time is precisely the added condition which makes the event imminent.” 

-- George Eliot, Silas Marner

mtnman28715

The question isn't whether our privacy is being invaded, which it certainly is; the bigger question is why the government thinks it should have unfettered access to our data whenever and for whatever purpose it deems necessary. It shouldn't be allowed that access. All this 'war on terror' and 'keeping us safe' is smoke and mirrors for a police state surveillance society that has way overstepped its bounds.

Shaba Shams

Nope, as I'm not a criminal. But I will be concerned if government employees use my data for their personal interest.

ahanse

Welcome to the world of meta-data...

Love it or hate it: it is everywhere... As you have found out, the internet runs on meta-data, as do most things in our lives. Everything we do has some sort of data connected to it for us to proceed to our intended outcomes. Getting a job has the resume; what about applying for a loan, going to the doctor, or purchasing a phone? Now we have the internet and it is more or less hidden from view. Previously we have been wowed by the content, but eventually it took a hullabaloo by uninformed nincompoops to bring it to the fore, which is ultimately a good thing. The tax man, government agencies, the police, marketeers and others have enough on each and every one of us law-abiding citizens to fill volumes when we finally pass away. Now we have the internet to enhance the collection, so get used to it.

 
homeronline

@Adam_12345 Have you heard of the "Computer Fraud and Abuse Act"? In most cases, it is illegal to create fake accounts online. So, if you create 1000 fake Facebook accounts, you have committed 1,000 misdemeanors. And since the NSA is monitoring this blog, they can now use that information against you. Either do what they say, or you will be prosecuted.

I hope you will do some research and open your mind to how truly dangerous it is to give the government the capability to monitor your every move and all your personal data. And even if you don't care about your personal privacy, I would hope that as an American, you would respect mine and help protect my constitutional rights.

Michael Kassner

@Adam_12345 

The big issue is what they do with the data, or whether they make wrong assumptions from data that is for the most part circumstantial.

manwe

@daboochmeister The problem with that position is electronic communications are regulated, and the question is to what degree government agencies have access to them. We have shifting sands now, and no clear answers. New technology leaves new openings for new court decisions. We are not dealing with "papers" any longer. Your phone can be searched by police without a warrant, and tablets may fall under that rule as well. That's one reason I don't keep many personal files or much data on a smartphone. It's just a web access tool and a backup phone. I have a separate simple mobile with which I prefer to make calls and store numbers.

scheidel21

@codepoke Unfortunately I think this only makes the case that collecting this information is important to hunting subversives and terrorists. I get the idea that it's supposed to engender the idea that we as the United States would not have won independence if this type of data mining were available in revolutionary times. But in showing just how effective it was in tracking someone key in the revolutionary war you have shown how effective it is in general. People that implicitly trust the government, or fear terrorism more than they fear what the government could do, will see this example as a positive. Of course it's in the best interest of a government to protect its interests and want to quell rebellion, but we have to remember one man's freedom fighter is another man's terrorist. To the Crown our heroes of the revolution were nothing more than treasonous rats, but of course once independence was won they were the founders of a new government and heroes to the people they won freedom for.

Michael Kassner

@glwright1262 

Thank you for pointing this out. I find that really interesting -- is there a need for a legal definition of metadata?

It seems the NSA incorporates a different classification system.

scheidel21

@marks Not only do you choose to use those services, but you agree to abide by the terms to use those services with those organizations. In addition to that they can easily be regulated and reined in by the government if it does its job of regulation. However, the government has not traditionally done the best job at policing itself. The question is who watches the watchers? In the US it is supposedly the people, but some people don't care and trust the government implicitly and then you have the lack of transparency to block most of the others from knowing what's going on. I admit some obfuscation is needed but other than a public relations nightmare why shouldn't it be known that the US government is tracking this metadata?

Ultimately though a government can do far more harm to its people than a company can, and who protects you from the government?

Michael Kassner

@marks 

The fuss is the ability to actually do something with the data. Up until recently, datamining was not possible on this scale. So all that data captured earlier is only now becoming relevant. 


whitewolf60

@tvmuzik Some of us are not asleep.

I'm guessing that, while you're asleep, the door to your home is closed, perhaps even locked. Why would someone who has "nothing to hide" NOT leave their door ajar?

And don't be surprised if you wake up one day to the butt of a government rifle upside your head, DESPITE having locked your door!

Michael Kassner

@oterrya 

I think their perception is that the phone number is metadata describing the actual phone call -- similar to an IP address describing TCP traffic. 

And the listing of what is considered metadata above came from the Guardian release of what the NSA calls metadata.

iIekead

@oterrya Outstanding clarification of the difference between metadata and data.

Michael Kassner

@clickyspinny 

Not the point, actually. What is new now is the ability to datamine that extent of data with some degree of accuracy.

tswartz

@mcarr You hit the nail on the head! These organizations (governments and publicly traded companies) have no self restraint. Everything in their grasp is fair game, like a 2 yr. old in a candy store.

Michael Kassner

@pipervt89 

You are right, and I was surprised that I had a hard time finding anything more recent that was as comprehensive. 

tswartz

@bluntsage Unfortunately, none of these activities that have been revealed recently have anything to do with our safety, but rather the safety of the elites and government/political class. We can change this. Refuse to vote. Alternately, only vote for someone who has NEVER held public office, as you have to be captured by special interests in order to hold office for any length of time. Flood the system with outsiders who don't know the program. By the time they are corrupted, they are up for re-election. Lather, rinse, repeat until all the refuse is flushed out of DC.

Michael Kassner

@Datadad 

Thank you for the insightful comment. And the flashback to one of my favorite authors and book. I must revisit it and soon. 

william.purcell

@ahanse Right on! I'm far less worried about the metadata collected by government than the excruciatingly detailed information collected by corporate snoopers online. If you don't believe that ruthless businesses have your contact lists and call logs from smartphones, dream on.

Michael Kassner

@ahanse 

That is true, but... The ability to analyze vast quantities of data and the increased sophistication of algorithms are recent breakthroughs. It is these achievements that have elevated metadata to the forefront.

homeronline

@Michael Kassner @Adam_12345 I'm not so much worried about the government making wrong assumptions as making the right assumptions. If the IRS can look at metadata and know that you are for cutting taxes and reducing their revenue, they can audit you. If the FDA knows you are against legislation limiting herbal remedies, they can target your company. If the EPA knows you are for natural gas and against wind power, they will use that information to target you. If the Dept. of Education knows you are against property taxes and for home schooling, they can find ways to target you and pass laws to eliminate home schooling. Since government already impacts nearly every aspect of our lives, I could go on and on and on with examples...

Just please know that metadata can be very powerful stuff and given the opportunity, the government WILL use it to attack those who disagree with its policies or seek to limit its growth and power.

Michael Kassner

@Snak @clickyspinny 

That is ironic. And just to let you know, I have no idea what else shows up on any given day. I learn at the same time you do.