Visiting Web sites provides the Web host access to more information than you realize. It may be enough to create a traceable fingerprint.
After reading articles about user privacy, I noticed that members tend to discuss how they avoid being tracked (identified) while browsing the Internet. Most recently, in "GoogleSharing: A way to prevent tracking by Google," they recommended add-ons such as NoScript and Adblock Plus, as well as disallowing cookies.
I was in tentative agreement, but curious. So I started looking into whether those measures actually prevent tracking. After some searching, I ended up at the Electronic Frontier Foundation's (EFF) Web site. The answer I found was somewhat unexpected.
Some history
To explain, Netflix sponsors contests, offering huge sums of money to entrants who come up with creative answers to its data-mining issues. On the surface, that sounds harmless. Yet the contests use Netflix's actual member database. The company says the information is anonymized, but to what degree?
According to the EFF, if information like ZIP code, date of birth, and gender is part of the sanitized database, individual identities can be figured out. That's because individual pieces of information, when combined, work together to reduce entropy.
Entropy
Entropy, in the world of information sharing, is the term used to gauge how identifiable an object is. The EFF defines entropy as:
"A mathematical quantity which allows us to measure how close a fact comes to revealing somebody's identity uniquely. That quantity is called entropy, and it's often measured in bits."
It took me a while to figure it out, but more entropy means less identifiable. Thankfully, the EFF has a Web page explaining the mathematical process used to determine how much a piece of information reduces an object's entropy.
Identification
Since there are approximately 7 billion people alive right now, it takes about 33 bits of entropy (two to the power of 33 is roughly 8.6 billion) for a person to remain anonymous among them. Each identifying fact that leaks out reduces that number.
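The arithmetic behind that figure is easy to check. The short sketch below is only an illustration, not the EFF's code: the function names are made up for this example, and the population is the rounded 7 billion used above.

```python
import math


def bits_to_identify(population: int) -> float:
    """Bits needed to single out one member of a group of this size."""
    return math.log2(population)


def bits_revealed(fraction: float) -> float:
    """Bits of identifying information in a fact that is true of the
    given fraction of people (0 < fraction <= 1)."""
    return -math.log2(fraction)


world = 7_000_000_000  # rounded figure, an assumption for illustration
print(f"{bits_to_identify(world):.2f} bits")  # 32.70 bits, just under 33

# A fact shared by half of all people reveals exactly one bit.
print(bits_revealed(0.5))  # 1.0
```

The rarer a fact is, the more bits it reveals, and the less entropy is left to hide in.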
Another interesting point about entropy is that different pieces of identifying information have different entropy-reducing values. For example, knowing a person's birth day and birth month provides less information (leaves more entropy) than also knowing the birth year.
Web browsers
I'll bet you are wondering where I'm going with this entropy stuff. Well, the EFF feels that every Web browser provides enough unique information to tell one from another. Besides user accounts, IP addresses, and cookies, there is something called a User Agent string that can be used to further reduce the entropy of Web browser applications:
"Our experiment to date has shown that the browser User Agent string usually carries 5-15 bits of identifying information (about 10.5 bits on average). That means on average, only one person in about 1,500 (2^10.5) will have the same User Agent as you.
On its own, that isn't enough to recreate cookies and track people perfectly, but in combination with another detail like geo-location to a particular ZIP code or having an uncommon browser plug-in installed, the User Agent string becomes a real privacy problem."
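To see where the "one in about 1,500" figure in the quote comes from, and how quickly a second detail adds up, here is a rough sketch. The helper name is invented for this example, and the 8 bits assigned to the second detail is a made-up number for illustration, not an EFF measurement. Adding bits this way also assumes the two details are statistically independent, which real data often is not.

```python
def one_in(bits: float) -> float:
    """Size of the crowd that shares a value carrying this many bits."""
    return 2 ** bits


ua_bits = 10.5  # average User Agent figure from the EFF quote
print(round(one_in(ua_bits)))  # 1448, i.e. "about 1,500"

# Hypothetical second detail (say, an uncommon plug-in) worth 8 bits.
# Bits from independent details simply add together.
extra_bits = 8.0
combined_bits = ua_bits + extra_bits
print(f"one in roughly {one_in(combined_bits):,.0f}")
```

With the made-up second detail included, the crowd to hide in shrinks from about 1,500 browsers to one in several hundred thousand.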
If you are curious about your User Agent string, visit the Web site: What's My User Agent. Here is an example of a Firefox User Agent string:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7
Looking at the string, you can pick out the following information:
- Firefox version of 3.5.7
- Windows version NT 5.1, which translates to Windows XP
- The user's preferred language of en-US
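Picking those fields out takes very little effort on the server side. The sketch below is one possible way to do it with regular expressions. The function name, the patterns, and the small Windows lookup table are assumptions written for this example (the table is deliberately partial), and the sample string is a Firefox 3.5.7 User Agent of the kind shown above.

```python
import re

UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) "
      "Gecko/20091221 Firefox/3.5.7")

# Partial lookup table for illustration; many NT versions are missing.
WINDOWS_NAMES = {"5.1": "Windows XP", "6.0": "Windows Vista", "6.1": "Windows 7"}


def parse_user_agent(ua: str) -> dict:
    """Extract the three details discussed above from a User Agent string.

    Any field that cannot be found is returned as None.
    """
    firefox = re.search(r"Firefox/([\d.]+)", ua)
    windows = re.search(r"Windows NT ([\d.]+)", ua)
    # Language tags such as en-US sit between separators inside the parentheses.
    language = re.search(r";\s*([a-z]{2}-[A-Z]{2})\s*[;)]", ua)

    nt_version = windows.group(1) if windows else None
    return {
        "firefox_version": firefox.group(1) if firefox else None,
        "windows_nt": nt_version,
        "windows_name": WINDOWS_NAMES.get(nt_version),
        "language": language.group(1) if language else None,
    }


for field, value in parse_user_agent(UA).items():
    print(f"{field}: {value}")
```

Real User Agent strings vary a great deal between browsers, so patterns this simple would miss many of them. The point is only that the details are sitting in plain text for any Web host to read.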
That is three more chunks of information transmitted to the Web host, and each can be used to differentiate Web browsers.
Panopticlick
The EFF developed an application called Panopticlick to test the entropy of Web browsers. I must commend the EFF for providing this service. It is a great way to learn what information is automatically offered by Web browsers.
I ran some tests, ending up with both expected and unexpected results. The first slide shows what Panopticlick found with my Firefox browser in locked-down mode, with cookies disabled and NoScript disallowing everything:
That configuration contributed 10.55 bits of information. Let's see if allowing cookies changes anything:
The count is lower (8.65 bits), meaning this configuration is offering less information to the Web host. I suspect that not allowing cookies made my Web browser more distinctive. Let me know if you get the same results. Next, let's shut off NoScript and see what happens:
The EFF offers several solutions that will help prevent Web browser fingerprinting:
- Use NoScript, as it blocks Web sites from detecting plug-ins, fonts, and cookies.
- Use TorButton when accessing the Tor network. It changes the transmitted information to non-identifying values.
- Switch to a popular Web browser. It decreases the likelihood of being unique among other Web browser fingerprints.
The EFF admits it is finding it nearly impossible to find a non-unique Web browser. In fact, smart-phone Web browsers are the only ones that come close. That's because, for the most part, they are not changed from their default condition.
Final thoughts
Before researching this article, I was unaware that it's possible for a Web browser to have a unique and identifiable fingerprint. That said, running Panopticlick shows that methods promoting on-line privacy do work. All that's left to do is reduce the amount of identifiable information your Web browser is providing.
As an aside, I was trying to understand why the EFF used the term Panopticlick. Thankfully, Selena Frye, my editor at TechRepublic, informed me that it probably was derived from Panopticon, a prison design that allows prisoners to be observed without their knowing it. Somehow, the EFF's use of Panopticlick seems appropriate.