Why is my Internet different from your Internet?

At home you search for something on Google. Ten minutes later, at work, you enter the exact same query into Google, but get different results. Why?

December 4th, 2009 was a pivotal day for the Internet. Still, as Eli Pariser points out in his new book, The Filter Bubble, very few people noticed what the search giant Google had done. Fortunately:

"Search engine blogger Danny Sullivan pores over the items on Google's blog, looking for clues about where the monolith is headed next, and to him, the post was a big deal. In fact, he wrote later that day, it was the biggest change that has ever happened in search engines."

Filter bubble? What is it?

Mr. Pariser's book is titled after the phenomenon he calls the "filter bubble". He explains what it's all about in the book:

"The new generation of Internet filters looks at things you seem to like-the actual things you've done, or the things people like you like-and tries to extrapolate. They are prediction engines, constantly creating and refining a theory of who you are and what you'll do and want next.

Together these engines create a unique universe of information for each of us-what I've come to call a filter bubble-which fundamentally alters the way we encounter ideas and information."

What Google has known all along

For some time now, Google has been capturing the following information:

  • Search History: Google keeps track of what is clicked on in search results. If Google notices a certain site is picked more often, it will get a rankings boost.
  • Signed-Out Web History: This history is browser-centric. Google tracks all the searches and search-result selections.
  • Signed-In Web History: This history is user-centric. If the user is recognized by Google, everything is tracked.

Google uses the above data to provide customized-search results to signed-in account owners who give their permission.

What changed?

So what was this dramatic change? Google altered Personal Search, enabling it for everyone, not just those signed in, by using what they call signed-out customization:

"When you're not signed in, Google customizes your search experience based on past search information linked to your browser, using a cookie. Google stores up to 180 days of signed-out search activity linked to your browser's cookie, including queries and results you click."
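The mechanism in that description can be sketched in a few lines of Python. This is only an illustration of a history keyed by a browser cookie with a 180-day window, not Google's implementation; the class name `SignedOutHistory`, its methods, and the cookie ID are all hypothetical.

```python
from datetime import datetime, timedelta

# Retention window taken from the quote above; everything else is assumed.
RETENTION = timedelta(days=180)


class SignedOutHistory:
    """Hypothetical per-browser search history keyed by a cookie ID."""

    def __init__(self):
        # cookie_id -> list of (timestamp, query, clicked_url)
        self._events = {}

    def record(self, cookie_id, query, clicked_url, when):
        """Store one query and the result that was clicked for it."""
        self._events.setdefault(cookie_id, []).append((when, query, clicked_url))

    def recent(self, cookie_id, now):
        """Return events inside the retention window and drop older ones."""
        cutoff = now - RETENTION
        kept = [e for e in self._events.get(cookie_id, []) if e[0] >= cutoff]
        if kept:
            self._events[cookie_id] = kept
        else:
            self._events.pop(cookie_id, None)
        return kept


history = SignedOutHistory()
now = datetime(2011, 6, 1)
history.record("cookie-abc", "filter bubble", "http://example.com/a", now - timedelta(days=200))
history.record("cookie-abc", "eli pariser", "http://example.com/b", now - timedelta(days=10))
print([query for _, query, _ in history.recent("cookie-abc", now)])  # ['eli pariser']
```

Note that the history follows the cookie, not the person: clear the cookie, or use a different browser, and the sketch starts from an empty history.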

Turning Personal Search on for everyone concerned Mr. Sullivan. Calling it the "New Normal", he explains:

"The days of 'normal' search results that everyone sees are now over. Personalized results are the 'new normal,' and the change is going to shift the search world and society in general in unpredictable ways."

To put it another way, Mr. Sullivan mentions:

"Happy that you're ranking in the top results for a term that's important to you?

Look again. Turn off personalized search, and you might discover that your top billing is due to the way the personalized system is a huge ego search reinforcement tool. If you visit your own site often, your own site ranks better in your own results-but not for everyone else."

And, here I thought my articles were getting high rankings because of their merit. Ouch.
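To see how that "ego search" effect could arise, here is a minimal Python sketch. It assumes a made-up scoring rule (a fixed bonus per past click on the same site); Google's real signals and weights are not public, and the function name `personalize`, the `boost` parameter, and the example URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def personalize(results, clicked_urls, boost=1.0):
    """Re-rank results by adding a bonus for sites this browser clicked before.

    results: list of (url, base_score) pairs, identical for every user.
    clicked_urls: URLs previously clicked from this browser.
    boost: assumed bonus per past click on the same site.
    """
    clicks = Counter(urlparse(u).netloc for u in clicked_urls)
    scored = [(score + boost * clicks[urlparse(url).netloc], url)
              for url, score in results]
    # Highest score first; sorted() is stable, so ties keep their base order.
    return [url for _, url in sorted(scored, key=lambda pair: pair[0], reverse=True)]


base_results = [
    ("https://bigsite.example/search-tips", 3.0),
    ("https://othersite.example/guide", 2.5),
    ("https://myblog.example/article", 2.0),
]
my_clicks = ["https://myblog.example/", "https://myblog.example/about"]

print(personalize(base_results, [])[0])         # https://bigsite.example/search-tips
print(personalize(base_results, my_clicks)[0])  # https://myblog.example/article
```

With no history the base order wins. With two past visits to my own site, my article jumps to the top, but only in my results.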

PageRank and then some

PageRank is how Google rates web pages; it made Google famous and more than a few people rich. In 2009, Google altered its holy grail in order to revamp Personal Search. Mr. Pariser, in his book, points out that Google now uses 57 different variables or "signals" to create search results tailored specifically for you. Some of the known signals are:

  • Search history
  • Location
  • Active browser
  • Computer being used
  • Language configured

I suspect the other 52 will remain secret, much like the formula for Coke.

What it all means

Ever have one of those feelings that something doesn't seem right, but you can't put your finger on it? I suspect that's why it took me until now to realize the implications of Google's Personal Search, and why Mr. Pariser has spent a great deal of time and effort coming to his conclusions.

I'm glad I read the book. Understanding Mr. Pariser's concerns will help me gauge search results more realistically. For the time-challenged, Mike Elgan offers a synopsis of the book in his blog post, How to pop your Internet 'filter bubble':

"In this column, I'm going to tell you how personalization works, why you may not want it, and also how to pop the bubble and opt out of a system that censors your Internet based on stereotyping."

I found the following tips by Mr. Elgan useful:

  • Deliberately click on links that make it hard for the personalization engines to pigeonhole you. Make yourself difficult to stereotype.
  • Erase your browser history and cookies from time to time.
  • Use an "incognito" window for exploring content you don't want too much of later.
  • Use Twitter instead of Facebook for news. (Twitter doesn't personalize.)

Update: As for Twitter and Facebook, I just read a Yahoo Finance article prepared by WSJ and felt compelled to share it with you. The article refers to the Facebook "Like" button and Twitter's "Tweet" button that are displayed on web pages:

"These so-called social widgets, which appear atop stories on news sites or alongside products on retail sites, notify Facebook and Twitter that a person visited those sites even when users don't click on the buttons, according to a study done for The Wall Street Journal."

The article goes on to explain something that may surprise you:

"For this to work, a person only needs to have logged into Facebook or Twitter once in the past month. The sites will continue to collect browsing data, even if the person closes their browser or turns off their computers, until that person explicitly logs out of their Facebook or Twitter accounts."

How about that?

An afterthought

The advantage afforded those with the ability to manipulate search-engine results is huge. And, I was interested in learning what Mr. Pariser and Mr. Sullivan thought about that. Time did not allow Mr. Pariser to respond. Mr. Sullivan did.

Kassner: Ultimately, my concern is how do we know that queried search results are not forced biases leading us to follow someone else's agenda?

Sullivan: I think despite personalization, the search results still reflect lots of diversity. I also think that results are only the start of research into a new area. Wherever you end up, you'll probably get some pointers to other material-and that also leads to greater diversity.

I also think it's easy to assume the worst. My friends are all liberal (let's say), so I'll never see anything but a liberal view of the world. Perhaps. But the reality is that some of your friends will probably point toward some anti-liberal material, as part of their discussions. And that's exposing you to more diversity.

Assuming the worst, Google could intentionally try to bias its search results to a particular view. But that assumes there's a particular view on literally billions of unique searches that are done each month. There's just not. Some of them have no particular slant one way or another. But even if you managed it, as I said, some of those resources (just like your friends) will point toward content they don't agree with.

The challenge isn't that we won't get exposed to contrary statements. The challenge is that people are seemingly more and more happy to ignore contrary material and create their own beliefs without any critical thinking. "True Enough" is a good book on this topic. Perhaps this really isn't something new but rather has always been there. But it sure feels new to me.

Kassner: I am seeing people preferring to use links mentioned by Twitter and Facebook. They trust those opinions over the search engines. Are you seeing that as well? Do you see this as a growing trend?

Sullivan: I do see it growing, and it's because our social networks offline have "caught up" to being as accessible as search engines for quick answers. We can ask many people for answers to anything, and that's particularly attractive for subjective questions where there's no right answer, where we want opinions from those we know.

Kassner: What is your opinion on the general health of search today?

Sullivan: I think the general health is actually pretty good. We should look for search engines to do more to increase quality, which means probably relying less on the link-based systems of ranking that worked in the past and more toward using social signals as well as our own behavior.

Kassner: Good advice. I intend on heeding it.

Final thoughts

My goal is to make you aware of what Mr. Pariser calls the filter bubble, and to explain why my Internet is different from your Internet. Just knowing search customization is happening is more than half the battle.

I learned a great deal from Mr. Sullivan about a subject I thought I understood. I was wrong and I thank him for his help.

About

Information is my field...Writing is my passion...Coupling the two is my mission.

43 comments
The_Timelord

It's .NOT. only a 'filter' bubble .OR. a certain direction.

When I submit the same question at the SAME TIME on 2 different IPs (one cable, one ADSL), I get different answers, companies, even different directions.

It seems to be the preferences from the primary connections.

Slayer_

and it popped up "vb6 detect 64 bit OS", the exact thing I was looking for. Creepy.... Also, FYI, the solutions don't really work: whatever the API returns as TRUE is not really TRUE, and it's not false either. I had to CStr it, then CBool it again.

ps.techrep

Just like you should flush the toilet after every use, and you shred documents that contain personal information, configure your systems to automatically flush caches, cookies and history each time you exit your browser. It's good basic hygiene.

l_e_cox

Isn't this Google "feature" basically just a marketing gimmick to impress their advertisers? We should remember that (as far as I can tell) Google makes its living by selling advertising. So they want to convince advertisers and potential advertisers that when they sign up with Google, their ads will be seen by the people most likely to respond. The more ways Google can show their customers that they are delivering more potential business to them than any other search engine, the more likely they will be to sign on with or stick with Google. I have no idea how much this actually affects search results. The article seems to imply that Google has some sort of "social engineering" agenda, but I think this is unlikely. What is more likely is that those who DO have such agendas could force Google, through governments, to acquire data for them or share data with them. There is some concern, then, that this whole "personalization" technology could backfire on the "free enterprise" culture and turn it into a total control culture. The potential is there.

Who Am I Really

as I delete their stupid cookies at every browser close and several times during a session. I know they keep the tracks on their end, but I don't know how much, if anything, it changes / filters etc.

Every time I turn off / power cycle or "restart" the modem I get located in a different city, according to the sidebar on the g search results page. Today they think I'm in Edmonton? WTH.

And I get a different IP address, which is then NAT'ed again by the ISP. The modem says it has "X" IP address; today it's in the 25.xxx.xxx.xxx range, and I've had addresses in the 10.xxx.xxx.xxx range also. Services like "what's my IP" show the IP as "Y", which today is in the 74.xxx.xxx.xxx range and is shared among many customers of the ISP.

howiem

Michael, I was a bit concerned after watching Eli Pariser's video when he said, "We get trapped in a 'filter bubble' and don't get exposed to information that could challenge or broaden our worldview." My concern is the phrase "and don't get exposed". Who should the "exposer" be? Is Pariser advocating that someone or something should decide for us that we are not getting enough exposure to what they think we should get? Isn't he telling us that there should be a filter bubble, but instead of Google being the "exposer" it should be someone or something else? This smacks of the thinking in Cass Sunstein's article "The Daily We" back in 2001 (http://bostonreview.net/BR26.3/sunstein.php), where he is advocating that there be information controllers to make sure we stupid peons get "balanced" and "fair" content. Do we really want government or anyone else dictating the content we are allowed to see? Who is going to be the "exposer"?

Slayer_

I was wondering what they changed, why my father can always find stuff on the net that I can't, and that the google guessing thingy when you type always seems oddly accurate. Just yesterday I typed in Police and it popped up "Winnipeg Police Credit Union" The exact thing I was looking for. I had never searched it before. I guess because I search up a lot of Credit union websites, it figured it out. I just tried again with 1st choice, the correct option was the 8th in the list. Confused it a bit with "Leroy" Had to type "Leroy C" and it figured it out. Pretty cool I say...

jck

signed-off, browser-based result prediction would lead me to believe i'd get more adult-oriented search results. :-0 :^0 :p EDIT: not at work...just at home. ;)

Spitfire_Sysop

http://www.startpage.com/ By the nature of this site they will not be doing this to your search results because they don't keep records. I have found that I get wildly different results from ixquick than I do from google. Sometimes this is good and sometimes it is not. It's always nice to have a second opinion. Try yahoo.com too, they still exist! My point is to slowly get away from google, one search at a time. They will miss you. ;-D

bboyd

Security researcher topic #14, Learn how to manipulate spoofed computer signature to make it appear to Google that target A has certain search priorities in order to divert target to phishing site made just for them. Guess I'll move more trust to scroogle and put my browser on a flash drive.

Thomas907

If Google searched About 5,880,000 results (0.28 seconds), why can't I see the 4,880,000th result? What is it or how do I know they went through that many if they can't or won't show any particular result after about 1,000?

pgit

I've been a search ninja going back to altavista days. I have used scroogle as the primary interface, which scrubs all but one of the 57 signals; google gets the search terms in the aggregate and nothing more from me. But, now that you mention it, around the beginning of last year I noticed a divergence between the value of results I get from scroogle versus going through google directly. The latter suck now. I have heard some 'alternative media' types saying they have more than just the opinion that one thing google is doing is stifling non-corporate viewpoints. (and the biggest corporation of all is "government") I have found myself often saying, and have heard many others saying the same, that things we used to be able to find on the internet (very easily) just don't seem to be available any more. I used to be able to search up some arcane thing I'd find myself talking to someone about, but recently those same sites don't come up in the same search I used to find them before, and overall the results are often irrelevant. I have suspected that copyright, so-called "intellectual property" and other fancy names for greed have prompted google to "optimize" search away from the wild west of distributed content, into the hands of fewer and fewer "interested" (invested) channels. I could give an example of something I used to find routinely, but nowadays if there isn't some channel that's paid the appropriate interested parties in order to provide the content, it's simply not available anymore. (1960's TV content, shows and commercials) It's not necessarily that the content I used to find is not out there, I'm sure a lot of it is. It's that google doesn't help me find it any more. BTW I'm sure google has helped "authorities" find and remove a lot of "unlicensed" content over the years. I've always viewed google more as a clamp on the internet than any gateway.

seanferd

The really horrific part is that, for practical purposes, the filters are teh suck. They aren't filtering out the junk I do not want, despite having plenty of training time. I'd swear Google is not paying attention to the search results I click. the similar function is sometimes sadly lacking. And the advanced search function for "don't include results for pages containing these words" could use an adjunct to help with negative filtering, such as a not similar filter, once similar works better than it does now. Perhaps that is more horrific to me, but certainly, tailored results of the nature which Google is providing have other negative effects as noted. Who needs confirmation bias when bias confirmation is available from a search engine?

NickNielsen

I want one of my own! ;) Actually, Michael, I'm waiting for my copy of the book to come in. I saw Eli's talk at TED.com (I think I got the original tip from CNN). I've been back to watch it a couple more times, and each time I catch more. I finally decided to order the book. The problem Google and Facebook have, Michael, is that people with eclectic tastes (I'll include the two of us. :D ) are very difficult to categorize. I like knowing the sources of information, so I'll click on links to just about anything if I'm interested. I don't doubt it gives Google fits.

Michael Kassner

To see the results on a completely different computer.

Michael Kassner

But, until all 57 signals are known, yours is only an assumption. Test it. Run a search with Google and then the same query with Scroogle.

Michael Kassner

Whether Google or other search engines capture data and what they do with it is not what I am musing about in this article. What concerns me is most people think search results are the same for different people and across different browsers. Neither is the case. I would want to know that, and how it is skewed because of what is known about me.

ultimitloozer

how you were able to do anything on the internet with a 10.x.x.x IP address since it is a private address block and not internet routable.

Michael Kassner

Use a different computer and browser, then compare results. I use Scroogle as a comparison, supposedly they do not filter anything. I see dramatic differences every time I compare Google results with it, and I am locked down for the most part.

Michael Kassner

By Google customizing search results, we are not getting exposed to all (what I call raw) results. They are tempered based on your signals and previous searches. That is what he is wanting us to understand. He used an example of searching for a topic and selecting several of the liberal viewpoints to read. Next time the exact same subject was queried, all that surfaced were liberal viewpoints. That is his concern. It's like a looping mechanism, where results feed off of previous results in an ever-tightening view point. I have been using Scroogle lately to test if it's happening to me.

Michael Kassner

That is my goal for this article. Once, we know it's happening, it can be used to advantage or avoided if so desired. If I want everything, I use Scroogle.

Michael Kassner

But, I thought I read that Yahoo is also doing customization.

Michael Kassner

Is Google also has the signed-out version that is customized for each particular browser. So, it will still be different.

Michael Kassner

I am finding it's not just Google. And, to be fair, most search engines that are doing this feel it is a service. What Sullivan and Pariser want to point out is that the process has ramifications. I would love to read what you thought about the book, after reading it. It was more than interesting in my case.

Michael Kassner

That you are tightly controlling your browser. Doing so, takes many of the 57 signals out of play. Also, I suspect this customizing is mainly aimed at what TPV adverts you are served. Those, I also suspect are rigorously corralled.

Michael Kassner

If you are not aware of this and use a different computer, you will get results that are customized for that browser. Being aware of what search engines are doing should help though.

pgit

I did the same search and got my own tailored results. Top hit was wikipedia, because very often I am looking for very basic information of something arcane, probably never heard of before like some author, a movie or book, or a concept. (philosophical, taxonomic, what have you...) I hit wikipedia a lot after searching things, in general. Second and third, and fifth and beyond were tutorials, resource centers and other "how to" regarding visual basic 6, all of entry level, introductory type info, certainly nothing as specific as detecting a 64 bit OS with vb6 scripting. The fourth was a "download center." I suppose I have sought downloads frequently enough that the goog offers me a dl link near the top. Often I will hear of some app or OS etc, do some basic research as to whether it's worth my time researching further, and then downloading those projects I deem worthy of a hands on look. So it seems we are both rather well known by google.

Who Am I Really

get this! An HSPA+ 3G wireless connection. I can't get cable or even ADSL where I'm located.

When I first got it connected I had several different 10.x.x.x addresses over several weeks, and now I'm getting 25.x.x.x addresses, which are different every time the hub reboots itself or I reboot it because the signal has gone to crap.

But whenever I do a "what's my IP" I get an external IP address of my ISP, and the external IP address of the modem is not the one returned by what's my IP.

etu

Who Am I Really

g search: results identical to the results in Firefox, and they also match the results for the same search on the other system using Firefox.

y search: only the top result matched the g search.

Who Am I Really

I'll do the unthinkable and open IE to see what happens with the results. Between the two systems (Office / Home Office) I run an identical Firefox config:

  • NoScript
  • AdBlock Plus
  • FlashBlock
  • BetterPrivacy

and I generally get similar results. E.g. "0" always gives me the same first page of results on both systems, with the only things I see changing being the bottom few on the page jockeying for position, and the count. The last search for 0 gives About 25,270,000,000 results (0.07 seconds).

seanferd

Taking the second point first, Google still does arrange results differently per user, although ads certainly are a focus. If they did it well, I'd more often use my less "filtered" browser accounts with the Goog's data-gathering turned on at my Google dashboard (to address your first point). But these are very good points, indeed, and I certainly may do things which radically skew results I would expect to be more "personalized", since what Teh Algorithmz thinks is relevant, I frequently seem to find otherwise. ;) But targeted ads in general; oh boy. Targeted ads, targeted by whatever means, and targeted per user or per page, quite often hit the exact opposite of their target. (Somehow, no one has bothered coding their analytical monstrosities to understand negative references.)

Michael Kassner

That eliminates using the routable IP address as one of the signals. Interesting, one advantage to IPv4 versus IPv6 possibly.

Who Am I Really

I don't allow any cookies, history, etc. to be stored on any machine I touch, so wherever this profiling is happening must be on their servers, using the "undisclosed" methods.

The XP-64 system is on a corporate domain; the other system is on my home office workgroup, where I hardly search for anything (but when I do, I add garbage searches in between).

The funny thing is, when I do searches, nothing in the results appears targeted at me. Even to get to TR I don't use a bookmark; I use the g search box on a blank page, and you'd think by now that it would suggest TR first. Yet it doesn't until I enter the "r": I have to type techr before the suggestion for TR comes up.

Michael Kassner

My point is that your results will be different from someone else's. Ask a different person on a different computer to do the same.

Who Am I Really

eg "jump in the lake" will produce the same results on both machines, and on the XP-64 box in both browsers, Firefox and IE7.

g = google g search
y = yahoo y search

etu

Who Am I Really

on 2 different systems on 2 different ISPs:

System 1: XP SP3, Firefox 3.6.17 (IE6 not allowed to run)
System 2: XP-64 x86, IE7, Firefox 3.6.17

The search results from the day before on XP SP3 w/ Firefox & g search were the same the next day on the XP-64 machine in both IE7 and Firefox using "g" search.

Then, switching the search engine to Yahoo in Firefox on the XP-64 system and sending the same search, only the first result matched; the other 9 on the results page were different, and the results count was significantly smaller, by about 23 mil (g search: over 25 mil; y search for the same: 2 mil).

etu

Michael Kassner

What you mean. Could you explain in a bit more detail? Thanks.

Michael Kassner

I only log on to Amazon when I know exactly what I want. Searching for various stuff skews the results Amazon provides the next time you log on.
