If you're not careful, the data you're collecting for your marketing analysis may be telling you the wrong story about your target market. In a previous post, I talked about the dangers of not sizing up your sample properly. Today, I'd like to address a related problem: sampling bias. Sampling bias happens when the data you're collecting for your predictions does not represent the market you're trying to target. It's a classic analytic gotcha that, for some reason, is getting a lot of big data marketers and strategists in trouble.
Again, I think it's an awareness/communication problem between the leaders and the data scientists. Data scientists know better, but they're not raising the issue with the leaders who don't know any better. The problem starts with a mismatch between the way you define your market and the way you collect data about your market. A good example is a non-incentivized customer survey. The people who respond to these don't comprise a good representative sample of your customer base; they are people who already have a behavioral disposition for engaging with your company (good or bad). However, I see leaders making marketing decisions based on this feedback all the time.
With the emergence of social media analytics, the same poltergeists are reappearing in a different form. Marketers and strategists are eager to eavesdrop on their customers by digitally listening to their conversations on the popular social media platforms; however, are these people really your market? Probably not. In many cases, the people who are posting to the social media sites are a biased subset of your market. This is not good for making market predictions.
To know how representative your social media enthusiasts are of your overall market, you must do some more analysis. I know that sounds like we're taking the road to analysis paralysis, but this one's important. Once you've properly defined your market, you must take a random sample of your entire market to determine if there's a social media bias somewhere. This is where many people get tripped up.
A random sample is not an arbitrary sample. Sometimes the connotation of random throws people off; actually, the process of collecting a random sample must be very systematic. For instance, if you want a representative sample from a room of 100 people, the wrong way to do it is to ask for volunteers. The right way to do it is to put everyone's name in a box, shake it up well, and then randomly draw 10 names. If you have doubts about your sampling techniques, you should talk with your data scientists about it.
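For readers who want to see the names-in-a-box idea in code, here is a minimal Python sketch. The `draw_random_sample` helper, the `room` list, and the seed value are all made up for illustration; the only real API used is the standard library's `random.Random.sample`, which draws without replacement so that every person has the same chance of being picked.

```python
import random


def draw_random_sample(population, k, seed=None):
    """Draw k distinct members from population, each equally likely.

    Hypothetical helper: the digital version of putting every name
    in a box, shaking it, and drawing k names without looking.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable for audits
    return rng.sample(population, k)


# Illustrative data only: 100 placeholder names for the people in the room.
room = [f"person_{i:03d}" for i in range(1, 101)]

# Draw 10 of the 100. Nobody volunteers; the generator decides.
chosen = draw_random_sample(room, 10, seed=42)
print(chosen)
```

Fixing the seed is only a convenience so someone else can reproduce the draw; it is not part of what makes the sample random.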
When your market sample is collected, there should be a good mix of people who use social media and people who don't. The objective is figuring out who does and who doesn't; however, the challenge is collecting this data. Even if one person in your random sample doesn't want to participate in your exercise, you don't have a random sample anymore. This is a formidable challenge. Most people make the mistake of just taking what they can get. You can't do that! There are strict rules about random sampling; you'll either need to figure out how to get feedback from all the people in your random sample, or abort the exercise. Sorry, I don't make the rules; I just tell you what they are.
Here's the good news. If you are fortunate enough to collect good feedback on your customers' social media habits, using a valid random sample, your data scientists can easily tell you whether or not there's a social media bias. If not, you're in luck! Scrub all the social media outlets for all the data you can collect.
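As a rough idea of what that check might look like, here is a sketch of one common approach, a two-proportion z-test, written with only the Python standard library. This is my illustration, not necessarily the method your data scientists would choose. The function name and all of the counts are hypothetical: they assume you asked everyone in your random sample the same yes/no question (say, "would you buy this product?") and recorded whether each person uses social media.

```python
import math


def two_proportion_z_test(yes_a, n_a, yes_b, n_b):
    """Compare the 'yes' rate in group A against group B.

    Returns (z, p_value) for a two-sided test using the pooled
    standard error. A small p-value suggests the groups differ.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts from a hypothetical random sample of 1,000 customers:
# 120 of 400 social media users said yes, versus 90 of 600 non-users.
z, p_value = two_proportion_z_test(120, 400, 90, 600)

if p_value < 0.05:
    print(f"z = {z:.2f}: the two groups answer differently; suspect a bias")
else:
    print(f"z = {z:.2f}: no clear difference between the groups")
```

The 0.05 cutoff is a convention, not a rule, and a real analysis would look at more than one question before declaring your market free of social media bias.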
However, if you do have a social media bias in your market, you'll have to make some adjustments. You can either redefine your target market to include the constraint of social media predilection, or you can employ mixed-marketing data collection. Some people may recommend sticking to your social media analytics and adjusting for the known bias; however, I think this is risky and I don't advise it. You never know when your social media bias may shift.
Social media analytics are great to employ in your big data strategy; however, they set you up for a classic analytic trap called sampling bias. To stay out of this trap, put your data scientists to good use on an analysis for social media bias, using a rigorous random sampling process. Forty-Niner Quarterback Colin Kaepernick has a great arm and great instincts when the blitz is on, but I don't think these are going to help the Oakland Raiders one bit.
John Weathington is President and CEO of Excellent Management Systems, Inc., a management consultancy that helps executives turn chaotic information into profitable wisdom.