Finding a good data scientist just got even harder. According to former Google and Foursquare data scientist Michael Li, there are two very different categories of data science, geared toward either machines or humans.
Hence, organizations recruiting for a data scientist position need to get beyond generic requirements and deeply understand who (or what) will be consuming the insights engineered by the data scientist.
Getting this wrong can be costly.
The data scientist premium
Organizations have been struggling for years to discover ways to put their data to use, and with the noise around big data, that struggle has become much more pronounced (and expensive) of late.
When NewVantage Partners asked IT executives how hard it is to source the analytics talent required to derive insight from ever-increasing amounts of data, 76% indicated that it was at least "challenging," with 36% acknowledging it was "very difficult."
It's therefore not surprising to see salaries for data scientists rocketing upward, with the average salary now topping $123,000, as Indeed.com data shows.
Some recruiters say that a mere two years of data science experience can translate into a $200,000 to $300,000 per year job.
Data scientist salaries are so high because making sense of data demands an equally high level of expertise.
As Mitchell Sanders notes, data science requires a difficult blend of domain knowledge, math and statistics expertise, and code hacking skills. In particular, he suggests that expert knowledge of tools like R and SAS is critical. "If you can't use the tools, you can't analyze the data."
He goes on to stress the importance of math skills:
"Understanding correlation, multivariate regression and all aspects of massaging data together to look at it from different angles for use in predictive and prescriptive modeling is the backbone knowledge that's really step one of revealing intelligence.... If you don't have this, all the data collection and presentation polishing in the world is meaningless."
This is the data science we mythologize, with arcane technologies and math combining in the mind of some PhD genius to reveal deep truths otherwise obscured in (and by) our data.
The problem is that this may only be half right.
The two faces of data science
After all, as Michael Li makes clear, data scientists produce analytics for either machines or humans, but generally not both. "Unfortunately," Li laments, "most hiring managers conflate the types of talent and temperament necessary for these roles."
So, how do the two types of data scientist differ?
According to Li, the jobs distinguish themselves in the following ways:
Data science for machines
- Ultimate decision maker and consumer of the analysis is a computer (e.g., ad targeting, product recommendations)
- Involves complex digital models that ingest large amounts of data and extract insights using machine learning and algorithms, then act autonomously to display certain ads or make stock trades in real time
- Such data scientists therefore require "exceptionally strong mathematical, statistical, and computational fluency to build models that can quickly make good predictions"
Data science for humans
- Ultimate decision maker and consumer of the analysis is a person (e.g., understanding user growth and retention)
- May actually use the same data sets as the "machine" data scientist, but must package the analysis in a human-understandable format, with an emphasis on storytelling and articulation of "how" and "why" to achieve results
Importantly, while "data scientists with hard science backgrounds have traditionally gotten a lot of attention in the press," they might be the exact wrong type of data scientist for your organization. Often, the "human" data scientist can come from within your own organization, as Gartner analyst Svetlana Sicular has posited, because "Organizations already have people who know their own data better than mystical data scientists."
Such people may be the exact right fit for sifting through data and then creating the appropriate "stories" around that data to make it accessible to decision makers.
In sum, data science remains a big deal, but getting the right kind can make a huge difference in the "mileage" you get from your data scientist.
Matt is currently head of the developer ecosystem at Adobe. The views expressed are his own, not those of his employer.
Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.