Data science education bare minimums for analytics teams

Everybody on your big data analytics team needs some education in data science. Here's what team members should know.



Many leaders are making the mistake of hiring their problems away. The problem isn't with the approach -- it's with the attitude.

Don't get me wrong: as a consultant and expert, I'm usually the beneficiary of this way of thinking. However, on more than one occasion, I've had to educate my client on their responsibilities in the consulting process. It's fallacious to reason that an expert can solve your problems without your involvement. Unfortunately, with a subject as mysterious and obscure as data science, it's too easy to make this mistake.

Many people believe that once they've hired the best and brightest data scientists, their investment in data science competence is over, but that's not true. You should attract and retain the best data scientists for your data science team, but everybody on that team (including you) needs some education in data science.

Stepping up your game theory

Perhaps the best data science leaders are also data science experts, but it's not a hard-and-fast requirement. What leaders bring to the table is their ability to inspire and motivate the team and make important decisions about where this data science team will be taking the company. If, in addition to these qualities, the leader has a strong background in data science, this is terrific. There's nothing more inspiring for a data scientist than to work with another brilliant data scientist. And although the leadership/data science combination is rare to find in one person, I'm seeing more data science leaders emerge today as big data and data science become more of a mainstream conversation in corporate America.

At a minimum, the leader should know data science terms (e.g., algorithms, programming languages, and methods), outcomes, and how those outcomes will benefit the organization. For example, a leader should know what a cluster analysis is, why it's important for customer segmentation, and how customer segmentation drives customer loyalty. A leader should know what a neural network is and how a neural network can be used to predict the behavior of the customer segment you discovered with your cluster analysis. These are basics and if you don't have the time or inclination to understand these concepts, you really shouldn't be employing these techniques in your corporate strategy.
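To make the cluster analysis idea concrete, here is a minimal sketch in Python of how customers might fall into segments. It is an illustration under stated assumptions, not a production method: the `kmeans` function, the `customers` data, and the two features are all hypothetical, and k-means is just one of several clustering techniques. It also skips feature scaling, which a real analysis would normally include.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    # Assumption: seed centroids with the first k points so the run is repeatable.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: recompute each centroid from its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical customers: (orders per year, average order value in dollars).
customers = [(2, 20.0), (3, 25.0), (1, 22.0), (30, 200.0), (28, 210.0), (32, 190.0)]
labels, centroids = kmeans(customers, k=2)

for customer, segment in zip(customers, labels):
    print(customer, "-> segment", segment)
```

With this made-up data, the occasional low-spend buyers end up in one segment and the frequent high-spend buyers in the other. That grouping is what a loyalty program or a downstream predictive model would then be built around.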

The other people on your data science team who need a good base of data science education are the ones who round out your leadership team: change leaders and coaches. Some experts today are espousing a T-shaped skill set, wherein people have deep expertise in one skill (leadership, change, team development), represented by the vertical line, and a general, conversational knowledge of other areas (data science), represented by the horizontal bar, so they can work more effectively with the team. I think you should take it one level up from there if you're on a data science team. You don't need to be an expert in data science, but you need more than just a conversational knowledge.

Managing by example

Management is a different story. I'm a lot more tolerant of leaders who aren't strong in data science than I am of managers who aren't, though many people disagree with me. If you're managing a data science team, you need to know your stuff. Managers are chiefly responsible for keeping things under control, and there's no way to do that on a data science team without knowing data science.

In Six Sigma, the project's Black Belt is not only the project manager, but also the most advanced statistician (aside from the Master Black Belts, but technically they're not on the project team). The worker bees on a Six Sigma project are Green Belts, and as you might guess, they've had a good deal of statistical training, but not as much as the Black Belt. It's best to run your data science team the same way. By the way, I've seen some companies do Yellow or White Belt training for leaders, which is nonsense; your leaders should at least be the data science equivalent of a Six Sigma Green Belt.

The rest of your management team -- quality control and governance -- should have the same stringent requirements for data science education. How can you ensure a high quality of anything if you don't understand it? How can you make sure the team is staying within the lines if you don't understand where the lines are or what they mean? Your entire management team needs to have a high level of expertise in data science, and of course the other areas they're responsible for (management, quality, governance). Earlier we talked about a T-shaped skill set; I feel managers need more of an H-shaped skill set: deep expertise in both data science and management, and a solid base of knowledge in everything else.

A key danger with this philosophy is micromanagement, so install some sort of control for this early on. It's important to let the data scientists do their job, even if others on the team have data science expertise. Unfortunately, when you have management experts who are also data science experts, they inevitably feel the need to sanction the content (e.g., analyses) based on their own values, beliefs, experience, and favorite methods. This might work on a Six Sigma project, but it won't work on a data science team. This behavior will very quickly shut down the team's creativity, so you must guard against it from the team's inception.


An effective data science team is composed of people who understand data science, including leaders and managers. Although it's tempting to just hire a bunch of bright data scientists and call it a day, it doesn't work that way. Ensure your leaders (including yourself) have a very good base of data science education, and that your managers are experts. Survey your team today to see where each person's data science skill level stands. It may be time for some to hit the books.