In a traditional IT project, a business case is identified, a system is developed to meet the needs of the business case, timelines for deliverables are drawn up, and everyone enlisted in the project is tasked with work that must conform to documented requirements and come in on time. There are few ambiguities in well-constructed IT projects, and everyone understands marching orders.
This isn’t always the case in data science. A business case can be drawn up, but arriving at the desired results isn’t always straightforward or predictable. In fact, the only hard metric that seems to exist for most data science projects is that the results derived from algorithms operating on data must be at least 95% “right” when compared with an accepted standard for determining correctness.
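To make the 95% benchmark concrete, here is a minimal sketch in Python of how a team might score results against an accepted standard. The function names, the sample labels and the idea of comparing item by item are illustrative assumptions, not a prescribed method; real projects often need a richer measure than simple agreement.

```python
def accuracy(predictions, reference):
    """Fraction of predictions that match the accepted reference answers."""
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have the same length")
    if not reference:
        raise ValueError("reference must not be empty")
    matches = sum(p == r for p, r in zip(predictions, reference))
    return matches / len(reference)


def meets_threshold(predictions, reference, threshold=0.95):
    """True when agreement with the reference is at least the threshold."""
    return accuracy(predictions, reference) >= threshold


# Hypothetical example: 20 reference labels, one wrong prediction.
reference = ["churn", "stay", "stay", "churn", "stay"] * 4
predictions = list(reference)
predictions[0] = "stay"  # introduce a single mistake

print(accuracy(predictions, reference))
print(meets_threshold(predictions, reference))
```

The threshold is a parameter rather than a constant so that the project sponsor, not the code, decides what counts as “right” enough for a given business case.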
This fluidity can make it seem impossible to manage these projects for the best results. Here are some ways to make that management go smoothly.
Create a well-defined business framework for every data science project
If your goal is to determine what the company’s customers will want to see from products two to three years from now, then trends, customer feedback and demographics are all vectors into the data. A data science team would likely be directed to develop algorithms and discoverable data that could uncover these trends for future product development.
If the goal is discovering how to formulate a vaccine for a specific illness, algorithmic probes should be directed against the right subsets of data so these can point to directions that company chemists should take.
In both cases, you should have a person who understands the mission of the business and how data science can help. This person can ensure that the team stays on task and that there is no project drift away from the original business case. To keep the focus on the business, more than likely this person is going to come from IT or the end business.
Focus on addressing the business case, but don’t forget the benefits of project drift
Data science is an iterative discipline. It experiments with many different algorithms and data types on the way to solving an issue. But because its data lens is wider than that of traditional IT, there’s always the possibility of ancillary discoveries from the data that may not be related to the direct focus of the business case at hand.
For instance, on the way to finding the perfect blend of elements for a needed vaccine, the team might inadvertently come across another combination of elements that has the potential to address a different health problem.
These spin-off insights should not be discarded. At a minimum, they should be placed in a “parking lot” of future data probes, because they add to the value of a data science project.
Build a diverse project team
Many companies have focused on hiring the most senior data scientists, but it also pays to develop internal talent with an aptitude for data science, or to recruit new grads from universities. Junior people can bring new insights and business savvy to projects. They can also bring people skills to the team that foster effective communication with other business departments.
Manage your superstars
I have seen companies held captive by their IT gurus, afraid to anger them and totally at their mercy. If this dynamic takes shape on your project team, you should sit down with the superstar to see if a more cooperative and team-oriented attitude is possible. If not, you have a harder decision to make: whether keeping this person on board is a risk to the project.
Continuously communicate with stakeholders and upper management
Upper management and stakeholders might pay lip service to the data science function, but they are not experts in project management, and they will be inclined to evaluate the success or failure of data science projects the same way that IT projects are evaluated.
Comparing data science to IT projects is not comparing apples to apples. In IT, project results and workflows are well defined. In contrast, data science is an iterative, nonlinear process that by its very nature can be unpredictable.
Management and stakeholders need to understand these differences so they can adjust their expectations accordingly.
Don’t skimp on data quality
Risks arise when stakeholders and management press for results from data science projects and the data being evaluated is not of the highest quality. If the data isn’t top quality, the results emanating from the algorithms won’t be, either. The result can be faulty conclusions that misdirect business decisions, and no one wants that. Always ensure the best quality for your data before moving data science projects into production.
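One way to act on this is a simple quality gate that runs before any data reaches production. The sketch below is a minimal Python illustration under stated assumptions: the field names, the sample records and the 2% tolerance are hypothetical, and it checks only for missing values and duplicate rows, which is far from a complete definition of data quality.

```python
def quality_report(records, required_fields):
    """Count rows with missing required values and rows that repeat an earlier row."""
    missing = sum(
        1
        for row in records
        if any(row.get(field) in (None, "") for field in required_fields)
    )
    seen = set()
    duplicates = 0
    for row in records:
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"total": len(records), "missing": missing, "duplicates": duplicates}


def passes_quality_gate(report, max_bad_fraction=0.02):
    """True when flagged rows are within the tolerated share (assumed 2% here).

    A row that is both incomplete and a duplicate is counted twice,
    which makes the gate slightly stricter.
    """
    if report["total"] == 0:
        return False
    bad = report["missing"] + report["duplicates"]
    return bad / report["total"] <= max_bad_fraction


# Hypothetical customer records for illustration only.
records = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": None},  # missing value
    {"customer_id": 1, "age": 34},    # duplicate of the first row
    {"customer_id": 3, "age": 51},
]

report = quality_report(records, ["customer_id", "age"])
print(report)
print(passes_quality_gate(report))
```

A gate like this gives the project lead something objective to point to when stakeholders push to move ahead with data that is not ready.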