Big Data

How to develop a data science training program inside your company

Finding experienced data science professionals can be a challenge. Training current employees with aptitude for this type of work could be a good strategy to fill skill gaps.

Recently, TechRepublic reported on the high demand for data scientists, referencing IBM's prediction that by 2020, there will be around 2.7 million job openings for data-savvy professionals. The demand for workers with experience in data science is so high that, according to Glassdoor, these professionals command a median salary of $96,441, and much higher in some cities.

If your company is struggling to hire data science professionals, the good news is that cross-training internal IT staff can be a successful strategy. According to a recent report from data science community Kaggle, 59% of employed data scientists gained their skills from self-guided learning or open online courses.

As a corporate training director, I was once asked to develop a "from scratch" curriculum that could train entry-level employees and cross-train more experienced ones in the skills needed to code online transactions for a stock trading system. The training covered programming, but also teleprocessing engines, operating systems, databases, code libraries and conventions, call routines, and the end-to-end software development process.

It was a complex task. The goal was to place a newly trained person at the heart of a project where he or she could immediately be productive developing code in a highly demanding environment.

The same approach can be taken to developing internal data science expertise. Here are five essential steps:

1. Analyze the task and skills gaps in your company's projects

A good way to start is to visit with project managers who are in charge of big data and analytics. Where are their project shortfalls? What project roles are they having trouble staffing? What specific technical and personal skills are needed? Are any project tasks getting delayed because there is no one who can do them? Based on your findings, you can compile a list of task and skills gaps by project.
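One lightweight way to record what you learn in these conversations is a small table keyed by project. The following is only a sketch: the project names, the skills, and the `compile_gaps` helper are hypothetical placeholders, not data from any real company.

```python
from collections import defaultdict

# Hypothetical interview notes: (project, skill needed, is it currently staffed?)
findings = [
    ("churn-model", "Python", True),
    ("churn-model", "statistics", False),
    ("sales-dashboard", "SQL", True),
    ("sales-dashboard", "data visualization", False),
    ("sales-dashboard", "statistics", False),
]

def compile_gaps(findings):
    """Group the skills that nobody currently covers by project."""
    gaps = defaultdict(list)
    for project, skill, staffed in findings:
        if not staffed:
            gaps[project].append(skill)
    return dict(gaps)

gaps_by_project = compile_gaps(findings)
for project, skills in gaps_by_project.items():
    print(f"{project}: {', '.join(skills)}")
```

A spreadsheet would serve just as well. The point is to end up with one list of unfilled skills per project.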

SEE: Data scientist job description (Tech Pro Research)

2. Map your skills needs findings to internal staff

The next step is to assess internal personnel to see who has the aptitude and background to step into these task and skills gaps, and then identify them as trainees. Do this by reviewing each person's IT experience with the company and researching their earlier work history. It is also important to visit with project managers to learn more about the individuals being considered, and about their aptitudes and interests.
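As a rough illustration of this mapping step, the sketch below ranks staff by how many of the needed skills they already have and lists what each person would still need to learn. The names, skills, and the `rank_candidates` helper are all hypothetical, and a ranking like this is only a starting point for the conversations with project managers.

```python
# Hypothetical skills gaps and staff skill profiles
needed = {"statistics", "Python", "data visualization"}
staff = {
    "Avery": {"Python", "SQL"},
    "Jordan": {"statistics", "data visualization", "Excel"},
    "Riley": {"Java"},
}

def rank_candidates(needed, staff):
    """Return (name, skills already held, skills to train), best match first."""
    rows = []
    for name, skills in staff.items():
        matched = sorted(needed & skills)
        to_train = sorted(needed - skills)
        rows.append((name, matched, to_train))
    # Most matched skills first; ties broken alphabetically by name
    return sorted(rows, key=lambda row: (-len(row[1]), row[0]))

ranking = rank_candidates(needed, staff)
for name, matched, to_train in ranking:
    print(f"{name}: has {matched}, needs training in {to_train}")
```

Skill overlap says nothing about aptitude or interest, so the output should inform the trainee shortlist and not replace your own judgment.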

3. Design a curriculum, and find a project

It never works to have the employees you select develop skills only in an isolated lab setting. Labs are fine for developing skills, but what makes these acquired skills "take" for good is applying them in real projects, where employees build experience and confidence.

SEE: How to build a successful data scientist career (free PDF) (TechRepublic)

4. Continuously communicate with project managers

Stay in touch with managers of projects where newly trained employees are deployed so you can see how things are going. This also helps you build rapport with managers. It is likewise useful to evaluate the effectiveness of training and skills transfer by meeting with project managers after projects have been completed. You are likely to find areas in your curriculum where training went well, and areas where it can be strengthened.

5. Continuously revise the curriculum to keep up with real-world project requirements

Some project needs will remain relatively constant, while others will evolve as technology and the business change. If you are developing training, it is essential to keep pace with these changes so your training always delivers the skills your projects need. You can ensure this by constantly evaluating projects, and then going back to your curriculum to confirm that the training is in sync with project needs.

Finally, I'll borrow a phrase from Sara Sproehnle, vice president of educational services at Cloudera: "You can easily cross-train people. It's not that the technology is incomprehensible. You just need to take existing developers, analysts and admins and cross-train them."

Sproehnle is spot-on. The strategy can really work if more corporate IT departments take big data and analytics training into their own hands.


About Mary Shacklett

Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President o...
