3 best practices for exploratory data visualizations

Exploratory data visualization is an exciting component of data science. Get tips on how your qualitative data scientists can build these visualizations.

Visualization is an often-underrepresented member of the data science family. When it comes to data science, most leaders think about processing massive amounts of data and using sophisticated algorithms and artificial intelligence techniques to extract treasure from the vast digital unknown. But analysis means nothing if we don't take time to organize its visual elements into a comprehensible format.

Despite this, leaders do not put enough emphasis on developing their visualization strategy, especially when it comes to exploratory data visualization. Without a good exploratory data visualization program, you may be missing critical insights that are waiting to be discovered in your data.

What are exploratory data visualizations?

Exploratory data visualizations (EDVs) are the type of visualizations you assemble when you do not have a clue about what information lies within your data. For this situation, it is best to employ the services of data scientists who specialize in qualitative research—this is your data mining team. They will use the appropriate exploratory data analysis (EDA) tools and techniques to comb through your data and attempt to extract insights that suggest a direction for further research.

In my view, this is the only legitimate business case for using data visualizations. Oftentimes, enthusiastic data science practitioners compile data visualizations for the wrong reasons. If you ask a busy executive to stare at a visualization (as elegant as it might be) and divine her own insights, it could be a career-limiting move.

The right people to invite to the visualization party are those with enough motivation and patience to contemplate your masterpiece over hours, days, or even weeks. Don't get me wrong: this could be a very special type of executive, but that is not usually the case.

I am sure other data scientists will find it worthwhile to spend time on your visualization, and peer reviews are always a good idea, but those are not the people who will add the most value to your research. You must seek out the right people in the business to complete your EDA.

The importance of partnering with the right business experts

The right business partners are crucial to a successful EDA effort. It is a big mistake to assume your data science team has all the capabilities necessary to complete this analysis effectively, even if it is staffed with talented business analysts. This is a worthwhile caution, so take heed.

Over time, data scientists should cultivate a conversational understanding of how your business works. It is tempting to rely on the advice of the more business-savvy members of your data science team, especially during EDA, but don't do it. Business experts in the company are far better equipped to interpret EDVs once they understand what they are looking at, so engage them as soon as your data scientists have one or more visualizations ready.

Target people in the business who have in-depth knowledge of their job and the circumstances of your EDA scenario, an analytic personality or proclivity, and time to spend contemplating the visualizations your data scientists have prepared. Don't waste time with people who are not qualified for the job just because they are available.

Find the best business experts for your purpose and, through their leadership, make sure your EDA efforts are treated as a priority. Finally, don't force non-analytical people to analyze; they will be consuming data science artifacts, so they need to be analytical thinkers.

Best practices for building these data visualizations

After you assemble your qualitative data scientists, start working on your EDA, and target the right business experts for collaboration, it will not be long before the magic starts. If your data scientists have the right tools to quickly collect and clean the data, they should be ready to assemble visualizations before long. Here are my best practices for building these visualizations.

  • Start with the interface and graphics that come with the tools you use. You want to engage your business experts as quickly and often as possible. Do not assume your business experts cannot understand your tools' graphical user interface; if needed, spend time explaining how the interface works. It is time well spent. If your business experts are analytical, they will pick it up quickly.
  • Use R, Python, or something similar to build custom visualizations. This is worth the effort, even if you have great tools that come with high-powered graphical interfaces. Every business is unique, and every situation is different—custom visualizations bring the greatest insights.
  • Take an iterative approach. By definition, EDA is, well, exploratory. You want to do several rounds of EDV collaboration with your business experts to get the best results. As noted above, start easy. Use what you have and bring the business experts in soon. Then help the business experts understand where else you can go with it, like the custom visualizations referenced in the second bullet. Have your business experts systematically advise you on what to visualize next until the EDA is complete. This style of iterative engagement is crucial. Before long, your business experts will be coming to you with ideas.
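To make the second and third practices concrete, here is a minimal sketch of a custom visualization in Python. It is only an illustration, not a recommended tool: it uses the standard library alone to draw a text histogram, whereas a real project would more likely use a plotting library. The function name text_histogram, the order_values data, and the bin widths are all hypothetical and chosen for this example.

```python
from collections import Counter


def text_histogram(values, bin_width, bar_char="#"):
    """Return the lines of a text histogram for a list of integers.

    Every bin between the smallest and largest occupied bin is shown,
    including empty ones, so that gaps in the data stay visible.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")

    # Map each value to the lower edge of the bin it falls into.
    counts = Counter((v // bin_width) * bin_width for v in values)
    if not counts:
        return []

    lines = []
    edge = min(counts)
    last_edge = max(counts)
    while edge <= last_edge:
        n = counts.get(edge, 0)
        label = f"{edge:>6}-{edge + bin_width - 1:<6}"
        lines.append(f"{label} {bar_char * n} ({n})")
        edge += bin_width
    return lines


# Hypothetical order values in whole dollars (made-up data for illustration).
order_values = [12, 15, 18, 22, 25, 27, 31, 95, 98]

# First round: coarse bins give the business experts a quick overview.
for line in text_histogram(order_values, bin_width=20):
    print(line)

# Second round: after feedback, narrow the bins to look closer.
for line in text_histogram(order_values, bin_width=10):
    print(line)
```

The two calls at the end mirror the iterative approach: show a coarse view first, then let the business experts tell you where to zoom in. In this made-up data the empty bins between the low and high order values are the kind of pattern a business expert, rather than a data scientist, is best placed to explain.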

Conclusion

Exploratory data visualization is an exciting part of data science. EDA data scientists thrive on discovering the jewels of information that are hiding in your data. Set them—and yourselves—up for great success by partnering them with your best business experts.

You should use the collaborative EDV techniques presented in this article to uncover critical hypotheses to be tested by your quantitative data scientists. The qualitative precursor to quantitative research is vital in extracting the most value from your data science program and, with the right EDV, you can visualize your way to your vision.

    About John Weathington

    John Weathington is President and CEO of Excellent Management Systems, Inc., a management consultancy that helps executives turn chaotic information into profitable wisdom.