The scariest threat to the quality of IoT data and analytics

Analytics managers need to do two things to protect IoT-generated data from an often-overlooked threat.

The quality of analytics depends upon how "clean" or authentic the data is and how quickly one can obtain the data that algorithms operate on. We have already seen instances where flu epidemics were misjudged because of incomplete data and/or data assumptions, or where market opportunities were missed because relevant market facts were overlooked in the data.

For most companies, it's virtually impossible to incorporate all data that might be relevant for the topic they plan to analyze. As organizations add the Internet of Things (IoT) into their analytics, the plot thickens because of a silent threat that few analytics managers think about: the fundamental flaws in embedded software development.

Embedded software—not traditional IT applications—runs machines, produces machine automation, and enables machines to talk to one another and to data repositories on the manufacturing floor and across great geographic spans.

Historically, embedded software was developed by engineers who did not universally follow the software development life cycle methods used for traditional IT apps. As a result, detailed quality assurance (QA) testing of the programs didn't always occur, and program upgrades weren't always applied to every machine or product out in the field.

Some of this is changing because more IT grads are entering the embedded software field, but they bring their own shortcomings—these grads understand the IT life cycle and QA testing methodology, but unlike software engineers, many of them don't grasp the roles that security, safety, and environmental "fit" play in software that is embedded in IoT products and machines.

"Embedded software can have an active life of years, and it must be continually maintained throughout that life cycle," said Andrew Girson, CEO of Barr Group, an expert systems consultancy. "A failure to follow best practices in producing and maintaining this software can impact safety and life....It's far less expensive to adopt embedded software practices that lower risk and reduce the potential for error than to deal with the repercussions of a software failure."

This impact is also felt in big data and analytics. For instance, if machine-generated data (whether streamed or collected for later batch analysis) is inaccurate because of an undetected flaw in the embedded software on the machines, the result could be an erroneous analytics conclusion that could impact the business.

"Because of this, producers of electronics products are mandating security and compliance from themselves and from their suppliers in embedded software as well as in hardware components," said Jim McElroy, Vice President of Marketing at LDRA, which provides embedded software testbeds and certifications. "Industrial sectors like automotive, aerospace, and medical equipment all have quality standards they must meet for electronic devices and equipment."

Two ways to improve the quality of IoT-generated data

Measures like these will help the quality of IoT-generated data, but there are steps that data analysts and those responsible for corporate analytics can and should take to protect and improve the quality of the IoT-generated data that they use.

Investigate unusual data immediately, and share your findings with the appropriate teams

If the analytics department finds that the data it's analyzing looks unusual, it should investigate right away and report the issue to the teams responsible for the machines or devices so they can troubleshoot. In many cases, those teams will already have received alert messages from a machine, but there are cases when the analytics team can identify a potential data veracity problem that a machine alert won't catch.

Similarly, if the team responsible for the machines/devices sees anything unusual with the data, it should take immediate action on the floor and report back to the analytics team that a potential problem could affect the data.
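As a rough illustration of what "unusual data" can mean in practice, here is a minimal sketch of a screening check that an analytics team might run on readings from a single sensor. It is an assumption for illustration, not a method described by the sources quoted in this article: the function name `flag_unusual_readings` and the sample values are hypothetical, and the technique (a modified z-score based on the median absolute deviation, with the commonly cited 3.5 cutoff) is just one of many ways to flag outliers.

```python
from statistics import median


def flag_unusual_readings(readings, threshold=3.5):
    """Return the indices of readings that look like outliers.

    Hypothetical helper: uses the modified z-score, which is built on the
    median and the median absolute deviation (MAD) so that a single bad
    reading does not distort the baseline it is compared against.
    """
    if len(readings) < 3:
        return []  # too few points to judge what is "usual"

    med = median(readings)
    mad = median(abs(x - med) for x in readings)

    if mad == 0:
        # More than half the readings are identical; flag anything that differs.
        return [i for i, x in enumerate(readings) if x != med]

    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, x in enumerate(readings)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Hypothetical temperature readings from one machine; the fifth value is suspect.
sample = [20.1, 20.3, 19.9, 20.2, 85.0, 20.0]
print(flag_unusual_readings(sample))  # prints [4]
```

A flagged index is only a prompt to investigate. Whether the cause is a real event on the floor or a flaw in the embedded software is something the machine/device team has to determine.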

Keep vendors on the machine and the analytics sides plugged into communications

Hardware and software, whether on machines or in analytics, are never perfect. Data might be skewed because of an issue that an analytics or machine vendor is experiencing and fixing. When this occurs, the machine and the analytics teams should be communicating with each other and with the vendor so that everyone is in the loop.

The bottom line

The end goal is to ensure that the raw data you receive from IoT machines and devices is the best you can get, so that the premises and conclusions derived from the data can be followed and acted upon with confidence.

About Mary Shacklett

Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President o...
