Big data practitioners are learning that the laboratory know-how of computer scientists and statisticians must be matched with a holistic, 360-degree vision of the problem to be solved. The Ebola crisis is a prime example.
As of March 20, 2015, the Centers for Disease Control and Prevention (CDC) was reporting 24,754 cases of Ebola and 10,236 deaths from Ebola in the West African countries of Guinea, Liberia, and Sierra Leone. How did the disease get out of control so quickly?
Analysts cite extremely impoverished conditions, poor hygiene, and a lack of healthcare spending. They also blame healthcare education campaigns that got under way too slowly and spotty disease surveillance networks; better surveillance could have tracked where and how the disease was spreading and allowed earlier, more proactive intervention.
Disease surveillance and tracking is where machine-generated cell phone data enters the discussion.
Researchers want to know which geographical areas Ebola is likely to invade next. They believe they can find out if they can gain access to, and perform analytics on, machine-generated mobile phone data. Why? Because each time a mobile phone call is made, a call data record (CDR) is generated that contains the phone numbers of the caller and receiver, the time of the call, the tower that handled it, and a rough indication of the device’s location.
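To make the idea concrete, here is a minimal sketch of how a CDR might be represented and how consecutive records for the same caller could be turned into tower-to-tower movement counts. The field names, tower IDs, and sample records are all hypothetical; real CDR schemas vary by operator, and this is not the method used by any particular research group.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallDataRecord:
    # Hypothetical field names; real operator CDR schemas differ.
    caller: str
    receiver: str
    timestamp: datetime
    tower_id: str


def tower_transitions(records):
    """Count moves between towers inferred from each caller's consecutive calls."""
    last_tower = {}
    moves = Counter()
    # Process calls in time order so "previous tower" is meaningful.
    for rec in sorted(records, key=lambda r: r.timestamp):
        previous = last_tower.get(rec.caller)
        if previous is not None and previous != rec.tower_id:
            moves[(previous, rec.tower_id)] += 1
        last_tower[rec.caller] = rec.tower_id
    return moves


# Made-up sample data, for illustration only.
sample = [
    CallDataRecord("A", "B", datetime(2015, 3, 1, 9, 0), "tower_1"),
    CallDataRecord("A", "C", datetime(2015, 3, 2, 9, 0), "tower_2"),
    CallDataRecord("B", "A", datetime(2015, 3, 1, 10, 0), "tower_1"),
    CallDataRecord("B", "C", datetime(2015, 3, 3, 10, 0), "tower_1"),
]
moves = tower_transitions(sample)
print(dict(moves))
```

In this sample, only caller "A" changes towers between calls, so a single movement from `tower_1` to `tower_2` is counted. Aggregated over many callers, counts like these are what would hint at where people, and potentially the disease, are heading.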
Citing callers’ rights to privacy, telcos have objected to sharing this data, yet telcos already use this data to determine where to build base stations and improve their networks.
Unfortunately, the political will to make CDR data available to medical researchers doesn’t seem to be there. Flowminder, a group of epidemiologists from Harvard and elsewhere, has lobbied mobile phone providers for this data, as has GSMA, the mobile industry’s trade group, but government regulatory agencies and the providers have been unwilling to yield.
However, even if the political problems and inertia could be overcome, there are other challenges in West Africa, such as the lack of widespread mobile access.
As of 2012, there were 501,000 cell phone users in Guinea; this places Guinea 169th among the world’s 196 countries when it comes to cell phone access. In Sierra Leone, 57% of the population has mobile access, but only 1.7% of the population has internet access. In Ivory Coast, as of 2011, only 37% of male and 4% of female refugees had access to a mobile phone. In Liberia, many rural areas lack mobile phone service altogether. Villagers, including very young children, constantly move from location to location, which makes reaching them even more difficult.
The bottom line
If big data is going to help solve health issues like Ebola, it must be incorporated into analytics that consider all of the factors shaping the epidemic. These are three of the ingredients that should be factored into Ebola analytics.
1: There are political barriers that stand in the way of obtaining data from cell phone providers that could assist researchers in determining where the disease will strike next.
2: Even if disease researchers could obtain this data, there is a need to “correct” the data for what it doesn’t reveal. For example, if less than 50% of a country’s population has access to mobile phones and individuals are constantly moving from village to village, how will researchers be able to verify the quality of the data they’re getting unless there are people “on the ground” who can verify or provide corrective factors to the data?
3: Even if the data is correct, there is a data velocity issue. The data researchers analyze must stay in sync with the rate at which individuals are migrating from one place to another.
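As a rough illustration of the kind of "correction" described in point 2, one very simple approach is to scale a count observed in phone data by the share of the population that has mobile access. The sketch below assumes, purely for illustration, that phone owners move the same way as non-owners, which is exactly the kind of assumption that people on the ground would need to verify. The function name and the observed count are hypothetical; the 57% coverage figure is the Sierra Leone number cited above.

```python
def scale_by_coverage(observed_count, coverage_rate):
    """Estimate a population-wide count from a count seen only among phone users.

    Assumes phone users are representative of everyone else, which may not
    hold in practice and would need checking against field observations.
    """
    if not 0 < coverage_rate <= 1:
        raise ValueError("coverage_rate must be in the interval (0, 1]")
    return observed_count / coverage_rate


# Hypothetical: 570 movements seen in phone data, 57% mobile coverage.
estimate = scale_by_coverage(570, 0.57)
print(round(estimate))
```

The lower the coverage, the larger the multiplier and the more any bias in who owns a phone is amplified, which is why low-coverage regions depend so heavily on corrective input from the field.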
Undoubtedly, big data will be part of the “fight” against tough and elusive crises like Ebola, but as the science of big data problem definition advances, researchers now know they cannot limit their vision to what big data alone provides. Instead, there are human and other “on the ground” realities that must be holistically factored into every analytics design before the puzzle can be solved.
Note: TechRepublic and CNET are CBS Interactive properties.