Using Apache Spark, Deepsense.io helps large organizations, including the UN, scale with sophisticated data platforms built for developers but accessible to anyone.
The United Nations is a vast organization with a diverse set of technology needs. The organization is an umbrella under which numerous sub-organizations—UNHCR and UNICEF, for example—operate somewhat autonomously. The UN is charged with providing global humanitarian relief, resolving multinational disputes, and tracking global crime. Tech innovation—particularly big data—aids the UN's mission by making operations more efficient and providing critical operational insight. The UN generates and tracks information on a global scale but has trouble managing large piles of data and is staffed with policy makers who are not inherently tech-savvy.
To grapple with the challenge of big data—in support of the Sustainable Development Goals—the UN partnered with Deepsense.io, a startup that helps non-technical organizations deploy and benefit from data analysis and machine learning. The machine learning firm developed a suite of applications that are accessible to the layperson but provide deep hooks for developers.
Seahorse, the company's data processing product, is a visual interface built on Apache Spark that helps companies extract, transform, and load (ETL) data across clusters. "The clean front end helps users to tackle complex real-world data challenges without getting bogged down in [code]," said company spokesperson Kamila Stepniowska. "It's a powerful tool, but designed for non-technical people and is pretty easy to use."
"Machine learning seems particularly effective in two situations: systems that are unpredictable and constantly changing and systems that are so complex that we cannot fully describe them in one model," said Lambert Hogenhout, Chief Data Analytics, Innovation and Partnerships at United Nations during the organization's recent TechNovation Brief. "The UN is facing both of these: understanding how the world works is infinitely complex, and it changes all the time. That is why I believe machine learning can be of value to us."
Data sets, local files, and code libraries are presented in Seahorse as visual nodes, then joined with actions and outputs. Though the visual interface uses a common workflow metaphor, users are not limited to a predefined set of actions and can write and import their own code in Python and R, Stepniowska said. Spark acts as the machine learning glue that connects, acts on, and outputs data.
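The node-and-action metaphor can be illustrated with a minimal sketch in plain Python. This is not Seahorse's API or the code it generates: the names `Node` and `run`, the step functions, and the sample records are all hypothetical, and a real workflow would execute on Spark rather than on in-memory lists.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One box in a hypothetical visual workflow: a name, an action, and upstream nodes."""
    name: str
    action: Callable[..., list]
    inputs: List["Node"] = field(default_factory=list)


def run(node: Node) -> list:
    """Evaluate upstream nodes first, then apply this node's action to their outputs."""
    upstream = [run(parent) for parent in node.inputs]
    return node.action(*upstream)


# Extract: a source node with no inputs (made-up sample records).
source = Node("load_reports", lambda: [
    {"region": "east", "cases": "12"},
    {"region": "west", "cases": "n/a"},
    {"region": "east", "cases": "5"},
])


# Transform: a user-written step, standing in for custom Python code.
def clean(rows):
    """Drop rows whose count is not numeric and convert the rest to integers."""
    return [
        {"region": r["region"], "cases": int(r["cases"])}
        for r in rows
        if r["cases"].isdigit()
    ]


cleaned = Node("clean", clean, [source])


# Output: aggregate the cleaned rows per region.
def total_by_region(rows):
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["cases"]
    return sorted(totals.items())


report = Node("total_by_region", total_by_region, [cleaned])

result = run(report)
print(result)
```

Each `Node` here plays the role of a box on the canvas, and the `inputs` list plays the role of the connecting lines. Swapping the `clean` function for another one changes a single step without touching the rest of the graph, which is roughly the flexibility Stepniowska describes.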
The UN also uses Neptune, a machine learning metrics and monitoring platform. Also built on Spark, Neptune inputs raw text files combined with common code libraries from H2O, Keras.io, Lasagne, Scikit-learn, TensorFlow, and Theano. The output is a visual graph that tracks and compares large logs in real time. The platform excels at long-term trend analysis. By combining global news items from a variety of sources with Twitter data, the UN uses Neptune to track the dissemination of propaganda online.
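As a rough illustration of what tracking and comparing metrics over time involves, here is a minimal sketch in plain Python. It is not Neptune's API: the `MetricTracker` class, its methods, and the sample values are hypothetical, and the moving average is only one simple way to smooth a series for trend analysis.

```python
from collections import defaultdict, deque


class MetricTracker:
    """Hypothetical in-memory tracker: stores metric values per run and smooths them."""

    def __init__(self, window: int = 3):
        self.window = window
        self.history = defaultdict(list)

    def log(self, run: str, value: float) -> None:
        """Record one metric value for the named run."""
        self.history[run].append(value)

    def trend(self, run: str) -> list:
        """Moving average over at most the last `window` values at each step."""
        recent = deque(maxlen=self.window)
        smoothed = []
        for value in self.history[run]:
            recent.append(value)
            smoothed.append(sum(recent) / len(recent))
        return smoothed

    def best_run(self) -> str:
        """Return the run whose latest smoothed value is lowest (for example, lowest loss)."""
        return min(self.history, key=lambda run: self.trend(run)[-1])


tracker = MetricTracker(window=2)
for loss in [4.0, 2.0, 1.0]:
    tracker.log("run_a", loss)
for loss in [4.0, 3.0, 3.0]:
    tracker.log("run_b", loss)

print(tracker.trend("run_a"))
print(tracker.best_run())
```

A monitoring platform would add persistence, live charts, and many metrics per run, but the core idea is the same: append values as they arrive, smooth them, and compare runs on the smoothed series.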
Stepniowska cited the UN as an example of how accessible and powerful machine learning tools help organizations become more nimble and tech-savvy. Like many corporate IT and innovation departments, the UN tech team is based in New York and services nearly 4,000 workers around the world. "The importance of this project goes well beyond the modeling results by providing an opportunity to explore how data science can impact programmatic needs on use cases related to UN mandates," said Radia Funna, Head of Innovation, Office of Information & Communications Technology at the event. "Presenting these case studies and results... exposes UN staff from different parts of the institution both to machine learning as a powerful tool and to the design thinking that will be useful in implementing this tool."