The big data pioneer on how to use machine learning to mine gold from your company's data stores.
Cern's experiments probing the fundamental nature of the universe create a lot of data—roughly four times that held in the US Library of Congress—every second.
The European research institution has a long history of generating more data each day than most companies do in a lifetime, as its experiments collide particles travelling at close to the speed of light.
While the bulk of this data is thrown away, Cern shares the remaining information with researchers around the world and has data on the workings of its particle accelerators—the most famous of which is the Large Hadron Collider (LHC)—stretching back decades.
Manuel Martin Marquez, lead data scientist at Cern, talked to TechRepublic about how, over the past five years, Europe's largest particle physics lab has begun employing machine learning to discover new ways to filter noise in its experiments and to keep its accelerators running smoothly.
Marquez says Cern's machine learning models are already helping engineers carry out predictive maintenance on the cryogenics system that cools the superconducting magnets used in the LHC to close to absolute zero, reducing the risk of a costly failure.
With Cern's research budget largely flat, finding ways to make operations more efficient is increasingly important to the institution, he said, stressing how the facility had benefited from identifying machine learning projects with a clear scope that provide 'quick wins'.
Watch the video to hear how Cern is using big data and machine learning, and what your company can learn from an institution used to making sense of vast amounts of data.
- Before Big Data, clean data (TechRepublic)
- Farm out big data chores so employees can focus on analytics (TechRepublic)
- Leadership challenges of a data cleansing effort (TechRepublic)
- 6 myths about big data (TechRepublic)
- Data to analytics to AI: From descriptive to predictive analytics (ZDNet)