How Big Data is changing the security analytics landscape

Get two experts' take on Big Data security analytics, as well as details from their working group's research report on the subject.


Storied, beautiful Edinburgh, Scotland hosted the EMEA Congress of the Cloud Security Alliance (CSA) in September 2013. CSA's Big Data Working Group released its 2013 Big Data for Security Intelligence report at this gathering. The new research offering outlines "how the landscape of security analytics is changing with the introduction" of Big Data tools, as well as the differences from traditional security analytics.

"The goal of Big Data analytics for security is to obtain actionable intelligence in real time," said Alvaro Cardenas, lead author of the report, in the CSA press release. "Although Big Data analytics holds significant promise, there are a number of challenges that must be overcome to realize its true potential. We have only just begun, but are anxious to move forward in helping the industry understand its potential with new research directions in Big Data security."

An interview with two CSA members about the Big Data report

TechRepublic recently spoke with Alvaro Cardenas and Wilco van Ginkel, co-chair of the Working Group, about the report, how Big Data is changing the security landscape, and what IT professionals need to know to stay abreast of the new approaches.

TechRepublic: What are the major goals of CSA's Big Data Analytics subgroup?

Alvaro Cardenas: One of the goals we wanted to achieve is to understand how Big Data is different from traditional data in the tools it provides to us. Big Data is now a hot topic from a business perspective, but it is hard for a consumer to identify what is specifically unique about Big Data. One of our main goals then was to differentiate Big Data, and then to exemplify how Big Data is helping security in ways that other technologies were not able to do previously, things we were not able to solve before.

Traditional vs. Big Data analytics

TechRepublic: What are the differences between traditional and Big Data analytics?

Alvaro Cardenas: The differences are being driven by technology, such as the Hadoop framework and the ecosystem around it for batch processing, stream processing, processing data in motion for stream computing. These frameworks and the commoditization of the data warehouses can actually now produce a big cluster of computers, managed efficiently and cheaply. This is the way we can now approach this problem. Before, only credit card companies or telephone companies were able to invest enough to have these Big Data warehouses and collect and analyze historical long-term trends and correlations. But nowadays these technologies are pretty much available to everyone interested in deploying them. So the big changes are actually being driven by the technology and also access to both software and hardware to manage these large-scale information processing tasks.

Section 3.0 of the report proposes the following evolution for data security analytics:

  • 1st Generation: Intrusion Detection Systems
  • 2nd Generation: Security Information and Event Management (SIEM). Also called "1st Generation SIEM"
  • 3rd Generation: Big Data Analytics in Security. Also called "2nd Generation SIEM"

The progression of data security analytics

TechRepublic: The third section of your report provides a progression of data security analytics from the legacy perimeter approach to Big Data capabilities. How would you describe these developments?

Alvaro Cardenas: The first generation (intrusion detection systems) came to a close when people realized that fully protecting a system wasn't possible. There's no way to perfectly protect a system or protect it from attacks. People have been working on intrusion detection systems for almost three decades. The first ones were very specific, very targeted -- sort of like extensions of your firewall in a way. You would create signatures, and you would look specifically for something malicious that could be happening. They were very specific signatures, and you would keep track of infections and detect intrusions. One problem we realized (with first generation intrusion detection) was we were generating a lot of false alarms.

So there's a variety of information that needs to be aggregated and correlated, and that's what the second generation (SIEM) does, dealing with the correlation of data and the false alarms. Eventually what they were doing was allowing a centralized security operations center where people would see in a dashboard all of the security indicators of your network. It would also let users collect data on trends and build analytics from this information.

We are at the birth of this third generation, what some people are calling "second generation" SIEM. The problem with first generation SIEM is that it does not scale very well. Sometimes you have to delete data or keep data in different schema to allow them into the databases. So they don't scale very well, and they don't allow you to incorporate several streams of data, things that they didn't think they had to use when they designed the system. We are now, for example, monitoring websites and text-based tweets or the message of emails. Unstructured data is very difficult to capture with first-gen SIEM technologies. One of the advantages of Big Data and NoSQL databases is that they can store these data in a format that is scalable and at the same time they allow you to create queries to understand the data better. So we found out that this is the promise of Big Data, moving security information and monitoring to the next level.
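To illustrate the point about unstructured data, here is a minimal Python sketch of a schema-less event store. It is only an illustration: a plain list of dictionaries stands in for a NoSQL document store, and the record fields, the sample values, and the `search` and `text_contains` helpers are all hypothetical, not taken from the report or from any product's API.

```python
# A list of dicts stands in for a document store (assumption for illustration).
# Each record keeps whatever fields its source provides; there is no fixed schema.
events = [
    {"type": "firewall", "src": "10.0.0.5", "action": "deny"},
    {"type": "tweet", "user": "@example", "text": "Login page at example.test looks odd"},
    {"type": "email", "sender": "a@example.test", "body": "Reset your password at example.test"},
]


def search(store, **criteria):
    """Return records whose fields equal every given key/value pair."""
    return [
        doc for doc in store
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


def text_contains(store, term):
    """Return records in which any string field contains the term (case-insensitive)."""
    term = term.lower()
    return [
        doc for doc in store
        if any(isinstance(value, str) and term in value.lower() for value in doc.values())
    ]


firewall_events = search(events, type="firewall")
mentions = text_contains(events, "example.test")
```

Because no record has to fit a predefined table layout, a new data source such as tweets can be added without deleting fields or reshaping existing data, and the same queries still run across every record.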

Big Data security analytics: what you need to know

TechRepublic: Pretend I am a CISO at a Fortune 1000 company. What are the most important things I need to know about Big Data security analytics in order to stay on top of the technology?

Alvaro Cardenas: Big Data enables various capabilities, for instance, forensics and the analysis of long-term historical trends. By collecting data on a large scale and analyzing historical trends, you would be able to identify when an attack started, and what were the steps that the attacker took to get ahold of your systems. Even if you did not detect the original attack in your systems, you can go back and do an historical correlation in your database and systems to identify the attack. So long-term historical analysis is one advantage.

Another is the efficiency of queries. So when you want to understand your data, Big Data allows you to carry out complex queries and receive results in a timely fashion.

Finally, I think it's the difference in technology between batch processing and streaming data. You probably need both in your systems. Streaming data is just analyzing data online without doing these historical correlations, just streaming the traffic. So this would be a tool to identify more pressing attacks that appear suddenly, whereas batch processing is better for analyzing long-term trends. So those are some of the concepts that I think are important.
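The contrast between streaming and batch processing can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Hadoop or any stream-processing framework works internally: the event tuples, the 60-second window, the threshold of three failures, and both function names are hypothetical.

```python
from collections import Counter, defaultdict, deque

# Hypothetical login events: (timestamp in seconds, source address, outcome).
EVENTS = [
    (0, "10.0.0.5", "fail"),
    (10, "10.0.0.5", "fail"),
    (20, "10.0.0.5", "fail"),
    (86400, "10.0.0.9", "fail"),
    (86460, "10.0.0.5", "ok"),
    (172800, "10.0.0.9", "fail"),
]


def stream_alerts(events, window=60, threshold=3):
    """Streaming: look at each event once and keep only a short window of state."""
    recent = defaultdict(deque)  # source -> timestamps of recent failures
    for ts, src, outcome in events:
        if outcome != "fail":
            continue
        failures = recent[src]
        failures.append(ts)
        # Forget failures that have fallen out of the time window.
        while failures and ts - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            yield (ts, src)


def batch_failures_per_day(events):
    """Batch: scan the whole stored history and aggregate it by day and source."""
    totals = Counter()
    for ts, src, outcome in events:
        if outcome == "fail":
            totals[(ts // 86400, src)] += 1
    return totals


alerts = list(stream_alerts(EVENTS))
daily = batch_failures_per_day(EVENTS)
```

In this sketch the streaming function flags the sudden burst of failures from one source as it happens, while the batch function surfaces the slow pattern of one failure per day from another source, which never trips the short-window alert.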

Wilco van Ginkel: If I can add a little something. This is indeed a great discussion to have, but there's one question that has to be asked first, that's what I would do if a CISO approached me. That is: What is the concern of the CISO at the moment, and how can Big Data analytics help him or her? It's not so much do we need to jump on the Big Data analytics bandwagon, yes or no? The question is: Why should I? So is there anything in the risk profile or the security status of the company that bothers him or her? If so, let's talk about Big Data analytics as a potential solution. If yes, then we have the discussion. There's a lot of talk about Big Data, but it's in such an embryonic stage that a lot of companies, just like with cloud five years ago, jump on it without understanding why they do it in the first place, and what's the benefit. That would be my first discussion with a CISO.

A lot of companies are just scratching the surface of Big Data analytics. They're doing small proofs of concept. They're trying to jump from, say, descriptive analytics, which is all about what's happened in the past, to prescriptive analytics, what's going to come. But it's all very small scale -- they're just trying to get their head around it and the data. So it's really in its infancy stages, there's no change in the industry. It's coming from analytical startups especially focused on security analytics. From the startup market, you see them trying to tackle security analytics from a different angle. In the enterprise market, a lot of companies are thinking about it, but don't have a real idea about it, with a few exceptions. Again, they're just trying to figure out if it's really something for them and what is the value-add.

Thank you to Alvaro Cardenas and Wilco van Ginkel for making time for this interview.