How big data can help keep enterprise networks secure

Data can prevent cyberattacks by rolling out new models to stop imminent threats, and by quickly testing controls against historical data to ensure minimal business impact, said MapR's John Omernik.

TechRepublic's Dan Patterson talked with John Omernik, a distinguished technologist at MapR Data Technologies, maker of AI and analytics platforms.

Patterson: Data is undeniably the new oil, and all professionals need to have some sort of proficiency with data, but security professionals especially need to understand how data can create vulnerabilities and how data can help keep your enterprise secure... I wonder if we can start with the basics. When we say data and when we say help keep companies secure, what kind of data are we talking about?

Omernik: Well, the enterprise produces data from a lot of different sources, whether it's from firewalls, DNS servers, web servers, or customer transaction logs. And all these different data sources combine to give an InfoSec practitioner a complete view of what's happening on their network.

And that's why putting together these data stacks regardless of their source is critical to identifying what's normal on a network and what's abnormal, i.e. the threats that are facing the enterprise today.
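As a rough illustration of what "normal versus abnormal" can mean once logs from several sources are combined, here is a minimal sketch in Python. Nothing in it comes from the interview or from any MapR product: the event fields (`source`, `host`), the per-day baseline, and the threshold of 3 standard deviations are all assumptions chosen for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def count_events_by_host(events):
    """Count events per host, regardless of which log source they came from."""
    return Counter(event["host"] for event in events)


def find_abnormal_hosts(baseline_days, today, threshold=3.0):
    """Flag hosts whose count today is far above their own history.

    baseline_days: list of per-day Counters (the hypothetical "normal").
    today: Counter of event counts for the current day.
    threshold: assumed cutoff, in standard deviations above the mean.
    """
    flagged = {}
    for host, count in today.items():
        history = [day.get(host, 0) for day in baseline_days]
        mu = mean(history)
        # Floor the deviation at 1.0 so very stable or brand-new hosts
        # do not cause a division by zero.
        sigma = max(pstdev(history), 1.0)
        score = (count - mu) / sigma
        if score > threshold:
            flagged[host] = round(score, 2)
    return flagged


# Illustrative data only: three quiet days, then a DNS burst from one host.
baseline_days = [
    Counter({"10.0.0.5": 100, "10.0.0.9": 40}),
    Counter({"10.0.0.5": 110, "10.0.0.9": 42}),
    Counter({"10.0.0.5": 90, "10.0.0.9": 38}),
]
firewall_events = [{"source": "firewall", "host": "10.0.0.5"}] * 50
dns_events = ([{"source": "dns", "host": "10.0.0.5"}] * 52
              + [{"source": "dns", "host": "10.0.0.9"}] * 400)
web_events = [{"source": "web", "host": "10.0.0.9"}] * 41

today = count_events_by_host(firewall_events + dns_events + web_events)
print(find_abnormal_hosts(baseline_days, today))
```

In this made-up data only `10.0.0.9` is flagged, because its combined count is far outside its own history. A production system would use richer features than raw counts, but the shape of the idea is the same: build the baseline from every source, then compare.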

Patterson: Yeah. So what types of tools ... I assume we're talking about machine learning, but what types of tools specifically do data professionals need to be proficient in or at least aware of to help them monitor networks?

Omernik: So, information security practitioners are a funny bunch and I count myself among that group. We are essentially data scientists. We've been looking at data since we started our careers, trying to understand it and then, in a sense, to provide business value to the enterprise.

Our value add is stopping threats, protecting customers and protecting the intellectual property of our organizations. So we have been data scientists since the start. What we need to be able to do now is to start to identify those skill sets and tools that data scientists just use as part of their day-to-day activities and say, wait a minute, maybe we should be learning those tools as well.

NoSQL tools like MongoDB and MapR-DB, and other tools out there, allow us to quickly synthesize that data and translate it into real-time models to protect our networks.

Patterson: What about data retention? We're now living in a post-GDPR world, and of course more data helps networks and helps network analysts make better decisions. But we now live in this age of kind of balancing user privacy with data retention and user security. So how do we make that delicate balance?

Omernik: Yeah. This is a really delicate balance and I'm glad you brought this up, because there's customer data, and those are the direct pieces of data that are covered under a law like GDPR. And while the law spells out how data practices and privacy need to happen, the actual execution and implementation of the law will take a few more months, if not years, before we really understand how it's going to affect companies.

Now there are other data sets within an organization that are not customer data. And retention on those is going to be very different than what the GDPR laws cover. What you need is a tool that can flexibly handle both the data that GDPR covers, which you can retain short term, and the other data sets that you can work on long term. And make sure that, under the hood, you have good data governance practices that ensure the pruning of the covered data and ensure that practitioners get their hands on and utilize all of the data as it's available.
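To make the idea of separate retention windows concrete, here is a minimal Python sketch of a pruning step. It is an illustration only: the category names, the 30-day and 365-day windows, and the record layout are hypothetical, and none of the numbers should be read as what GDPR requires.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention windows per data category (not legal guidance).
RETENTION = {
    "customer": timedelta(days=30),   # assumed short-term, covered data
    "network": timedelta(days=365),   # assumed long-term, non-customer data
}


def prune(records, now, retention=RETENTION):
    """Return only the records still inside their category's window.

    Records with an unknown category fall back to the shortest window,
    so unclassified data is pruned early rather than kept by accident.
    """
    shortest = min(retention.values())
    kept = []
    for record in records:
        limit = retention.get(record["category"], shortest)
        if now - record["created"] <= limit:
            kept.append(record)
    return kept


now = datetime(2018, 7, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "category": "customer", "created": now - timedelta(days=45)},
    {"id": 2, "category": "customer", "created": now - timedelta(days=10)},
    {"id": 3, "category": "network", "created": now - timedelta(days=200)},
    {"id": 4, "category": "network", "created": now - timedelta(days=400)},
]
print([r["id"] for r in prune(records, now)])
```

Here records 2 and 3 survive: the recent customer record and the network record that is still inside its longer window. Keeping the policy in one table like `RETENTION` is a design choice that makes the governance rule easy to review and change without touching the pruning logic.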

Patterson: What about trends like DevOps? Although I shouldn't really say it's a trend. DevOps has been used by IT and biz type professionals for years. But how can DevOps help with that balance between security and privacy? And what DevOps advice do you have for those that want to make sure that they are getting the most out of their data?

Omernik: Yeah. Well, first of all, to my fellow InfoSec practitioners: you need to learn modern DevOps, if for nothing else than to be able to secure it. You can't secure something you don't understand. So, modern DevOps practices: using containers, Docker, et cetera.

With orchestrators like Kubernetes, Mesos, and Docker, you need to understand what those technologies do to deploy code on your network, how you can ensure that they're not deploying vulnerabilities on your network, and how you can move forward in securing that aspect of your enterprise.

Now I actually think that InfoSec practitioners should not only learn but they should embrace those DevOps practices because they allow ... These practices allow an InfoSec practitioner to move quickly, to say, hey we are now needing to deploy a model, a signature rule, something in our network to protect our customers, to protect our network.

We need to do it fast. Well, these DevOps practices help ensure that you can be fast and you can be safe. So you're not breaking other models that are already in production or stepping on the resources used by other users of your network.

So these are important things that can help an InfoSec practitioner move faster. But just in general I think practitioners need to learn and embrace these tools so they can add the security flavor to what's happening in their network.

Patterson: John, that's great advice. I wonder if you can leave us with a forecast, maybe looking in the next say 6, 18, 36 months, in terms of not just security trends but emerging big data trends that could help security professionals stay, as you say, nimble and keep networks secure.

Omernik: Right. Really, the advanced streaming architecture, I think, is something that InfoSec practitioners need to embrace. I think in the past few years the Hadoop models have allowed batch work. SIEMs, the security information and event management systems, offer a lot of point solutions for InfoSec.

Hadoop will let you take a more batch-level and more holistic approach. I think as InfoSec practitioners grow, event-driven architectures are going to be critical because, let's just talk about a bank. You have bigger branches. You have multiple service centers.

You have to be able to aggregate and look at your threat profile across multiple different locations. And I guess the event-driven architecture allows you to do that while also being real time because these threats are going to get faster and faster and faster.

And if we have to work in hour-long batch windows, where a threat might occur and data may be exfiltrated from your organization, but it happened an hour ago, that's not going to be acceptable to the business. We need to start bringing this up in real time: this is happening, and we need to take action now.
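The contrast between an hourly batch job and an event-driven approach can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular streaming product: the event fields (`type`, `account`, `location`, `ts`), the 60-second window, and the limit of 5 failures are all assumptions, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindowDetector:
    """Alert as soon as one account has too many failed logins, counted
    across all locations, inside a short sliding time window."""

    def __init__(self, window_seconds=60, max_failures=5):
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self.recent = {}  # account -> deque of (timestamp, location)

    def on_event(self, event):
        """Handle one event on arrival; return an alert dict or None."""
        if event["type"] != "login_failed":
            return None
        window = self.recent.setdefault(event["account"], deque())
        window.append((event["ts"], event["location"]))
        # Drop failures that have aged out of the window.
        cutoff = event["ts"] - self.window_seconds
        while window and window[0][0] < cutoff:
            window.popleft()
        if len(window) > self.max_failures:
            return {
                "account": event["account"],
                "failures": len(window),
                "locations": sorted({loc for _, loc in window}),
                "ts": event["ts"],
            }
        return None


# Illustrative stream: failures for one account from two branches.
detector = SlidingWindowDetector()
for i in range(6):
    event = {
        "type": "login_failed",
        "account": "acct-42",
        "location": "branch-a" if i % 2 == 0 else "branch-b",
        "ts": i * 5,  # seconds
    }
    alert = detector.on_event(event)
    if alert:
        print(alert)
```

With this made-up stream, the alert fires on the sixth failure, 25 seconds after the first, and it lists both branches. A batch job over the same data would report the same fact, but only after the window closed.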

And that's it, I think: event-driven architectures, based on a strong data platform and using DevOps. Those things all combined will make information security done with data even more nimble in the future.
