Ward off morphing malware and other attacks with machine learning

It's getting tougher to ward off blended threats and intrusions as system attacks become more sophisticated. Fight fire with fire and pit compute cycles against compute cycles.



Blended threats, silent intrusions, zero-day attacks and morphing malware are common problems for enterprise IT managers. Attacks and compromises are escalating, and the technology behind them is becoming more sophisticated. Nevertheless, those in charge of enterprise IT security are still responsible for mitigating any and all attacks before damage is done, regardless of the circumstances.

Yet fighting the latest threats has become an almost impossible chore, simply because attackers have turned to the power of the CPU and are creating learning algorithms that can leverage large amounts of data to uncover zero-day vulnerabilities. In other words, the signature-based technologies in use today will be hard-pressed to fight the threats of the future.

That situation creates a conundrum for most IT security managers. However, there are some who are thinking ahead of the curve and are leveraging the power of the CPU to fight threats actively and in real time. Those individuals are unlocking the power of machine learning to combat the threats created by attackers leveraging powerful algorithms.

Machine learning as a methodology

Machine learning for computer security has quickly become an established methodology for protecting systems from the threats of today and tomorrow. A case in point is the founding of the Machine Learning and Computer Security Research Institute (MLSEC.ORG), which offers open source algorithms that can be used to detect anomalies and to analyze source code for potential vulnerabilities. What’s more, MLSEC is backed by the Computer Security Group at the Institute of Computer Science at the University of Göttingen, in Göttingen, Germany. The institute also offers backgrounders, whitepapers, research and other publications on machine learning algorithms for computer security.

While MLSEC is mostly academic in nature, the institute does offer a viable starting point for IT administrators looking to better grasp the ideas behind machine learning. However, the biggest value comes from the vendors that adopt those ideas and build deployable products that leverage machine learning technologies. Regrettably, those vendors seem few and far between: only a handful have implemented some form of artificial intelligence to combat threats.

Deep Packet Inspection

However, some firewall and security appliance vendors are moving beyond Deep Packet Inspection (DPI) technologies and are building machine learning capabilities into their latest offerings. Although DPI solutions have historically been the leading cyber security technology, they have a significant weakness: DPI relies on packet payload signatures, so analysts must manually reverse-engineer applications to build the policies that control traffic across the network.

Simply put, if DPI is faced with unidentified traffic, it can fail to block attacks, at least until some manual reverse engineering is performed by skilled analysts to identify and generate a signature from a new application, and then appropriately classify it. In the meantime, the unidentified application continues to run on the network, compromising security and operational efficiency.
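The gap described above can be illustrated with a minimal sketch of payload-signature matching. The signature table and the classify function are hypothetical and for illustration only; they are not taken from any real DPI product, which would use far richer rules.

```python
from typing import Optional

# Hypothetical signature table: application name -> byte pattern
# expected somewhere in the packet payload (illustrative values only).
SIGNATURES = {
    "http": b"HTTP/1.1",
    "ssh": b"SSH-2.0",
}


def classify(payload: bytes) -> Optional[str]:
    """Return the first application whose signature appears in the payload.

    Returns None when no signature matches, which is the case where a
    purely signature-based system cannot identify the traffic.
    """
    for app, pattern in SIGNATURES.items():
        if pattern in payload:
            return app
    return None


known = classify(b"GET / HTTP/1.1\r\nHost: example.com\r\n")
unknown = classify(b"\x00\x01custom-protocol")
```

In this sketch the second payload comes back as None: until someone adds a signature for it, the traffic stays unclassified.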

Fortunately, machine learning has advanced the capabilities of DPI systems by automating the discovery of enterprise applications, the creation of signatures and the building of whitelists. Machine learning also adds context to Internet traffic, based on an algorithmic understanding of the relationships among data. Simply put, machine learning can eliminate the need for highly skilled, highly paid analysts.
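As a rough illustration of what automatic signature creation might look like, the sketch below derives a candidate signature from the bytes shared at the start of several unidentified payloads. This is a deliberately simple stand-in, not the method any particular vendor uses; the function names and the min_length parameter are assumptions made for the example.

```python
from typing import Optional, Sequence


def common_prefix(payloads: Sequence[bytes]) -> bytes:
    """Return the longest byte prefix shared by every payload."""
    if not payloads:
        return b""
    shortest = min(payloads, key=len)
    for i, byte in enumerate(shortest):
        # Stop at the first position where any payload disagrees.
        if any(p[i] != byte for p in payloads):
            return shortest[:i]
    return shortest


def learn_signature(payloads: Sequence[bytes], min_length: int = 4) -> Optional[bytes]:
    """Propose a signature for a group of unidentified payloads.

    min_length is a hypothetical guard: very short prefixes would match
    too much unrelated traffic, so they are rejected.
    """
    prefix = common_prefix(payloads)
    return prefix if len(prefix) >= min_length else None


samples = [b"XPROTO/1 hello", b"XPROTO/1 world", b"XPROTO/1 ping"]
candidate = learn_signature(samples)
```

A real system would cluster flows first and look at more than the leading bytes, but the idea is the same: the pattern is learned from observed traffic rather than written by hand.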

However, it is not simple to replace humans with machine learning algorithms for application signature detection, decoding and classification. Machines require advanced analytics, which are critical to executing the auto-discovery process for dealing with any unknown traffic on the network at any instant.

That network traffic might be unknown for a variety of reasons, including:

  • A never-before-seen network protocol or user application
  • Changes to known protocols or applications
  • Addition of new services (cloud, hybrid or otherwise)
  • Infrastructure changes
  • Additional sites, users or other access changes

By automating the detection of those changes and observing traffic over a period of time, systems are able to learn what is normal and what is anomalous, all without human intervention. That level of automation allows machine learning techniques to learn new signatures, their nature, and how they evolve over time. The critical proof point is whether automated analysis can determine the five W's, which are the most important elements for understanding an application and its appropriate usage profile. Those five W's are:

  • "What" (type of application)
  • "Why" (purpose of the application)
  • "Who" (the application owner/users)
  • "Where" (network addresses involved with these applications)
  • "When" (point at which new control policies are required to be enforced)
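The two ideas above, learning a baseline of normal traffic and recording the five W's for each application, can be sketched as follows. The AppProfile fields mirror the list; the is_anomalous function, its threshold and the sample numbers are assumptions chosen for illustration, using a simple standard-deviation test rather than any specific product's algorithm.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Sequence


@dataclass
class AppProfile:
    """Hypothetical record holding the five W's for one application."""
    what: str          # type of application
    why: str           # purpose of the application
    who: List[str]     # application owner/users
    where: List[str]   # network addresses involved
    when: str          # point at which new control policies must be enforced


def is_anomalous(history: Sequence[float], observed: float,
                 threshold: float = 3.0) -> bool:
    """Flag a value that lies more than `threshold` standard deviations
    from the mean of previously observed (assumed normal) values."""
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Illustrative numbers only: requests per minute seen during a learning period.
baseline = [100, 110, 95, 105, 90]
profile = AppProfile(
    what="web application",
    why="internal order entry",
    who=["sales-team"],
    where=["10.0.0.15"],
    when="after the learning period",
)
spike = is_anomalous(baseline, 500)
typical = is_anomalous(baseline, 102)
```

Here the value 500 is flagged and 102 is not. A production system would track many features per application and update the baseline continuously.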

Once systems can effectively deal with the five W's, machine learning can be applied and expanded upon to deliver real-time protection, all without relying on human intervention. With that in mind, it becomes easy to see why the artificial intelligence offered by machine learning will become a standard for security systems in the not-too-distant future.