The melding of big data, predictive analytics, and threat intelligence may be just what the doctor ordered to cure the internet of digital threats. To get an idea of what can be done when these components are coalesced into a powerful tool, check out the Live Threat Map created by Norse Corporation, as seen above. Every 5 seconds, 8 million Norse-controlled honeypots are polled for threat intelligence. Then the data is analyzed, manipulated, and presented to the Live Threat Map web application.

Jeff Harrell, senior director of product marketing for Norse, said that he was surprised that less than 1% of the daily total of captured threat intelligence is displayed on the map. Any more would make it unreadable. Harrell said the daily total of collected threat data is fast approaching 130 terabytes, with an accumulated total of over 6 petabytes in their system.
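To put Harrell's figures in perspective, here is a back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes), which the article does not specify, and the function names are made up for illustration.

```python
TB = 10**12  # assumption: decimal terabytes
PB = 10**15  # assumption: decimal petabytes
SECONDS_PER_DAY = 86_400


def average_rate_bytes_per_second(daily_bytes):
    """Average ingest rate implied by a daily collection total."""
    return daily_bytes / SECONDS_PER_DAY


def days_of_intake(total_bytes, daily_bytes):
    """How many days at the current daily rate equal the accumulated total."""
    return total_bytes / daily_bytes


daily = 130 * TB
total = 6 * PB
print(f"{average_rate_bytes_per_second(daily) / 10**9:.2f} GB/s")  # prints 1.50 GB/s
print(f"{days_of_intake(total, daily):.1f} days")                  # prints 46.2 days
```

In other words, 130 terabytes a day averages out to roughly 1.5 gigabytes every second, and the 6-petabyte store is equivalent to about 46 days of collection at that rate. The second figure is only an equivalence, since the daily total has presumably grown over time.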

DarkWatch uses Norse’s worldwide infrastructure

Although the Live Threat Map is nice, Norse’s major objective is to collect threat intelligence from the internet, process it with big data predictive analysis, and convert it into machine-readable threat intelligence that Norse clients can use.

Norse released DarkWatch this week. DarkWatch is an attack intelligence offering, available as either a virtual or hardware appliance, that in addition to detecting current threats will also detect what Norse calls “virtualization-evading malware attacks using cloud vectors.”

Norse also announced a privately funded study to identify new Middle East state-sponsored cyber activity directed against US infrastructure. Some of the key findings of the study were uncovered during testing of the new Norse DarkWatch appliance and associated technologies. The study will be published later this year.

How the Live Threat Map works

Knowing how long simple web pages take to load, I was curious as to how Norse’s Live Threat Map could display near real-time information while first having to compile that much data from around the world, manipulate it into a usable format, and send it to the web server hosting the Live Threat Map. Harrell said the Norse platform includes 16 core routers located on Tier 1 fiber network rings that direct the collection of data from over 150 locations in more than 40 countries.

Harrell said, “The platform has access to approximately 16 million IP addresses spread across every aspect of the IPv4 space to facilitate the collection of threat data. The captured information is then fed to Graphics Processing Unit (GPU) calculation clusters in 40 network operating centers around the globe enabling data collection, analysis, and delivery of intelligence.”

Harrell said, “Norse has built GPU clusters with the processing power of a supercomputer offloading the data processing from the computer onto these clusters. That’s how we can deliver threat intelligence to our customers within five seconds of it happening.”

Harrell also mentioned that the Live Threat Map relies on information from honeypots. Harrell said a good analogy is fishing: putting out an inviting computing device in the hope that a bad guy attacks it. Norse tries to place honeypots where major internet trunk lines intersect. That is where the most data flows, and inevitably where bad actors congregate.

Harrell said, “Honeypots support the emulation of thousands of networks and applications in order to appear as desirable targets for malware, bots, and hackers.” The proprietary honeypot is capable of facing the internet with millions of different appearances such as users, servers, and network devices. The honeypot could mimic a bank, Microsoft Exchange server, Linux web servers, and just about every computing device that is in existence. For example, honeypots simulating client computers could be configured to emulate browser-based actions causing compromised websites to reveal their malware.
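Norse's honeypot is proprietary, so the following is not its implementation. It is a minimal sketch of the general idea of a low-interaction honeypot: a listener that greets visitors with the banner of a service it is pretending to be and records whatever they send first. The persona names, banners, and the `demo` helper are all hypothetical.

```python
import socket
import socketserver
import threading
from datetime import datetime, timezone

# Hypothetical greeting banners, one per emulated service.
PERSONAS = {
    "smtp": b"220 mail.example.com ESMTP ready\r\n",
    "ftp": b"220 FTP server ready\r\n",
}


class HoneypotHandler(socketserver.BaseRequestHandler):
    def handle(self):
        server = self.server
        self.request.settimeout(2.0)
        self.request.sendall(server.banner)  # look like the real service
        try:
            payload = self.request.recv(1024)  # whatever the visitor tries first
        except OSError:  # includes timeouts
            payload = b""
        server.events.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "source_ip": self.client_address[0],
            "persona": server.persona,
            "payload": payload,
        })


class HoneypotServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = False  # lets server_close() wait for in-flight handlers

    def __init__(self, address, persona):
        self.persona = persona
        self.banner = PERSONAS[persona]
        self.events = []  # in-memory log; a real sensor would ship these elsewhere
        super().__init__(address, HoneypotHandler)


def demo(persona="smtp"):
    """Start a honeypot on a loopback port, probe it once, return (banner, events)."""
    server = HoneypotServer(("127.0.0.1", 0), persona)  # port 0: let the OS pick
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2.0) as conn:
            banner = conn.recv(1024)
            conn.sendall(b"EHLO scanner.example\r\n")
    finally:
        server.shutdown()
        server.server_close()
    return banner, server.events
```

Calling `demo("ftp")` instead of `demo("smtp")` makes the same listener present a different face, which is the property Harrell describes, though at a vastly smaller scale.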

Other threat sensors

Honeypots are not the only method used by Norse to gather threat intelligence. As the slide above depicts, Norse uses a wide variety of sensors and tactics to gain threat intelligence. Harrell explained how each is used:

Internet Relay Chat (IRC): Bad actors use IRC to exchange ideas and plans. By participating in these chats, researchers are able to gain intelligence on new and modified exploits.

Border Gateway Protocol (BGP): The Internet Assigned Numbers Authority (IANA) is responsible for the global coordination of the DNS Root, IP addressing, and other Internet protocol resources. By maintaining current copies of this information, Norse can tell if an IP address is valid or fake, and if a valid IP address has been hijacked or spoofed.
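As a rough illustration of that kind of check, the sketch below classifies an address against two small tables: a short list of well-known special-use IPv4 blocks, and a snapshot mapping prefixes to the network expected to originate them. The snapshot contents, AS numbers, and the `classify` function are hypothetical (documentation-range values are used), and Norse's actual method is not described in this article.

```python
import ipaddress

# A few well-known special-use IPv4 blocks (not exhaustive).
SPECIAL_USE = [ipaddress.ip_network(p) for p in (
    "0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
    "172.16.0.0/12", "192.168.0.0/16", "240.0.0.0/4",
)]

# Hypothetical snapshot: prefix -> AS number expected to originate it.
EXPECTED_ORIGIN = {
    ipaddress.ip_network("198.51.100.0/24"): 64500,
    ipaddress.ip_network("203.0.113.0/25"): 64501,
}


def classify(address, observed_origin_asn):
    """Return a coarse verdict for an address seen with a given origin AS."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return "malformed"
    if any(ip in net for net in SPECIAL_USE):
        return "special-use"  # should never appear as a public source
    covering = [net for net in EXPECTED_ORIGIN if ip in net]
    if not covering:
        return "unknown-prefix"  # not in our snapshot at all
    best = max(covering, key=lambda net: net.prefixlen)  # longest prefix wins
    if EXPECTED_ORIGIN[best] != observed_origin_asn:
        return "origin-mismatch"  # possible hijack, worth a closer look
    return "consistent"


print(classify("10.1.2.3", 64500))      # special-use
print(classify("198.51.100.7", 64500))  # consistent
print(classify("198.51.100.7", 64999))  # origin-mismatch
```

A mismatch alone does not prove a hijack; it only flags an address for further analysis.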

Peer-to-Peer (P2P): Participants interested in communicating without detection often use P2P connections. By also participating in P2P groups, researchers gain valuable insight on possible new attacks or modifications to ongoing attacks.

Crawlers: Norse developed proprietary crawlers that search text or documents for indicators of potential malicious behavior or leaked confidential information including data indicating threats or compromises.
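Norse's crawlers are proprietary, but the basic idea of scanning text for indicators can be sketched with a handful of regular expressions. The patterns and the `find_indicators` function below are illustrative assumptions, not Norse's rules, and a real crawler would need far more patterns and much better false-positive handling.

```python
import re

# Illustrative patterns only.
PATTERNS = {
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "credential": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_indicators(text):
    """Return {indicator type: [matches]} for every pattern that matches."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


sample = (
    "dump posted from 203.0.113.9 password: hunter2 "
    "file hash d41d8cd98f00b204e9800998ecf8427e"
)
print(find_indicators(sample))
```

For the sample string, the function reports one IPv4 address, one MD5-shaped hash, and one credential-like fragment.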

Anonymous Proxies: While originally designed to protect the innocent, anonymous proxies are now used to launch and mask cyber attacks. Real-time monitoring and detection of new unpublished anonymous proxy exit nodes allow researchers to raise red flags about impending attacks.
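At its simplest, spotting "new unpublished" exit nodes amounts to comparing what sensors observe against what is already publicly listed. The sketch below assumes both lists are available as plain IP address strings; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def new_exit_nodes(observed, published):
    """Exit nodes seen by sensors that do not appear on any published list."""
    fresh = set(observed) - set(published)
    return sorted(fresh, key=ipaddress.ip_address)  # numeric, not string, order


observed = ["198.51.100.20", "198.51.100.3", "203.0.113.5"]
published = ["203.0.113.5"]
print(new_exit_nodes(observed, published))
# prints ['198.51.100.3', '198.51.100.20']
```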

Open source: Norse does a few unique things with open-source applications on their honeypot network:

  • Emulate popular applications that more often than not are unsecured. This attracts bad guys that end up divulging their attack techniques.
  • Offer free non-logging DNS services, which attract those who do not want to be detected. This information along with other data might generate an alert in time to warn potential victims.

Last thought

Harrell said China is the country that originates the most attacks, with the US a close second. The US was the number one receiver of attacks. Interestingly, Harrell added that it is common knowledge that the high number of attacks emanating from China stems from so many Chinese computers being easy to compromise, since they are seldom new or running up-to-date operating systems.