How to detect bots: What you need to know

Akamai's CTO discusses why machine learning and cloud are important when it comes to security breaches, IoT-related attacks, and credential stuffing.

Dan Patterson, a Senior Producer for CBS News and CNET, interviewed Patrick Sullivan, Akamai CTO, Security Strategy, about how to detect and protect against bots. The following is an edited transcript of the interview.

Dan Patterson: This seems like a fascinating cat-and-mouse game. What are some of the evasion tactics that bot creators or at least bot users deploy? And if I'm an enterprise company, a B2B, or even a political campaign, what are some of the detection methods I might use?

Patrick Sullivan: With the really unsophisticated bots, run by somebody who does not have a great deal of skill, there are often some pretty easy tells. You may see a small number of IPs generating a huge number of requests, and those are pretty easy to spot. You can swat those down pretty easily. For many years, CAPTCHA has been an option for defenders to use.

At this point, though, thanks to machine learning on the adversarial side, computers are better at solving CAPTCHAs than human beings. With a pretty small training set, you can train a machine to solve a CAPTCHA riddle, and it will become more adept at that than a human being. So that's a defense that presents a high level of friction to an end user and is not terribly effective against an adversary.
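The "easy tell" Sullivan describes, a small number of IPs generating a huge number of requests, can be sketched in a few lines of Python. This is a minimal illustration rather than Akamai's method: the function name `flag_heavy_hitters` and both thresholds are assumptions chosen for the example, and the addresses come from ranges reserved for documentation.

```python
from collections import Counter

def flag_heavy_hitters(request_ips, max_share=0.2, min_requests=100):
    """Return IPs responsible for an outsized share of the traffic.

    request_ips: iterable of source IP strings, one entry per request.
    max_share and min_requests are illustrative thresholds, not tuned values.
    """
    counts = Counter(request_ips)
    total = sum(counts.values())
    return sorted(
        ip for ip, n in counts.items()
        if n >= min_requests and n / total >= max_share
    )

# One address sends 500 requests; 200 other addresses send one each.
log = ["203.0.113.5"] * 500 + [f"198.51.100.{i}" for i in range(200)]
print(flag_heavy_hitters(log))
```

A check like this only catches the unsophisticated case. An operator who spreads requests across many addresses stays under any per-IP threshold.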

We've moved up to looking at things like this: if somebody says they're on a MacBook running a Chrome browser, and we really interrogate that and fingerprint that device, can it run things like JavaScript? Can it do the things a normal device would be able to do if it is what it claims to be? You can find things there, but the bots tend to clean that up as well. So we've even looked at the TLS signatures.

As people encrypt communication, there's a two-way negotiation between the client and the server over which ciphers they'll accept in both directions. We've found that you can get some signal there as to whether something is a bot or a human. Then, as soon as we started to exercise that signal, we saw a massive explosion in randomization of the cipher suites that people used.
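To illustrate the idea of a TLS signature, here is a deliberately simplified Python sketch that hashes the ordered list of cipher suites a client offers and checks it against the fingerprints expected for the client it claims to be. It is an assumption-laden toy, not the signal Akamai uses: the helper names, the `chrome-mac` label, and the sample cipher list are hypothetical, and real fingerprinting schemes cover more of the handshake than the cipher list alone.

```python
import hashlib

def cipher_fingerprint(cipher_ids):
    """Hash the ordered list of offered cipher suite IDs into a short token."""
    joined = "-".join(str(c) for c in cipher_ids)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()[:16]

def matches_claim(claimed_client, cipher_ids, known):
    """True if the offered ciphers match a fingerprint known for the claimed client."""
    return cipher_fingerprint(cipher_ids) in known.get(claimed_client, set())

# Illustrative offer only; not the actual list sent by any particular browser.
sample_offer = [0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F]
known = {"chrome-mac": {cipher_fingerprint(sample_offer)}}

print(matches_claim("chrome-mac", sample_offer, known))
print(matches_claim("chrome-mac", list(reversed(sample_offer)), known))
```

Because the hash depends on the order and content of the list, a client that reshuffles its cipher suites on every connection produces a different fingerprint each time, which is why randomization weakens any defense built on a fixed list of known signatures.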

So these days the most effective area is really around machine learning, working from telemetry such as mouse movements or the orientation of a phone. That is harder to get around. It takes a lot more work for an adversary to create a synthetic, human-like experience in terms of user input/output than to run a simple bot. So that tends to be the state of the art today.
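As a rough illustration of the kind of telemetry mentioned above, the following Python sketch derives two simple features from a stream of mouse samples: how straight the path is, and how much the timing between samples varies. The feature names and the `mouse_features` helper are assumptions made for this example. A production system would presumably feed many more signals into a trained model rather than rely on two hand-picked numbers.

```python
import math
from statistics import pstdev

def mouse_features(events):
    """Compute simple behavioral features from mouse samples.

    events: list of (t_ms, x, y) tuples in time order, at least two samples.
    """
    pairs = list(zip(events, events[1:]))
    # Total distance travelled along the recorded path.
    path = sum(math.dist(a[1:], b[1:]) for a, b in pairs)
    # Straight-line distance from the first sample to the last.
    direct = math.dist(events[0][1:], events[-1][1:])
    gaps = [b[0] - a[0] for a, b in pairs]
    return {
        "straightness": direct / path if path else 1.0,
        "timing_jitter": pstdev(gaps),
    }

# A scripted pointer: perfectly straight, perfectly regular timing.
scripted = [(i * 10, i * 5, i * 5) for i in range(10)]
# A hand-made, more human-looking trace: curved path, uneven timing.
wobbly = [(0, 0, 0), (12, 3, 9), (31, 10, 14), (39, 22, 15), (60, 30, 30)]

print(mouse_features(scripted))
print(mouse_features(wobbly))
```

A straightness close to 1 together with near-zero timing jitter is the sort of pattern a naive script produces. An adversary has to do extra work to make synthetic input look irregular in a convincing way.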

Dan Patterson: Speaking of the state of the art, what about the cloud? It seems as though if I'm going to buy a bot kit and deploy it, I probably need a platform on which to deploy it. I could maybe build my own server structure, but it's probably a lot easier to use the cloud.

Patrick Sullivan: Absolutely. We see tons of requests coming from the cloud that are bots, and that's true of the good bots and the bad bots. If you think about a lot of the businesses who are providing a service to a website operator, they're building their bots, their automation on the cloud. So, just because the request is coming from the cloud, it doesn't necessarily mean that it's malicious, but it increases your suspicion.

Actually, what we find is that these bots, and you probably saw this in your exploration, give you the ability to plug in a network of proxy servers. What happens there is that rather than sending the requests from the bots you operate, you're able to rent time on a massive army of proxy servers, and the requests will actually come from those proxy servers. When we dig through that and figure out where these proxy servers come from, they tend to be home IoT devices with very poor security that have been compromised by the millions.

People monetize those by renting time on those devices, and that's really helpful for the attacker because they can then reduce the rate of requests from a single device. The requests can come from whatever geography the attacker would like: they can rent proxy servers in the home geography of the native users of that website. So it's a very effective evasion that you see used. Almost all of the tools that I'm sure you saw there had some ability to plug in a list of proxy servers.

Dan Patterson: Speaking with you is like talking to the past and connecting it to the future. Just a few years ago, we were having conversations about how home IoT and consumer IoT could be hijacked for automated types of attacks that use the cloud, and now you're telling me that your data shows that this reality has materialized. I think you said that phishing was involved in some of these attacks, is that correct?

Patrick Sullivan: They could be. If you look at the corpus of data on what breaches occurred in 2018 and what the root cause was, the number one cause of breaches was compromised credentials. There are a number of ways that could happen. In a targeted attack, maybe somebody would phish you and get you to go to a website that looks like a facsimile of the login experience for your email. You put in your credentials, somebody grabs them, and now they have your credentials. They take over your business email, and off they go. That's the targeted case.

The more common case we see, particularly in the consumer space, is what we call credential stuffing. Just as you can buy a tool to operate a botnet or a list of proxy servers, you can buy a list of previously compromised credentials: usernames and passwords. What these bot operators will do is attempt to reuse those credentials en masse all across the web at different sites.

And there, over a period of less than a year and a half, we saw about 55 billion of those attempts, people trying to reuse those credentials to compromise an account. Typically, there's an ecosystem, so once they've compromised the account, they'll hand it off to somebody else in the ecosystem to go and actually commit the fraud. The fraud will be different in finance than it is in retail, in media, or in gaming, but there's a pretty clear path to dollars for the attacker in each of those cases.
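The credential stuffing pattern described above, many different accounts tried from one place with almost all attempts failing, can be sketched as follows. This is a hedged illustration only: the `stuffing_suspects` function, the tuple layout, and the thresholds are assumptions made for the example, and an attacker using rented proxies would spread attempts across many sources, so a real defense would need to combine this with other signals.

```python
from collections import defaultdict

def stuffing_suspects(attempts, min_accounts=20, min_fail_rate=0.9):
    """Flag sources that try many distinct accounts and mostly fail.

    attempts: iterable of (source, username, succeeded) tuples.
    Thresholds are illustrative, not recommended values.
    """
    users = defaultdict(set)
    totals = defaultdict(int)
    fails = defaultdict(int)
    for source, username, succeeded in attempts:
        users[source].add(username)
        totals[source] += 1
        if not succeeded:
            fails[source] += 1
    return sorted(
        s for s in totals
        if len(users[s]) >= min_accounts and fails[s] / totals[s] >= min_fail_rate
    )

# One source tries 50 different usernames and gets in once;
# another source is a single user who mistypes a password, then succeeds.
attempts = [("proxy-a", f"user{i}", i == 7) for i in range(50)]
attempts += [("home-b", "alice", False), ("home-b", "alice", True)]
print(stuffing_suspects(attempts))
```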

Akamai CTO Patrick Sullivan

Image: TechRepublic