Show of hands: how many of you know to inform your bank that you will be traveling abroad?

I first learned why this was necessary several years ago in Copenhagen, when a polite car rental agent informed me my bank credit card did not work. Flustered and embarrassed, I tried a different credit card, and that one worked. Already late, I put aside trying to figure out what happened.

That evening, I called my bank. A cheerful customer service representative reminded me it was my responsibility to inform the bank when I travel outside the US. I did not know it then, but that was my first encounter with an anomaly-detection system.

Why did the bank freeze my account? My credit card being used in a foreign country was considered a high-risk anomaly. With all the data breaches and stolen credit/debit card information traversing the internet, banks are being careful.

Curious about being an anomaly, I decided to learn how a computing system could get that all-knowing and powerful. I have since read numerous books and papers about the subject, each jammed full of great information if one happened to have a math degree. True confession: probability and statistics were not my strong suit in college.

A few months ago, a colleague who knew of my interest in anomaly detection mentioned I should read Practical Machine Learning: A New Look at Anomaly Detection, an eBook coauthored by Ted Dunning and Ellen Friedman and published by O’Reilly. My colleague was right; the book was helpful. As the title predicted, the book presented content in a practical fashion, which in my case translated into understanding.

How does anomaly detection work?

First let’s define anomaly. American Heritage Dictionary describes anomaly as a deviation or departure from the normal or common order, form or rule. Anomalies have also been called outliers, exceptions or peculiarities. When it comes to information technology, an anomaly detector is a software tool that seeks out abnormal digital entities in computing devices or network infrastructure.

Detecting anomalies is not that difficult once a baseline of what is considered normal has been created. However, there is a complication: deciding whether a detected anomaly is good, bad or indifferent. For example, a detector will flag a new computer as an anomaly. Moreover, it will do so on every scan, because the new computer is a departure from the normal baseline. So there must be a way to differentiate good unknowns from bad unknowns from indifferent unknowns. That is the job of a classifier. Like an anomaly detector, a classifier is a machine-learning program; it categorizes anomalies, keeps track of them, and updates the anomaly detector to avoid unwarranted alerts.
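To make the baseline-plus-classifier idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class name BaselineDetector, the device names, and the three labels are hypothetical, and a plain lookup table stands in for what would be a trained machine-learning classifier in a real product.

```python
from dataclasses import dataclass, field


@dataclass
class BaselineDetector:
    """Toy detector: anything outside the known-normal baseline is an anomaly."""
    baseline: set                                # items considered normal
    labels: dict = field(default_factory=dict)   # reviewed anomalies -> label

    def scan(self, observed):
        """Return (item, label) pairs that still deserve an alert."""
        alerts = []
        for item in observed:
            if item in self.baseline:
                continue  # normal, nothing to report
            label = self.labels.get(item, "unknown")
            if label in ("good", "indifferent"):
                continue  # already reviewed and judged harmless
            alerts.append((item, label))
        return alerts

    def classify(self, item, label):
        """Record a verdict so later scans stop (or keep) alerting on the item."""
        self.labels[item] = label


# Hypothetical device names, for illustration only.
detector = BaselineDetector(baseline={"laptop-01", "printer-02"})

first_scan = detector.scan(["laptop-01", "laptop-07"])   # new laptop is flagged
detector.classify("laptop-07", "good")                   # someone vouches for it
second_scan = detector.scan(["laptop-01", "laptop-07"])  # no repeat alert
```

In this sketch the first scan flags the new laptop as an unknown, and the second scan stays quiet because the verdict was fed back to the detector. An item labeled "bad" would keep producing alerts.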

Referring back to my travel example, as soon as I explained my predicament to customer service, the representative reactivated my credit card and reclassified my use of a credit card in Copenhagen from an unknown anomaly to an acceptable one, which in turn configured the bank’s anomaly detector to allow any additional charges I made while in Denmark.

What is significant about anomaly detection is that it takes away the bad guys’ element of surprise. The system is still reactive, but it moves well up the curve, giving IT departments more of a fighting chance.

Anomaly detection and bank phishing

To show how the bad guys lose their edge, Dunning and Friedman picked one of the most insidious exploits ever devised by the digital underground – a phishing attack on a bank website – and explained how an anomaly detector thwarts the exploit. The eBook mentioned, “It’s not only challenging to think of how to create an effective model and alert system – it’s also a challenge to stay one step ahead of the fraudsters. As you find ways to foil their attacks, they keep looking for new ways to commit theft.”

The whole point of this particular phishing attack is to steal log-in information from bank customers visiting what they assume is their bank’s website, and not the malicious website that it really is. Before explaining how anomaly detection prevents bad actors from stealing their victim’s money, let’s step through the attack process shown in the slide below.

Step one: Attackers spam customers of the bank they are targeting. The attackers use any of a number of phishing ploys, hoping a significant number of spam recipients will click on the email’s active link.

Step two: Most recipients know better than to follow the link and delete the email instead. However, there are always some who click. Those who are phished end up at a malicious replica of the banking site and are asked to type in their login information as well as respond to the CAPTCHA.

Step three: Whether the victims type in the correct CAPTCHA or not, each victim’s web browser is redirected to a page asking the victim to reenter the CAPTCHA. While the victim is complying, the attacker’s fraud-bot script absconds with the victim’s log-in credentials.

Step four: This time the CAPTCHA works. The victim is now viewing the real banking website, logged in, and none the wiser about what happened.

Step five: With the log-in information in hand, the bad actors can now access each victim’s accounts, withdraw money, write checks, etc.

How does an anomaly-detection system stop the attack?

The eBook mentioned that web servers record all log-in attempts, and that an anomaly-detection program could use those records to capture the pattern abnormalities shown in the following slide.

The authors explained further, “Because the fraud-bot script is forced to use actual image elements from the real bank site on the decoy site, there are, in fact, two sets of image downloads, plus the two log-in events (human and bot), on the same timeline.”
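One way to picture the pattern the authors describe is a small Python sketch that scans log events for two image-download bursts and two log-ins for the same account close together in time. This is not the authors' detector. The event layout (timestamp, source address, kind, account), the 120-second window, and the rule that the two log-ins must come from different addresses are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 120  # assumed: how close together the events must be


def suspicious_accounts(events, window=WINDOW_SECONDS):
    """events: iterable of (timestamp, source_ip, kind, account).

    kind is "images" (a burst of image downloads) or "login".
    Returns accounts with two log-ins from different addresses inside the
    window, where each address also downloaded the site's images nearby.
    """
    image_times = defaultdict(list)  # source_ip -> image download times
    logins = defaultdict(list)       # account -> [(time, source_ip)]
    for ts, ip, kind, account in events:
        if kind == "images":
            image_times[ip].append(ts)
        elif kind == "login":
            logins[account].append((ts, ip))

    def fetched_images_near(ip, ts):
        return any(abs(ts - t) <= window for t in image_times[ip])

    flagged = set()
    for account, attempts in logins.items():
        attempts.sort()
        # Compare each log-in with the next one for the same account.
        for (t1, ip1), (t2, ip2) in zip(attempts, attempts[1:]):
            if (ip1 != ip2 and t2 - t1 <= window
                    and fetched_images_near(ip1, t1)
                    and fetched_images_near(ip2, t2)):
                flagged.add(account)
    return flagged


# Made-up events using documentation-only IP addresses.
events = [
    (0, "203.0.113.9", "images", None),
    (5, "203.0.113.9", "login", "alice"),
    (40, "198.51.100.4", "images", None),
    (45, "198.51.100.4", "login", "alice"),
    (300, "192.0.2.7", "images", None),
    (310, "192.0.2.7", "login", "bob"),
]
flagged = suspicious_accounts(events)
```

Here the account "alice" shows the doubled pattern and is flagged, while "bob" logs in once and is left alone. A production system would need to cope with shared addresses, retries, and far noisier logs.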

Once detected, human intervention is needed. Bank employees must place the account on hold and notify the victimized customer. If notification occurs soon enough, the attackers will lose their chance to steal funds from the account. The authors said, “The need for rapid response is one reason building the detector on a system with a real-time file system is important.”

Probabilistic models

Ironically, the part I did not understand was the key to getting an efficient anomaly-detection system to work. Once again, the authors: “The common theme of all these anomaly detectors is that they use a probabilistic model of the data from the past. The log of the probability value that is produced by these models can be used to automatically set a threshold that, when exceeded, sets off an alarm.”
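For readers who, like me, find this easier to follow in code than in equations, here is a minimal Python sketch of the idea in that quote. It makes several assumptions that do not come from the eBook: past data is modeled with a single normal (bell-curve) distribution, the score is the negative log of the probability density so that rarer values score higher, and the threshold is taken from a high quantile of the scores seen in the past. The function names and sample numbers are hypothetical.

```python
import math
from statistics import NormalDist


def fit_model(history):
    """Fit a normal distribution to past observations (an assumed model)."""
    return NormalDist.from_samples(history)


def surprise(model, value):
    """Negative log of the probability density: larger means less likely."""
    density = model.pdf(value)
    if density == 0.0:
        return math.inf  # so unlikely that the density underflowed to zero
    return -math.log(density)


def pick_threshold(model, history, quantile=0.99):
    """Set the alarm level automatically from scores of the past data."""
    scores = sorted(surprise(model, x) for x in history)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


# Made-up history, e.g. log-ins per minute on a quiet day.
history = [52, 48, 50, 51, 49, 50, 53, 47, 50, 50]
model = fit_model(history)
threshold = pick_threshold(model, history)


def is_anomaly(value):
    """Alarm when the surprise score exceeds the learned threshold."""
    return surprise(model, value) > threshold
```

With this history, a reading of 51 stays under the threshold and a reading of 90 sets off the alarm. Nobody had to pick the cutoff by hand: it came from the past data, which is the point the authors make.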