PhishGILLNET - Phishing Detection Using Probabilistic Latent Semantic Analysis
Identity theft is one of the most profitable crimes committed by felons. In cyberspace, it is commonly achieved through phishing. The authors propose a robust server-side methodology for detecting phishing attacks, called phishGILLNET, which combines natural language processing and machine learning techniques. phishGILLNET takes a multi-layered approach. The first layer (phishGILLNET1) employs Probabilistic Latent Semantic Analysis (PLSA) to build a topic model. The topic model handles synonymy (multiple words with similar meanings), polysemy (words with multiple meanings), and other linguistic variations found in phishing.
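To make the topic-modeling step concrete, the following is a minimal sketch of PLSA fitted by expectation-maximization on a toy document-term count matrix. It is not the authors' implementation: the vocabulary, the counts, the number of topics, the iteration count, and the function name `plsa` are all assumptions chosen for illustration, and the abstract does not specify how phishGILLNET1 preprocesses messages or sets these parameters.

```python
import random

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix (list of lists).

    Returns (p_z_d, p_w_z): per-document topic distributions P(z|d)
    and per-topic word distributions P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        # Tiny smoothing term avoids division by zero for empty rows.
        row = [x + 1e-12 for x in row]
        total = sum(row)
        return [x / total for x in row]

    # Random initial distributions; each row sums to 1.
    p_z_d = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                post = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(post)
                if total == 0.0:
                    continue
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = n * post[z] / total
                    new_z_d[d][z] += r
                    new_w_z[z][w] += r
        # M-step: renormalize expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z

# Hypothetical toy vocabulary and counts (not data from the paper).
vocab = ["account", "verify", "password", "meeting", "agenda", "lunch"]
counts = [
    [3, 2, 1, 0, 0, 0],  # phishing-like message
    [2, 3, 2, 0, 0, 0],  # phishing-like message
    [0, 0, 0, 2, 3, 1],  # ordinary message
    [0, 0, 0, 3, 1, 2],  # ordinary message
]
doc_topics, topic_words = plsa(counts, n_topics=2)
```

Each row of `doc_topics` is a low-dimensional description of one message in terms of latent topics rather than individual words. Because words that tend to occur in the same messages, such as synonyms, load onto the same topics, messages that use different wording can still receive similar topic distributions, which is the property the abstract attributes to the topic model.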