Automatic Labelling and Document Clustering for Forensic Analysis
In computer forensic analysis, the retrieved data is largely unstructured text, which is difficult for examiners to analyse. The proposed approach makes this analysis systematic: the unstructured data is given structure using well-known, high-quality clustering algorithms together with an automatic cluster labelling method. TXT, DOC, and PDF files are indexed, the number of clusters is estimated automatically, and each cluster is labelled automatically. The approach uses the DBSCAN and K-means algorithms, which make it easier to retrieve the information most relevant to a forensic analysis. Automated methods of analysis such as these are of great interest.
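As a rough illustration of the kind of pipeline summarised above, the sketch below clusters a handful of short texts with a small DBSCAN implementation over TF-IDF vectors and labels each cluster with its heaviest terms. Everything in it is an assumption made for illustration, not the authors' implementation: the tokenizer, the cosine distance, the `eps` and `min_pts` values, the top-term labelling rule, and the toy documents are all hypothetical. Text extraction from DOC and PDF files and the K-means step are omitted; one plausible use, not confirmed by the abstract, would be to pass the cluster count found here to K-means as its k.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphabetic tokens of 3+ letters; a stand-in for real indexing.
    return re.findall(r"[a-z]{3,}", text.lower())

def tfidf_vectors(docs):
    # L2-normalised TF-IDF vectors stored as {term: weight} dicts.
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    df = Counter(term for counts in counts_per_doc for term in counts)
    n = len(docs)
    vectors = []
    for counts in counts_per_doc:
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine_distance(a, b):
    # Both vectors are unit length, so distance is 1 minus the dot product.
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())

def dbscan(vectors, eps, min_pts):
    # Returns one cluster id per vector; -1 marks noise.
    labels = [None] * len(vectors)  # None means "not visited yet"

    def neighbours(i):
        return [j for j in range(len(vectors))
                if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

def label_clusters(vectors, labels, top_n=3):
    # Label each cluster with the terms of highest summed TF-IDF weight.
    totals = {}
    for vec, lab in zip(vectors, labels):
        if lab != -1:
            totals.setdefault(lab, Counter()).update(vec)
    return {lab: [t for t, _ in c.most_common(top_n)]
            for lab, c in totals.items()}

if __name__ == "__main__":
    # Hypothetical toy corpus standing in for text extracted from seized files.
    docs = [
        "invoice payment bank transfer invoice",
        "bank transfer payment overdue invoice",
        "payment invoice bank account",
        "malware trojan infected registry",
        "trojan malware registry keys infected",
        "registry malware trojan payload",
        "holiday photos beach",
    ]
    vectors = tfidf_vectors(docs)
    labels = dbscan(vectors, eps=0.6, min_pts=2)
    print("estimated clusters:", len(set(labels) - {-1}))
    print("assignments:", labels)
    print("cluster labels:", label_clusters(vectors, labels))
```

The number of clusters is not supplied anywhere: it falls out of the density parameters, which is the property that makes DBSCAN a natural fit for estimating it. The results are sensitive to `eps` and `min_pts`, so on real evidence these would need tuning.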