Preprocessing DNS Log Data for Effective Data Mining
Source: Northeastern University
The Domain Name System (DNS) plays a critical role in directing Internet traffic. Defending DNS servers against bandwidth attacks depends in part on the ability to mine DNS log data effectively for statistical patterns. Processing DNS log data is a data-intensive problem, and it presents the challenges characteristic of that class of problem. When log capture fails, or when the DNS server experiences an outage (scheduled or unscheduled), the normal traffic pattern for that server is obscured by gaps in the record. Simple linear interpolation across these gaps does not preserve features such as traffic peaks, which can occur during an attack and are therefore of particular interest.
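The limitation of linear interpolation can be illustrated with a minimal sketch. The query counts, the position of the gap, and the helper name `linear_fill` below are all hypothetical and are not taken from the study's data or method; the example only shows how a peak that falls inside a logging gap vanishes once the gap is filled with a straight line.

```python
def linear_fill(series):
    """Fill None entries by linear interpolation between the nearest known neighbours.

    Gaps at the very start or end of the series are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


# Hypothetical queries-per-minute counts with a spike at minutes 4 and 5.
true_counts = [100, 110, 105, 120, 900, 950, 130, 115, 108]

# Simulate a logging outage that hides minutes 3 through 6.
observed = [None if 3 <= i <= 6 else v for i, v in enumerate(true_counts)]

filled = linear_fill(observed)
print("true peak:  ", max(true_counts))
print("filled peak:", max(filled))
```

In this made-up series the true peak is 950 queries per minute, while the interpolated series never exceeds 115, so any attack signature inside the gap is lost.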