Quantifying the Accuracy of the Ground Truth Associated With Internet Traffic Traces
Ground truth information for Internet traffic traces is often derived by means of port analysis and payload inspection (deep packet inspection, DPI). In this paper, the authors analyze the errors that port analysis and DPI make when assigning protocol labels to traffic traces. They compare the ground truth produced by these approaches with that derived by gt, a tool they developed that provides error-free ground truth at the application level by construction. Experimental results demonstrate that, depending on the protocols composing a trace, ground truth information from port analysis and DPI can be incorrect for up to 91% and 26% of the labeled bytes, respectively.
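
The error figures above are expressed as a share of labeled bytes. As a rough illustration of how such a byte-weighted error rate could be computed, the following Python sketch compares a candidate labeling (from port analysis or DPI) against a reference labeling (such as the one gt provides). The `Flow` record, its field names, and the choice to exclude unlabeled flows from the denominator are assumptions made for this example; they are not taken from the paper or from the gt tool.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Flow:
    """Hypothetical per-flow record; field names are illustrative only."""
    flow_id: str
    num_bytes: int
    reference_label: str             # label from the trusted source (e.g. gt)
    candidate_label: Optional[str]   # label from port analysis or DPI; None if unlabeled


def mislabeled_byte_fraction(flows: Iterable[Flow]) -> float:
    """Return the fraction of labeled bytes whose candidate label is wrong.

    Assumption: flows the candidate method left unlabeled are excluded,
    so the denominator counts only bytes that received a label.
    """
    labeled = [f for f in flows if f.candidate_label is not None]
    total_bytes = sum(f.num_bytes for f in labeled)
    if total_bytes == 0:
        return 0.0  # nothing was labeled, so no labeled byte can be wrong
    wrong_bytes = sum(
        f.num_bytes for f in labeled if f.candidate_label != f.reference_label
    )
    return wrong_bytes / total_bytes


# Made-up example data, not results from the paper.
example_flows = [
    Flow("f1", 100, "http", "http"),       # correctly labeled
    Flow("f2", 300, "bittorrent", "http"),  # mislabeled
    Flow("f3", 600, "skype", None),         # unlabeled, excluded
]
example_error = mislabeled_byte_fraction(example_flows)
```

Weighting by bytes rather than by flow count means a single large mislabeled flow can dominate the error, which is one reason the result depends on the protocol mix of the trace.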