Detecting and Measuring Similarity in Code Clones


Executive Summary

Most previous work on code-clone detection has focused on finding identical clones, or clones that are identical up to identifiers and literal values. However, it is often important to find similar clones, too. One challenge is that the definition of similarity depends on the context in which clones are being found. Therefore, the authors propose new techniques for finding similar code blocks and for quantifying their similarity. The techniques can be used to find clone clusters: sets of code blocks that all lie within a user-supplied similarity threshold of each other. Also, given one code block, they can find all similar blocks and present them rank-ordered by similarity. The techniques have been used in a clone-detection tool for C programs.
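
The two uses described above, threshold-based clusters and rank-ordered lookup, can be illustrated with a minimal Python sketch. The summary does not say how the authors measure similarity, so this sketch substitutes Jaccard similarity over token sets purely as a stand-in. The function names (tokens, similarity, rank_similar, is_cluster) are hypothetical and are not taken from the authors' tool.

```python
import re
from itertools import combinations


def tokens(block):
    """Split a code block into a set of identifier, number, and symbol tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|\S", block))


def similarity(a, b):
    """Jaccard similarity of two blocks' token sets, in [0, 1].

    Assumption: a stand-in measure, not the metric used by the authors.
    """
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty blocks are treated as identical
    return len(ta & tb) / len(ta | tb)


def rank_similar(query, blocks):
    """Return (similarity, block) pairs, most similar to the query first."""
    scored = [(similarity(query, b), b) for b in blocks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def is_cluster(blocks, threshold):
    """True if every pair of blocks meets the similarity threshold."""
    return all(similarity(a, b) >= threshold
               for a, b in combinations(blocks, 2))


if __name__ == "__main__":
    a = "x = a + b;"
    b = "y = a + b;"
    c = "while (n) { n--; }"
    for score, block in rank_similar(a, [c, b]):
        print(f"{score:.2f}  {block}")
    print(is_cluster([a, b], 0.7), is_cluster([a, b, c], 0.7))
```

Because the cluster check requires every pair to meet the threshold, adding a single dissimilar block disqualifies the whole set. A context-specific notion of similarity could be explored by swapping out the similarity function alone.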

  • Format: PDF
  • Size: 144.6 KB