Similarity Detection in Source Code Using Data Mining Techniques

With the advent of the Internet, the world has shrunk into a single global village. Millions of resources are just a click away from any user, wherever that user may be physically. This convenience has its downsides too. Plagiarism is one of them, and it has become rampant. In this paper, the authors present a study of three techniques, Jaccard Similarity (JS), Cosine Similarity (CS), and Jaccard Similarity with Shingles, as applied to source code plagiarism, and compare the results obtained.
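For readers unfamiliar with the three measures named in the abstract, the following is a minimal Python sketch of how each could be computed on a pair of source files. It is not the authors' implementation: the abstract does not describe their tokenization, weighting, or shingle size, so the regex tokenizer, the raw token counts used for cosine similarity, the default shingle length k = 3, and the function names (tokenize, jaccard, cosine, shingles, compare) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(source: str) -> list[str]:
    """Assumed tokenizer: identifiers, digit runs, or single non-space symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; taken as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors (raw counts, an assumption)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shingles(tokens: list[str], k: int = 3) -> set:
    """Set of all runs of k consecutive tokens (k = 3 is an assumed default)."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def compare(src_a: str, src_b: str, k: int = 3) -> dict:
    """Compute all three measures for a pair of source strings."""
    ta, tb = tokenize(src_a), tokenize(src_b)
    return {
        "jaccard": jaccard(set(ta), set(tb)),
        "cosine": cosine(Counter(ta), Counter(tb)),
        "jaccard_shingles": jaccard(shingles(ta, k), shingles(tb, k)),
    }


if __name__ == "__main__":
    original = "total = price + tax"
    reordered = "tax + price = total"
    print(compare(original, reordered))
```

In this sketch, plain Jaccard and cosine similarity ignore token order, so a pair of snippets that contain the same tokens in a different order scores as highly similar on both. The shingled variant compares runs of consecutive tokens, so the same pair scores much lower. This is only one way the three measures can diverge and says nothing about the results reported in the paper.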

Provided by: EuroJournals
Topic: Big Data
Date Added: Oct 2011
Format: PDF
