Scalable Semantics-Based Detection of Similar Android Applications
The popularity and utility of smartphones rely on their vibrant application markets; however, plagiarism threatens the long-term health of these markets. In this paper, the authors present a scalable approach to detecting similar Android apps based on semantic information. They implement their approach in a tool called AnDarwin and evaluate it on 265,359 apps collected from 17 markets, including Google Play and numerous third-party markets. In contrast with earlier approaches, AnDarwin does not compare apps pairwise, which greatly increases its scalability. Additionally, AnDarwin relies only on the app code, and not on other information such as the app's market, signature, or description, which greatly increases its reliability.
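To illustrate why avoiding pairwise comparison matters for scalability, the following is a minimal sketch of one generic way to find similar items without comparing every pair: an inverted index from features to apps. This is not AnDarwin's actual algorithm, which the summary above does not describe. The function names, the feature strings, and the Jaccard threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_index(app_features):
    """Map each feature to the set of apps that contain it (an inverted index)."""
    index = defaultdict(set)
    for app, features in app_features.items():
        for feature in features:
            index[feature].add(app)
    return index


def similar_apps(app_features, threshold=0.5):
    """Return sorted pairs of apps whose Jaccard similarity meets the threshold.

    Only apps that share at least one feature are ever considered together,
    so apps with nothing in common are never compared.
    """
    # Normalize to sets so duplicate features do not distort the counts.
    features = {app: set(f) for app, f in app_features.items()}
    index = build_index(features)

    shared = defaultdict(int)  # (app_a, app_b) -> number of shared features
    for apps in index.values():
        ordered = sorted(apps)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                shared[(a, b)] += 1

    result = []
    for (a, b), count in shared.items():
        union = len(features[a]) + len(features[b]) - count
        if count / union >= threshold:
            result.append((a, b))
    return sorted(result)


# Hypothetical feature sets standing in for code-derived information.
apps = {
    "app_a": {"f1", "f2", "f3"},
    "app_b": {"f1", "f2", "f4"},
    "app_c": {"f9"},
}
pairs = similar_apps(apps)
```

In this sketch, `app_c` shares no feature with the other apps, so it never enters a comparison. The work grows with the number of apps that actually share features rather than with the square of the total number of apps.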