C-Rank: A Link-based Similarity Measure for Scientific Literature Databases
As the number of people who use scientific literature databases grows, the demand for literature retrieval services has been steadily increased. One of the most popular retrieval services is to find a set of papers similar to the paper under consideration, which requires a measure that computes similarities between papers. Scientific literature databases exhibit two interesting characteristics that are different from general databases. First, the papers cited by old papers are often not included in the database due to technical and economic reasons. Second, since a paper references the papers published before it, few papers cite recently-published papers.