Investigating the Change of Web Pages' Titles Over Time

Date Added: Jun 2009
Format: PDF

Inaccessible web pages are part of the browsing experience. The content of these pages, however, is often not completely lost but rather missing. Lexical signatures (LSs) generated from a web page's textual content have been shown to be suitable as search engine queries when trying to rediscover a missing web page. Since LSs are expensive to generate, the authors investigate the potential of web pages' titles, which are available at a lower cost. They present the results of studying how titles change over time: they take titles from copies of randomly sampled web pages provided by the Internet Archive and report the frequency of change as well as the degree of change in terms of the Levenshtein score.
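
To illustrate the measure mentioned above, the following is a minimal sketch of how the degree of change between two title snapshots could be computed with the Levenshtein (edit) distance. It is not the authors' implementation. The abstract does not say how the score is scaled, so dividing by the length of the longer title is an assumption made here, and the function names `levenshtein` and `normalized_change` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_change(old_title: str, new_title: str) -> float:
    """Scale the edit distance to the range [0, 1].

    Assumption: normalization by the longer title's length; the source
    does not specify a normalization.
    """
    longest = max(len(old_title), len(new_title))
    if longest == 0:
        return 0.0  # two empty titles are identical
    return levenshtein(old_title, new_title) / longest
```

Under this assumed scaling, a value of 0.0 means the title is unchanged between two archived copies, and 1.0 means no character of the longer title survived in place. Applying `normalized_change` to consecutive snapshots of the same page would give one change value per pair of copies.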