Date Added: Jul 2010
Provisioning and maintenance of infrastructure for Web-based digital library search engines such as CiteSeer present several challenges. CiteSeer provides autonomous citation indexing, full-text indexing, and extensive document metadata for documents crawled from the web across computer and information sciences and related fields. Infrastructure virtualization and cloud computing are particularly attractive choices for CiteSeer, which is challenged by growth in the size of the indexed document collection, in new features, and, most prominently, in usage. In this paper, the authors discuss the constraints and choices faced by information retrieval systems like CiteSeer by exploring in detail several aspects of placing CiteSeer into current cloud infrastructure offerings.