WebContent: Efficient P2P Warehousing of Web Data
Source: VLDB Endowment
The authors present WebContent, a platform for managing distributed repositories of XML and Semantic Web data. The platform integrates various data processing building blocks (crawling, translation, semantic annotation, full-text search, structured XML querying, and semantic querying), each exposed as a Web service, into an efficient large-scale system. Calls to these services are combined inside ActiveXML documents, which are XML documents that include service calls. An ActiveXML optimizer is used to (1) distribute computations efficiently among sites; (2) perform XQuery-specific optimizations by leveraging an algebraic XQuery optimizer; and (3) given an XML query, choose the most appropriate of several distributed indices to answer it.
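
To make the idea of an XML document that includes service calls more concrete, the following is a minimal Python sketch, not the platform's actual syntax or API. The `sc` element, its `service` and `arg` attributes, the `SERVICES` table, and the `materialize` function are all hypothetical names chosen for illustration; the two "services" are local stubs standing in for remote Web services.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for remote Web services. A real deployment would
# invoke them over the network; here they just return small XML fragments.
SERVICES = {
    "crawl": lambda arg: f"<page url='{arg}'>raw content</page>",
    "annotate": lambda arg: f"<annotation topic='{arg}'/>",
}

# Illustrative document: ordinary XML mixed with embedded service calls.
# The <sc> element and its attributes are assumptions, not the real format.
DOC = """
<report>
  <title>Example</title>
  <sc service="crawl" arg="http://example.org"/>
  <sc service="annotate" arg="xml"/>
</report>
"""


def materialize(root):
    """Replace every <sc> element with the XML returned by its service."""
    # Snapshot the tree first so that replacements do not disturb iteration.
    for parent in list(root.iter()):
        for i, child in enumerate(list(parent)):
            if child.tag == "sc":
                service = SERVICES[child.get("service")]
                result = ET.fromstring(service(child.get("arg")))
                result.tail = child.tail  # preserve surrounding whitespace
                parent[i] = result
    return root


if __name__ == "__main__":
    tree = materialize(ET.fromstring(DOC))
    print(ET.tostring(tree, encoding="unicode"))
```

Running the script prints the document with each embedded call replaced by the fragment its stub returned. The sketch evaluates every call eagerly and locally; deciding where and when each call should run is the kind of choice the abstract attributes to the ActiveXML optimizer.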