Networking

Large Scale Metadata Harvesting Over Low Bandwidth Connections

Date Added: Mar 2010
Format: PDF

There seems to be a widespread perception that large scale metadata harvesting requires a large amount of bandwidth and that a fast Internet connection is essential for such an endeavour. In this paper, a simple Python-based metadata harvester is built and run over a residential broadband connection. The results show that a metadata collection on the order of millions of records can be built in just a few days over such a connection.