Cloudpress 2.0: A MapReduce Approach for News Retrieval on the Cloud
In the Internet era, an enormous number of news articles is published every minute of every day. Because of this explosive growth, news retrieval systems must process articles frequently and intensively. The news retrieval systems in use today cannot cope with such data-intensive computation. Cloudpress 2.0, presented here, is designed and implemented to be scalable, robust, and fault tolerant. All the processes involved in news retrieval, such as fetching, pre-processing, indexing, storing, and summarizing, exploit the MapReduce paradigm and the power of cloud computing.
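As a rough illustration of how one of these stages, indexing, can be phrased as a MapReduce job, the following minimal sketch builds an inverted index in a single Python process. It is not the Cloudpress 2.0 implementation: the function names (`map_article`, `shuffle`, `reduce_postings`, `build_index`), the tokenization rule, and the sample articles are all assumptions made for illustration, and a real deployment would run the map and reduce steps on a distributed framework rather than in memory.

```python
import re
from collections import defaultdict


def map_article(doc_id, text):
    """Map step (hypothetical): emit one (term, doc_id) pair per distinct term."""
    # Assumed tokenization: lowercase runs of letters and digits.
    for term in set(re.findall(r"[a-z0-9]+", text.lower())):
        yield term, doc_id


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_postings(term, doc_ids):
    """Reduce step (hypothetical): merge document ids into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_index(articles):
    """Run map, shuffle, and reduce in sequence over a dict of doc_id -> text."""
    pairs = (
        pair
        for doc_id, text in articles.items()
        for pair in map_article(doc_id, text)
    )
    grouped = shuffle(pairs)
    return dict(reduce_postings(term, ids) for term, ids in grouped.items())


# Made-up sample data, not taken from the paper.
articles = {
    "a1": "Cloud news retrieval",
    "a2": "News on the cloud",
}
index = build_index(articles)
```

Because each article is mapped independently and each term is reduced independently, both steps can in principle be spread across many machines, which is the property the abstract relies on for scalability.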