CloudVista: Interactive and Economical Visual Cluster Analysis for Big Data in the Cloud
With the development and deployment of ubiquitous information-sensing mobile devices, wireless sensor networks, RFID readers, simulations, and software logs, big data (e.g., terabytes to petabytes) has become the norm in many business and scientific applications. Because data analysis is often iterative and exploratory, data at this scale poses significant challenges. Among the big data analysis tasks in these applications, clustering and visualizing clusters raise unique challenges. This paper presents the CloudVista prototype system, which addresses the problems that arise when existing data reduction approaches are applied to big data. It promotes a whole-big-data visualization approach that preserves the details of the clustering structure. The prototype system has several merits.