Distributed Data Clustering in Sensor Networks
Low overhead analysis of large distributed data sets is necessary for current data centers and for future sensor networks. In such systems, each node holds some data value, e.g., a local sensor read, and a concise picture of the global system state needs to be obtained. In resource-constrained environments like sensor networks, this needs to be done without collecting all the data at any location, i.e., in a distributed manner. To this end, the authors address the distributed clustering problem, in which numerous interconnected nodes compute a clustering of their data, i.e., partition these values into multiple clusters, and describe each cluster concisely.