Scalable Regression Tree Learning on Hadoop Using OpenPlanet
Source: University of Southampton
As scientific and engineering domains attempt to analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool for building prediction models. The regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables from a set of input features. MapReduce is well suited to such data-intensive learning applications, and a proprietary regression tree algorithm built on MapReduce, PLANET, has been proposed earlier. In this paper, the authors describe an open source implementation of this algorithm, OpenPlanet, on the Hadoop framework using a hybrid approach.
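To make the fit between regression trees and MapReduce concrete, the following is a minimal sketch of how a single split might be chosen in a map/reduce style. It is not OpenPlanet's or PLANET's actual code. It assumes one numeric feature, a fixed list of candidate thresholds, and a squared-error split criterion, and every function name here (map_partition, reduce_stats, best_split) is hypothetical. Each "mapper" summarizes its partition as counts, sums, and sums of squares, so the "reducer" can score every threshold without seeing the raw rows.

```python
from collections import defaultdict

TOTAL = "total"  # hypothetical key for statistics over all rows


def map_partition(rows, thresholds):
    """Mapper sketch: per-threshold [count, sum, sum of squares] of the
    target y for rows with x < threshold, plus totals for the partition."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y in rows:
        keys = [TOTAL] + [t for t in thresholds if x < t]
        for key in keys:
            entry = stats[key]
            entry[0] += 1
            entry[1] += y
            entry[2] += y * y
    return dict(stats)


def reduce_stats(partials):
    """Reducer sketch: add up the statistics emitted by all mappers."""
    merged = defaultdict(lambda: [0, 0.0, 0.0])
    for partial in partials:
        for key, (n, s, ss) in partial.items():
            entry = merged[key]
            entry[0] += n
            entry[1] += s
            entry[2] += ss
    return dict(merged)


def sse(n, s, ss):
    """Sum of squared errors around the mean, from aggregated statistics."""
    return ss - (s * s) / n if n else 0.0


def best_split(merged, thresholds):
    """Return (threshold, cost) minimizing left SSE + right SSE, or None."""
    n, s, ss = merged[TOTAL]
    best = None
    for t in thresholds:
        ln, ls, lss = merged.get(t, (0, 0.0, 0.0))
        rn, rs, rss = n - ln, s - ls, ss - lss
        if ln == 0 or rn == 0:
            continue  # a split must leave rows on both sides
        cost = sse(ln, ls, lss) + sse(rn, rs, rss)
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Two partitions of (feature, target) rows, as two mappers would see them.
partitions = [[(1.0, 1.0), (2.0, 1.2)], [(8.0, 5.0), (9.0, 5.2)]]
thresholds = [1.5, 5.0, 8.5]
merged = reduce_stats(map_partition(p, thresholds) for p in partitions)
split = best_split(merged, thresholds)
```

Because the statistics are additive, the merge step is independent of how the rows were partitioned, which is the property that lets this kind of computation spread across many machines.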
Size: 1464.32
Date: Apr 2012



