Association for Computing Machinery
The motivation for the BID Data Suite is exploratory data analysis, which involves sifting through data, forming hypotheses about its structure, and rapidly testing them. This paper describes the BID Data Suite, a collection of hardware, software and design patterns that enable fast, large-scale data mining at very low cost. By co-designing all of these elements, the authors achieve single-machine performance that equals or exceeds that of reported cluster implementations for common benchmark problems. A key design criterion is rapid exploration of models; hence the system is interactive and primarily single-user.