Date Added: Jan 2010
A significant amount of data mining work is currently performed outside the DBMS. This paper discusses recommendations for pushing data mining analysis into the DBMS, paying particular attention to data preprocessing (i.e., data cleaning, summarization, and transformation), which tends to be the most time-consuming task in data mining projects. It presents practical issues and common solutions that arise when transforming and preparing data sets with SQL for data mining purposes, based on experience from real-life projects.