International Journal of Computer Applications
Integration of data sources to build a Data Warehouse (DW), refers to the task of developing a common schema as well as data transformation solutions for a number of data sources with related content. The large number and size of modern data sources make the integration process cumbersome. In such cases dimensionality of the data is reduced prior to populating the DWs. Attribute subset selection on the basis of relevance analysis is one way to reduce the dimensionality. Relevance analysis of attribute is done by means of correlation analysis, which detects the attributes (redundant) that do not have significant contribution in the characteristics of whole data of concern.