International Journal on Computer Science and Technology (IJCST)
Most existing clustering approaches handle either purely numerical or purely categorical data, but not both. Clustering mixed data composed of numerical and categorical attributes is a nontrivial task, because the similarity metrics used for categorical data and those used for numerical data are not directly comparable. In this paper, a method that exploits the relationships among the values of categorical attributes is presented. The method defines the similarity between values of a categorical attribute based on the idea of co-occurrence. All categorical values are then converted to numeric values according to this similarity, so that every attribute contains only numeric values.
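
To make the idea of co-occurrence-based conversion concrete, the following is a minimal illustrative sketch and not the method defined in this paper. It rests on several assumptions that the abstract does not state: similarity between two values of one categorical attribute is measured as the Jaccard overlap of the sets of values of a second attribute they co-occur with, and each value is mapped to a number as one minus its similarity to a chosen reference value. The function names `cooccurrence_sets`, `jaccard`, and `encode`, the `reference` parameter, and the toy records are all hypothetical.

```python
from collections import defaultdict


def cooccurrence_sets(records, target, context):
    """Map each value in column `target` to the set of values
    in column `context` that appear with it in some record."""
    seen = defaultdict(set)
    for row in records:
        seen[row[target]].add(row[context])
    return dict(seen)


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def encode(records, target, context, reference):
    """Assign each categorical value a numeric code equal to
    1 - similarity(value, reference). The reference value is a
    hypothetical choice made only for this illustration."""
    sets = cooccurrence_sets(records, target, context)
    ref_set = sets[reference]
    return {value: 1.0 - jaccard(s, ref_set) for value, s in sets.items()}


# Toy data: column 0 is the attribute to encode, column 1 is the context.
records = [
    ("red", "apple"),
    ("red", "cherry"),
    ("crimson", "apple"),
    ("crimson", "cherry"),
    ("green", "apple"),
    ("green", "lime"),
]

codes = encode(records, target=0, context=1, reference="red")
for value, code in sorted(codes.items()):
    print(value, round(code, 3))
```

In this toy data, "red" and "crimson" co-occur with exactly the same context values, so they receive the same numeric code, while "green" shares only one context value with "red" and is placed farther away. Once every categorical attribute has been encoded in some such way, a numeric clustering algorithm can be applied to all attributes together.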