Science & Engineering Research Support soCiety (SERSC)
Software fault prediction models based on supervised learning cannot be applied when no training data are available. In such cases, models based on unsupervised learning, such as clustering algorithms, are needed. Nevertheless, very few studies address unsupervised models because they are difficult to construct. One difficulty is deciding the number of clusters. To address this problem, the authors build unsupervised models using two clustering algorithms, EM and X-means, which determine the number of clusters automatically, and compare the results with those reported in earlier papers.
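
To illustrate what choosing the number of clusters automatically can look like, the following is a minimal sketch in Python. It is not the authors' implementation. It fits one-dimensional Gaussian mixtures by EM for several candidate cluster counts and keeps the count with the lowest Bayesian information criterion (BIC). The use of BIC as the selection rule, the single synthetic module metric, the quantile initialisation, and all function names (`fit_gmm_1d`, `choose_k`, `assign`) are assumptions made for illustration only.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k, iters=200, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)  # floor prevents collapse

    log_lik = 0.0
    for x in xs:
        mix = sum(w * normal_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
        log_lik += math.log(max(mix, 1e-300))
    return weights, means, variances, log_lik


def choose_k(xs, k_max=4):
    """Try k = 1..k_max and return (best_k, model) with the lowest BIC."""
    best = None
    for k in range(1, k_max + 1):
        weights, means, variances, log_lik = fit_gmm_1d(xs, k)
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        bic = -2.0 * log_lik + n_params * math.log(len(xs))
        if best is None or bic < best[0]:
            best = (bic, k, (weights, means, variances))
    return best[1], best[2]


def assign(xs, model):
    """Assign each point to its most probable component."""
    weights, means, variances = model
    return [
        max(range(len(weights)),
            key=lambda j: weights[j] * normal_pdf(x, means[j], variances[j]))
        for x in xs
    ]


# Synthetic, hypothetical metric values (e.g. a size measure per module):
# one group of small modules and one group of large modules.
rng = random.Random(0)
metric = [rng.gauss(20, 3) for _ in range(40)] + [rng.gauss(200, 10) for _ in range(40)]

best_k, model = choose_k(metric)
labels = assign(metric, model)
print("selected number of clusters:", best_k)
```

In this sketch the number of clusters is never supplied by the user; it falls out of the model comparison. A real fault prediction study would work with several metrics per module and would still need a rule for deciding which clusters to label as fault-prone, which this sketch does not cover.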