National University of Defense Technology
Sequence classification is an important task in data mining. The authors address the problem of sequence classification using rules composed of interesting itemsets found in a dataset of labeled sequences. They measure the interestingness of an itemset in a given class of sequences by combining the cohesion and the support of the itemset. They use the discovered itemsets to generate confident classification rules, and present two different ways of building a classifier. The first classifier is based on the CBA (Classification Based on Associations) method, but uses a new ranking strategy for the generated rules, which yields better results. The second classifier ranks the rules by first measuring their value with respect to the new data object being classified.
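To make the idea of combining cohesion and support concrete, the following minimal Python sketch scores an itemset within one class of sequences. The abstract does not give the formulas, so the definitions here are assumptions chosen for illustration: support is taken as the fraction of sequences that contain every item, cohesion as the itemset size divided by the average length of the shortest window covering all items, and the two are combined by multiplication. The function names `min_window` and `interestingness` are hypothetical.

```python
def min_window(seq, itemset):
    """Length of the shortest contiguous window of seq that contains
    every item of itemset, or None if some item never occurs."""
    items = set(itemset)
    if not items.issubset(seq):
        return None
    best = None
    for start in range(len(seq)):
        if seq[start] not in items:
            continue  # a minimal window always starts on an item of the set
        seen = set()
        for end in range(start, len(seq)):
            if seq[end] in items:
                seen.add(seq[end])
                if len(seen) == len(items):
                    length = end - start + 1
                    if best is None or length < best:
                        best = length
                    break  # extending further only makes the window longer
    return best


def interestingness(sequences, itemset):
    """Assumed score: support * cohesion within one class of sequences."""
    items = set(itemset)
    windows = []
    for seq in sequences:
        w = min_window(seq, items)
        if w is not None:
            windows.append(w)
    if not windows:
        return 0.0
    # Assumed support: share of sequences containing all items.
    support = len(windows) / len(sequences)
    # Assumed cohesion: 1.0 when the items always occur back to back,
    # smaller when they are spread far apart.
    cohesion = len(items) / (sum(windows) / len(windows))
    return support * cohesion


if __name__ == "__main__":
    class_sequences = [list("abc"), list("axxb"), list("ccc")]
    print(interestingness(class_sequences, {"a", "b"}))
```

Under these assumed definitions, an itemset scores highly in a class only if it occurs in many of that class's sequences and its items tend to appear close together, which is the intuition behind using such itemsets as rule antecedents.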