S2MP: Similarity Measure for Sequential Patterns
In data mining, computing the similarity of objects is an essential task, for example to identify regularities or to build homogeneous clusters of objects. For sequential data, which arise in various fields of application (e.g. series of customer purchases and Internet navigation), comparing sequences for similarity is very important. Some similarity measures already exist, such as edit distance and LCS (Longest Common Subsequence), and are suited to simple sequences, but they are not relevant for complex sequences composed of sets of items, as is the case with sequential patterns.
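To make the limitation concrete, the following minimal sketch applies a textbook dynamic-programming LCS to a pair of simple sequences and then to a pair of sequences of itemsets. It is only an illustration of the problem stated above, not the S2MP measure; the function name `lcs_length` and the example data are assumptions introduced here. LCS compares elements by strict equality, so two itemsets that share items but are not identical contribute nothing to the score.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences.

    Elements are compared with ==, so they are treated as atomic symbols.
    """
    # dp[i][j] holds the LCS length of a[:i] and b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


# Simple sequences: each element is a single item (hypothetical data).
simple_1 = ["a", "b", "c", "d"]
simple_2 = ["a", "c", "d", "e"]
simple_lcs = lcs_length(simple_1, simple_2)  # common subsequence: a, c, d

# Sequences of itemsets (hypothetical data). {a, b} and {a, c} share item a,
# and {d, e} and {d, f} share item d, but neither pair is equal, so only the
# identical itemset {c} is counted.
pattern_1 = [frozenset("ab"), frozenset("c"), frozenset("de")]
pattern_2 = [frozenset("ac"), frozenset("c"), frozenset("df")]
itemset_lcs = lcs_length(pattern_1, pattern_2)

print(simple_lcs, itemset_lcs)
```

In this sketch the partial overlap between itemsets is invisible to LCS, which is the kind of information a measure designed for sequential patterns would need to take into account.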