Binary Information Press
Sequential pattern mining is an important method in data mining. Traditional mining algorithms are ill-suited to fast, unbounded, continuous, and dynamic data streams because they require multiple passes over the database. Several approximate sequential pattern mining algorithms have been proposed recently, but they consume too many system resources in the sequence comparison process. This paper proposes a sequence comparison method based on Levenshtein-Automata. The method builds a state transition model in a preprocessing step, after which the similarity of two sequences can be computed in linear time. Because Levenshtein-Automata use too much memory, the method combines Levenshtein-Automata computation with the conventional computation of edit distance, achieving a tradeoff between time cost and space cost.
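The contrast between the two computations can be illustrated with a minimal Python sketch. It is not the construction used in this paper: the automaton is simulated lazily, with each state represented as a capped dynamic-programming row, and the names `edit_distance`, `LevenshteinAutomaton`, and `max_cached` are hypothetical. The `max_cached` budget is only one assumed way to trade memory for time. Once the budget is exhausted, transitions are recomputed by the conventional recurrence instead of being stored.

```python
def edit_distance(a, b):
    """Conventional dynamic-programming edit distance, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute or match
        prev = curr
    return prev[-1]


class LevenshteinAutomaton:
    """Accepts sequences within max_dist edits of a fixed pattern.

    A state is a DP row whose entries are capped at max_dist + 1, so the
    set of reachable states is finite. Transitions are cached as they are
    discovered, up to max_cached entries (an assumed memory budget).
    """

    def __init__(self, pattern, max_dist, max_cached=10_000):
        self.pattern = tuple(pattern)
        self.max_dist = max_dist
        self.max_cached = max_cached
        cap = max_dist + 1
        self.start = tuple(min(j, cap) for j in range(len(self.pattern) + 1))
        self.transitions = {}  # (state, symbol) -> next state

    def _next_row(self, state, symbol):
        # One step of the edit-distance recurrence, with capped entries.
        cap = self.max_dist + 1
        row = [min(state[0] + 1, cap)]
        for j, p in enumerate(self.pattern, start=1):
            row.append(min(state[j] + 1,
                           row[j - 1] + 1,
                           state[j - 1] + (p != symbol),
                           cap))
        return tuple(row)

    def step(self, state, symbol):
        key = (state, symbol)
        nxt = self.transitions.get(key)
        if nxt is None:
            nxt = self._next_row(state, symbol)
            if len(self.transitions) < self.max_cached:
                self.transitions[key] = nxt  # spend memory to save time later
        return nxt

    def accepts(self, sequence):
        state = self.start
        for symbol in sequence:
            state = self.step(state, symbol)
            if min(state) > self.max_dist:
                return False  # no continuation can come back within max_dist
        return state[-1] <= self.max_dist
```

For example, `LevenshteinAutomaton("kitten", 3).accepts("sitting")` is true because the edit distance between the two strings is 3, while the same query with `max_dist=2` is rejected. With `max_cached=0` the class stores nothing and degenerates to the conventional row-by-row computation. A larger budget lets repeated comparisons against the same pattern reuse stored transitions.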