Discovering Longest-lasting Correlation in Sequence Databases
Most existing paper on sequence databases use correlation (e.g., Euclidean distance and Pearson correlation) as a core function for various analytical tasks. Typically, it requires users to set a length for the similarity queries. However, there is no steady way to define the proper length on different application needs. In this paper the authors focus on discovering longest-lasting highly correlated subsequences in sequence databases, which is particularly useful in helping those analyses without prior knowledge about the query length. Surprisingly, there has been limited work on this problem.