论文标题
DTW距离下的模式匹配
Pattern matching under DTW distance
论文作者
论文摘要
在这项工作中,我们考虑了在动态时间扭曲(DTW)距离下的模式匹配问题,这些距离是由第三代测序分析生物数据分析中的潜在应用所激发的。要测量两个字符串之间的DTW距离,一个必须“翘曲”它们,即在字符串中加倍一些字母以获得两个相等的长度字符串,然后在相应位置中总结字母之间的距离。当字母之间的距离是整数时,我们表明,对于具有M运行的模式P和n运行的文本t:1。有一个O(m + n) - 时间算法,该算法计算所有位置的所有位置,其中dtw距离从p到t的距离最多为1; 2。有一个O(kmn) - 时间算法,该算法计算所有位置的DTW距离从P到T的距离最多为k。作为第二个结果的推论,我们还得出了字母内通用指标的近似算法。
In this work, we consider the problem of pattern matching under the dynamic time warping (DTW) distance motivated by potential applications in the analysis of biological data produced by the third generation sequencing. To measure the DTW distance between two strings, one must "warp" them, that is, double some letters in the strings to obtain two equal-lengths strings, and then sum the distances between the letters in the corresponding positions. When the distances between letters are integers, we show that for a pattern P with m runs and a text T with n runs: 1. There is an O(m + n)-time algorithm that computes all locations where the DTW distance from P to T is at most 1; 2. There is an O(kmn)-time algorithm that computes all locations where the DTW distance from P to T is at most k. As a corollary of the second result, we also derive an approximation algorithm for general metrics on the alphabet.