Paper Title
Efficient Compression of Long Arbitrary Sequences with No Reference at the Encoder
Paper Authors
Abstract
In a distributed information application an encoder compresses an arbitrary vector while a similar reference vector is available to the decoder as side information. For the Hamming-distance similarity measure, and when guaranteed perfect reconstruction is required, we present two contributions to the solution of this problem. One result shows that when a set of potential reference vectors is available to the encoder, lower compression rates can be achieved when the set satisfies a certain clustering property. Another result reduces the best known decoding complexity from exponential in the vector length $n$ to $O(n^{1.5})$ by generalized concatenation of inner coset codes and outer error-correcting codes. One potential application of the results is the compression of DNA sequences, where similar (but not identical) reference vectors are shared among senders and receivers.
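The coset-code idea mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's generalized concatenated construction; it is a minimal example that assumes block length 7, a (7,4) Hamming code, and a reference vector within Hamming distance 1 of the source. The function names `syndrome`, `encode`, and `decode` are hypothetical. The encoder sends only the 3-bit syndrome (the coset index) of its 7-bit vector and never sees the reference; the decoder uses the reference to recover the vector exactly.

```python
N = 7  # block length of the (7,4) Hamming code (an assumption of this sketch)

def syndrome(v):
    """Syndrome of v as an integer in 0..7.

    Column i of the parity-check matrix is the binary expansion of i,
    so the syndrome is the XOR of the 1-based positions of the set bits.
    """
    s = 0
    for position, bit in enumerate(v, start=1):
        if bit:
            s ^= position
    return s

def encode(x):
    """Encoder: compress 7 bits to the 3-bit coset index; no reference needed."""
    return syndrome(x)

def decode(s, y):
    """Decoder: recover x from its syndrome s and a reference y with d_H(x, y) <= 1."""
    # syndrome(x) XOR syndrome(y) is the syndrome of e = x XOR y.
    # Since e has weight at most 1, that value is 0 (no difference)
    # or the 1-based position of the single differing bit.
    position = s ^ syndrome(y)
    x_hat = list(y)
    if position:
        x_hat[position - 1] ^= 1
    return x_hat

# Usage: the reference differs from the source in one position.
x = [1, 0, 1, 1, 0, 0, 1]
y = [1, 0, 1, 1, 1, 0, 1]
assert decode(encode(x), y) == x
```

Reconstruction here is guaranteed rather than probabilistic because every vector within distance 1 of the reference lies in a different coset of the code. The trade-off is that the tolerated distance is fixed by the code; a reference farther than distance 1 from the source would be decoded incorrectly.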