CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming

Paper Title

CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming

Authors

Qihua Zhou, Ruibin Li, Song Guo, Peiran Dong, Yi Liu, Jingcai Guo, Zhenda Xu

Abstract

Recent years have witnessed dramatic growth in Internet video traffic, where video bitstreams are often compressed and delivered in low quality to fit the streamer's uplink bandwidth. To alleviate this quality degradation, Neural-enhanced Video Streaming (NVS) has emerged; it shows great promise for recovering low-quality videos, mostly by deploying neural super-resolution (SR) on the media server. Despite this benefit, we reveal that current mainstream works based on SR enhancement have not achieved the desired rate-distortion trade-off between bitrate saving and quality restoration, because they (1) overemphasize enhancement on the decoder side while omitting co-design of the encoder, (2) have limited generative capacity to recover high-fidelity perceptual details, and (3) optimize the compression-and-restoration pipeline solely from the resolution perspective, without considering color bit-depth. To overcome these limitations, we are the first to build an encoder-decoder (i.e., codec) synergy by leveraging the inherent visual-generative property of diffusion models. Specifically, we present Codec-aware Diffusion Modeling (CaDM), a novel NVS paradigm that significantly reduces streaming delivery bitrates while maintaining higher restoration capacity than existing methods. First, CaDM improves the encoder's compression efficiency by simultaneously reducing the resolution and color bit-depth of video frames. Second, CaDM empowers the decoder with high-quality enhancement by making the denoising diffusion restoration aware of the encoder's resolution-color conditions. Evaluation on public cloud services with OpenMMLab benchmarks shows that CaDM reduces bitrate by up to 5.12 to 21.44 times under common video standards and achieves much better recovery quality (e.g., an FID of 0.61) than state-of-the-art neural-enhancing methods.
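
To make the encoder-side idea concrete, the following is a minimal sketch of reducing resolution and color bit-depth together on a single frame. It is not the authors' implementation: the function names `reduce_frame` and `raw_size_bits`, the average-pooling downsampler, the bit-shift quantizer, and the toy 8x8 grayscale frame are all assumptions made for illustration. The ratio it reports counts raw samples only, so it is not comparable to the bitrate savings quoted in the abstract, which the abstract states are based on common video standards.

```python
def reduce_frame(frame, scale=2, bits=4):
    """Average-pool an 8-bit grayscale frame by `scale`, then keep the top `bits` bits.

    Hypothetical helper for illustration; the paper's actual encoder may differ.
    """
    height, width = len(frame), len(frame[0])
    shift = 8 - bits  # number of low-order bits discarded per sample
    reduced = []
    for y in range(0, height - height % scale, scale):
        row = []
        for x in range(0, width - width % scale, scale):
            # Collect the scale x scale block and replace it with its mean.
            block = [frame[y + dy][x + dx]
                     for dy in range(scale) for dx in range(scale)]
            mean = sum(block) // len(block)
            row.append(mean >> shift)  # bit-depth reduction
        reduced.append(row)
    return reduced


def raw_size_bits(height, width, bits):
    """Uncompressed size of a single-channel frame, in bits."""
    return height * width * bits


# Toy 8x8 frame with 8-bit samples (values stay below 256).
frame = [[(x * 16 + y * 4) % 256 for x in range(8)] for y in range(8)]
small = reduce_frame(frame, scale=2, bits=4)

ratio = raw_size_bits(8, 8, 8) / raw_size_bits(len(small), len(small[0]), 4)
print(len(small), len(small[0]), ratio)
```

Halving each spatial dimension cuts the sample count by 4, and halving the bit-depth cuts the bits per sample by 2, so the raw size drops by a factor of 8 in this toy setting. In the pipeline the abstract describes, the decoder's diffusion-based restoration would then be conditioned on the chosen resolution and bit-depth settings.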
