Journal of University of Science and Technology of China ›› 2021, Vol. 51 ›› Issue (10): 717-724. DOI: 10.52396/JUST-2020-0032

• Information Science •

A low-latency inpainting method for unstably transmitted videos

WEI Yutong1, BAO Bingkun2, ZHANG Ziqi3, ZHU Jin1*

  1. Department of Automation, University of Science and Technology of China, Hefei 230027, China;
  2. College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China;
  3. Agora, Inc., Shanghai 200082, China
  • Received: 2020-12-24 Revised: 2021-02-05 Online: 2021-10-31 Published: 2022-01-11
  • Contact: *E-mail: jinzhu@ustc.edu.cn

Abstract: Video traffic has gradually become a major component of mobile traffic, yet video damage caused by unstable transmission remains a common and urgent problem. Such damage is difficult to repair because the holes appear at random positions in random frames, which makes it hard to achieve both low latency and high accuracy. We are the first to study video inpainting under unstable transmission, and we propose a low-latency video inpainting method that consists of two stages. In the coarse inpainting stage, we extract the damaged two-dimensional optical flow from the reference frames and build a linear prediction model that coarsely inpaints the damaged frames by exploiting the temporal consistency of motion. In the fine inpainting stage, we propose a Partial Convolutional Frame Completion network (PCFC-Net) that synthesizes all reference information and computes the fine inpainting result. Compared with state-of-the-art baselines on the DAVIS dataset, the method greatly reduces the waiting time for reference frames while improving PSNR and SSIM by 4.0% to 12.7%.

Key words: video inpainting, unstable transmission, partial CNN, linear prediction
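
The coarse stage summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the linear prediction model, so the sketch assumes the simplest two-frame extrapolation (the flow at a hole pixel is predicted as twice the previous flow minus the one before it) and a nearest-neighbour copy from the previous frame. The function names, the `(dx, dy)` flow layout, and the convention that the flow points from the current frame to the previous frame are all hypothetical choices made for illustration.

```python
import numpy as np


def predict_flow_linear(flow_prev2, flow_prev1):
    """Two-frame linear extrapolation of a flow field (assumed model)."""
    return 2.0 * flow_prev1 - flow_prev2


def fill_damaged_flow(flow_damaged, hole_mask, flow_prev2, flow_prev1):
    """Replace flow vectors inside the holes with the linear prediction.

    flow_* have shape (H, W, 2); hole_mask is a boolean (H, W) array
    that is True where the current frame is damaged.
    """
    predicted = predict_flow_linear(flow_prev2, flow_prev1)
    return np.where(hole_mask[..., None], predicted, flow_damaged)


def coarse_inpaint(frame_damaged, hole_mask, frame_prev, flow_filled):
    """Fill hole pixels by copying from the previous frame along the flow.

    Assumed convention: flow_filled[y, x] = (dx, dy) is the displacement
    from pixel (x, y) in the current frame to its match in frame_prev.
    """
    h, w = hole_mask.shape
    ys, xs = np.nonzero(hole_mask)
    # Round to the nearest source pixel and clamp to the image border.
    src_x = np.clip(np.rint(xs + flow_filled[ys, xs, 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow_filled[ys, xs, 1]).astype(int), 0, h - 1)
    out = frame_damaged.copy()
    out[ys, xs] = frame_prev[src_y, src_x]
    return out
```

Under these assumptions, only the two most recent flow fields and the previous frame are needed, so no future reference frames have to be awaited. A learned fine stage such as the PCFC-Net named in the abstract would then refine this coarse result.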

CLC number: